AWS
Frequently used interaction patterns with AWS.
CLI
To create a new bucket, use:
aws s3 mb s3://bucketname
To add a subfolder to a bucket, create an empty object whose key ends in a slash:
aws s3api put-object --bucket bucketname --key foldername/
Setup
There are multiple ways to access your AWS account. I store config and credential files in ~/.aws as discussed here. AWS access methods find these files automatically so I don’t have to worry about that.

What I do have to worry about is choosing the appropriate profile depending on what AWS account I want to interact with (e.g. my personal one or one for work). This is different for each library, so I cover this below.
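To check which profile names are available to choose from, the credentials file can be read with the standard library, since it uses INI syntax with one section per profile. This is a minimal sketch: the helper name list_profiles is made up for illustration, and the default path assumes the credentials file sits in the usual ~/.aws location.

```python
import configparser
from pathlib import Path


def list_profiles(path=None):
    """Return the profile names (INI section headers) found in a credentials file."""
    if path is None:
        # Assumed default location of the shared credentials file.
        path = Path.home() / ".aws" / "credentials"
    # Interpolation is switched off because secret keys may contain '%'.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(path)  # a missing file is skipped silently and yields no sections
    return parser.sections()
```

With a file that contains a [default] and a [work] section, list_profiles() returns both names in file order.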
s3fs
Developed by people at Dask, s3fs is built on top of botocore and provides a convenient way to interact with S3. It can read and write data, but there are easier ways to do that, and I use the library mainly to navigate buckets and list content.
Navigate buckets
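A minimal sketch of listing bucket content, assuming s3fs is installed and the default profile has access to the bucket. The helper name list_path and the bucket name are made up for illustration; S3FileSystem and its ls method come from s3fs.

```python
def list_path(path, fs=None):
    """Return the entries directly under an S3 path such as 'bucketname/foldername'."""
    if fs is None:
        # Third-party import kept local so it is only needed when S3 is queried.
        import s3fs

        fs = s3fs.S3FileSystem()  # no arguments: credentials come from the default profile
    return fs.ls(path)
```

Calling list_path("bucketname") returns the entries at the top level of the bucket. The same file system object also offers glob, exists and du for further navigation.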
To choose a profile other than default, pass the profile name when creating the file system object.
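A minimal sketch of selecting a profile, assuming a recent s3fs release in which the constructor accepts a profile argument (older releases may spell this differently). The helper name get_fs and the profile name work are placeholders.

```python
def get_fs(profile=None):
    """Create an S3 file system object for a named profile (None means the default profile)."""
    # Third-party import kept local so the helper can be defined without s3fs installed.
    import s3fs

    # Assumption: this s3fs version takes the profile name via the 'profile' keyword.
    return s3fs.S3FileSystem(profile=profile)
```

For example, get_fs("work").ls("bucketname") would list a bucket using the credentials stored under the work profile.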
Read and write directly from Pandas
Pandas can read and write files to and from S3 directly if you provide the file name as s3://<bucket>/<filename>. By default, Pandas uses the default profile to access S3. Recent versions of Pandas (1.2 and later) have a storage_options parameter that can be used to provide, among other things, a profile name.
Basics
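A minimal sketch of reading and writing with an explicit profile, assuming Pandas 1.2 or later with s3fs installed. The function name copy_csv is made up for illustration; the point is that storage_options has to be passed to the read call and to the write call separately.

```python
def copy_csv(src, dst, profile):
    """Read a CSV from one S3 URL and write it to another, using a named profile."""
    # Third-party import kept local; S3 access through Pandas also needs s3fs.
    import pandas as pd

    options = {"profile": profile}  # Pandas forwards these options to s3fs
    df = pd.read_csv(src, storage_options=options)
    df.to_csv(dst, index=False, storage_options=options)
    return df
```

A call such as copy_csv("s3://bucketname/in.csv", "s3://bucketname/out.csv", "work") uses the work profile for both steps. Leaving out storage_options entirely falls back to the default profile.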
This works well for simple jobs, but in a large project, passing the profile information to each read and write call is cumbersome and ugly.
Simple improvement using functools.partial
functools.partial provides a simple solution, as it allows me to create a custom function with a frozen storage_options argument.
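A minimal sketch of the idea. The helper name freeze_profile and the profile name work are placeholders; the helper takes the function to wrap as an argument so that it works for any Pandas reader.

```python
import functools


def freeze_profile(func, profile):
    """Return a version of an I/O function with storage_options fixed to a profile."""
    return functools.partial(func, storage_options={"profile": profile})


# Usage (needs pandas and s3fs, and a profile called "work" in ~/.aws):
#   import pandas as pd
#   read_csv = freeze_profile(pd.read_csv, "work")
#   df = read_csv("s3://bucketname/filename.csv")
```

The frozen function accepts all other arguments of the original, so read_csv(path, usecols=[...]) keeps working as before.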
More flexible solution with custom function
Often, I run projects on my Mac for testing and on a virtual machine to run the full code. In this case, I need a way to automatically provide the correct profile name.
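One way such a function could look, as a sketch under stated assumptions: the Mac needs a named profile (here the placeholder work), while the virtual machine can rely on its default credentials. The function name aws_options is made up for illustration. It returns keyword arguments that are unpacked into each Pandas call.

```python
import sys


def aws_options(platform=sys.platform, mac_profile="work"):
    """Return keyword arguments for Pandas I/O calls, depending on the machine."""
    if platform == "darwin":  # sys.platform is "darwin" on macOS
        # Assumption: on the Mac a named profile is required.
        return {"storage_options": {"profile": mac_profile}}
    # Assumption: elsewhere the default credentials are the right ones.
    return {}


# Usage (needs pandas and s3fs):
#   import pandas as pd
#   df = pd.read_csv("s3://bucketname/filename.csv", **aws_options())
```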
The above is not ideal, as it requires cumbersome unpacking of the return value. Maybe using a decorator is better.
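A sketch of what the decorator variant might look like, untested against real S3 access. The name with_aws_profile, the profile name work, and the rule "named profile on macOS only" are all assumptions for illustration.

```python
import functools
import sys


def with_aws_profile(func, profile="work", platform=None):
    """Wrap an I/O function so it receives storage_options on macOS by default."""
    platform = platform or sys.platform

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if platform == "darwin":
            # setdefault keeps any storage_options the caller passes explicitly.
            kwargs.setdefault("storage_options", {"profile": profile})
        return func(*args, **kwargs)

    return wrapper


# Usage (needs pandas and s3fs):
#   import pandas as pd
#   read_csv = with_aws_profile(pd.read_csv)
#   df = read_csv("s3://bucketname/filename.csv")
```

The wrapped function is called exactly like the original, so no unpacking is needed at the call site.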
awswrangler
A new library from AWS labs for Pandas interaction with a number of AWS services. It looks very promising, but I haven’t had any use for it thus far.