Get all files (or file names) out of s3 bucket for specific date


I need to get all of yesterday's files from an S3 bucket. I know how to do this in the CLI when I know the file name: aws s3 cp s3://{Path}/{FileName} {directoryToCopyTo}

But how would I do this for files of a specific date? Just getting the list of file names for a specific date would do as well, by whatever method; it does not need to be the CLI.

The catch: the bucket has a few million files, so I am also looking for a cost-effective way.

Vincent

Posted 2019-08-15T23:58:32.220

Reputation: 111

Answers


If the filename contains the date, you can use include and exclude filters (note that copying a whole prefix requires --recursive): aws s3 cp s3://{path}/ {directoryToCopyTo} --recursive --exclude "*" --include "*2019-09-09*"
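The --exclude/--include flags take UNIX-style glob patterns. A minimal sketch (using Python's standard fnmatch module rather than the AWS CLI itself, with made-up key names) shows how such a pattern selects only keys containing the date:

```python
from fnmatch import fnmatch

# Hypothetical object keys; only those containing the date should match.
keys = [
    "logs/app-2019-09-09-part1.gz",
    "logs/app-2019-09-09-part2.gz",
    "logs/app-2019-09-10-part1.gz",
]

# The same UNIX-style wildcard syntax the CLI's --include flag accepts.
pattern = "*2019-09-09*"
matched = [k for k in keys if fnmatch(k, pattern)]
print(matched)
```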

If the date exists only in the object metadata, there is currently no server-side way to filter by date. There's a GitHub discussion where willstruebing describes a method using s3api and the --query option (tested on OS X): aws s3api list-objects --bucket "bucket-name" --query 'Contents[?LastModified>=`2016-05-20`][].{Key: Key}' Note that the date in the query must be surrounded by backticks. You can then pipe the result through jq or grep for further processing with the other s3api commands.

...but this does not reduce the number of API calls, since the filtering happens client-side: the CLI still has to list every object in the bucket.
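That client-side filtering can be sketched without the CLI at all. The example below is a hedged illustration using only the Python standard library; the hard-coded JSON stands in for a list-objects response, which in practice would come from boto3 or the s3api command:

```python
import json
from datetime import datetime, timezone

# Stand-in for the JSON that `aws s3api list-objects` returns;
# in practice this would come from the API, paginated.
listing = json.loads("""
{"Contents": [
  {"Key": "a.txt", "LastModified": "2016-05-19T10:00:00+00:00"},
  {"Key": "b.txt", "LastModified": "2016-05-20T08:30:00+00:00"},
  {"Key": "c.txt", "LastModified": "2016-05-21T23:59:00+00:00"}
]}
""")

cutoff = datetime(2016, 5, 20, tzinfo=timezone.utc)

# Client-side equivalent of the --query filter: keep keys whose
# LastModified timestamp is on or after the cutoff date.
recent = [
    obj["Key"]
    for obj in listing["Contents"]
    if datetime.fromisoformat(obj["LastModified"]) >= cutoff
]
print(recent)  # prints ['b.txt', 'c.txt']
```

Since every object is still listed before being filtered, the listing cost scales with bucket size; the usual way to keep it down on buckets with millions of objects is to encode the date in the key prefix so listing can be restricted with --prefix.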

enharmonic

Posted 2019-08-15T23:58:32.220

Reputation: 203