I have set up a Data Pipeline that imports files from an S3 bucket into a DynamoDB table, based on the predefined example. I want to truncate the table (or drop it and create a new one) every time the import job starts. Of course this is possible with the AWS SDK, but I would like to do it using only the Data Pipeline.
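
For reference, the SDK-based approach would be something along these lines: a minimal sketch, assuming the AWS SDK for Java v1, a placeholder table name, and a table without secondary indexes (not my actual code).

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.document.DynamoDB;
    import com.amazonaws.services.dynamodbv2.model.CreateTableRequest;
    import com.amazonaws.services.dynamodbv2.model.ProvisionedThroughput;
    import com.amazonaws.services.dynamodbv2.model.TableDescription;

    public class RecreateTable {
        public static void main(String[] args) throws InterruptedException {
            String tableName = "MyImportTable"; // placeholder table name
            AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();
            DynamoDB docApi = new DynamoDB(client);

            // Remember the key schema and throughput so the table can be
            // recreated with the same definition (secondary indexes ignored).
            TableDescription desc = client.describeTable(tableName).getTable();

            // Drop the table and wait until it is gone.
            client.deleteTable(tableName);
            docApi.getTable(tableName).waitForDelete();

            // Recreate it with the saved definition and wait until it is usable.
            client.createTable(new CreateTableRequest()
                    .withTableName(tableName)
                    .withKeySchema(desc.getKeySchema())
                    .withAttributeDefinitions(desc.getAttributeDefinitions())
                    .withProvisionedThroughput(new ProvisionedThroughput(
                            desc.getProvisionedThroughput().getReadCapacityUnits(),
                            desc.getProvisionedThroughput().getWriteCapacityUnits())));
            docApi.getTable(tableName).waitForActive();
        }
    }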

Is it possible to do this using only the Data Pipeline?

Thanks for any help

1 Answer

I'm not sure whether you still need to perform this operation, since you asked many months ago, but because there is so little information on the internet about this subject, I've decided to write a tutorial and post it here to help other people who are facing the same situation.

This is what worked for me.

Basically you'll need the following:

  • An S3 bucket (where you'll upload a shell script to be executed)
  • An EC2 AMI (that will execute the script above)
  • A pipeline (that already imports DynamoDB data to an S3 bucket)

If you already have all of them, then we're good to go!

Follow these steps:

  1. Add an activity and name it 'CleanTableJob'.

[screenshot: the new CleanTableJob activity]

  2. On CleanTableJob, configure the settings as follows (under 'Runs On', select 'New Resource' and name it 'CleanDynamodbTableResource'):

[screenshot: CleanTableJob settings]

  3. On CleanDynamodbTableResource, configure the settings as follows:

[screenshot: CleanDynamodbTableResource settings]

  4. In your S3 bucket, provide whatever handles deleting the data in DynamoDB, for example (a sketch of what such a tool might contain follows after these steps):

    java -jar /home/ec2-user/downloads/dynamodb_truncate_table-1.0-SNAPSHOT.jar

  5. That's it:

[screenshot: the finished pipeline]
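
The answer does not include the source of the dynamodb_truncate_table jar, but based on its name it presumably scans the table and deletes every item through the AWS SDK. A minimal sketch of such a tool, assuming the AWS SDK for Java v1, the table name passed as the first argument, and one-by-one deletes without batching, might look like this:

    import java.util.HashMap;
    import java.util.List;
    import java.util.Map;

    import com.amazonaws.services.dynamodbv2.AmazonDynamoDB;
    import com.amazonaws.services.dynamodbv2.AmazonDynamoDBClientBuilder;
    import com.amazonaws.services.dynamodbv2.model.AttributeValue;
    import com.amazonaws.services.dynamodbv2.model.KeySchemaElement;
    import com.amazonaws.services.dynamodbv2.model.ScanRequest;
    import com.amazonaws.services.dynamodbv2.model.ScanResult;

    public class DynamodbTruncateTable {
        public static void main(String[] args) {
            String tableName = args[0]; // table to truncate, e.g. "MyImportTable"
            AmazonDynamoDB client = AmazonDynamoDBClientBuilder.defaultClient();

            // Find out which attributes form the table's primary key.
            List<KeySchemaElement> keySchema =
                    client.describeTable(tableName).getTable().getKeySchema();

            Map<String, AttributeValue> lastKey = null;
            do {
                // Scan one page of items, then delete each item by its key.
                ScanResult page = client.scan(new ScanRequest()
                        .withTableName(tableName)
                        .withExclusiveStartKey(lastKey));
                for (Map<String, AttributeValue> item : page.getItems()) {
                    Map<String, AttributeValue> key = new HashMap<>();
                    for (KeySchemaElement k : keySchema) {
                        key.put(k.getAttributeName(), item.get(k.getAttributeName()));
                    }
                    client.deleteItem(tableName, key);
                }
                lastKey = page.getLastEvaluatedKey();
            } while (lastKey != null && !lastKey.isEmpty());
        }
    }

Note that for a large table this is slow and consumes write capacity for every delete, so dropping and recreating the table (as mentioned in the question) is usually cheaper.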

Hope this helps you out.