0

I have several terabytes of data in our legacy system, which runs SQL Server. Our newer version runs on MongoDB, and we are migrating this data to it. We have Python scripts written and verified; all data movement happens correctly.
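For context, the scripts are roughly along these lines (a minimal sketch, not the actual code; pyodbc, pymongo, the connection strings, table and collection names, and batch size are all assumptions):

```python
# Minimal sketch of the kind of batch copy the scripts do (actual code differs);
# connection strings, table/collection names and batch size are placeholders.
import pyodbc
from pymongo import MongoClient

BATCH_SIZE = 5000

sql = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=legacy-host;"
    "DATABASE=legacy_db;UID=etl_user;PWD=secret"
)
collection = MongoClient("mongodb://new-host:27017")["new_db"]["orders"]

cursor = sql.cursor()
cursor.execute("SELECT id, customer_id, total, created_at FROM dbo.Orders")
columns = [col[0] for col in cursor.description]

while True:
    rows = cursor.fetchmany(BATCH_SIZE)
    if not rows:
        break
    # Convert each SQL row to a dict and bulk-insert into MongoDB.
    collection.insert_many([dict(zip(columns, row)) for row in rows])
```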

We tested this on a smaller machine with 4 cores; running it on a bigger machine is going to be very expensive. AWS Lambda has a 15-minute processing limit, and one iteration of this job takes more than 24 hours to finish. AWS Step Functions looks promising, but I'm not sure it is the right choice.

J Bourne
  • 101
  • 3
  • Does this answer your question? [Can you help me with my capacity planning?](https://serverfault.com/questions/384686/can-you-help-me-with-my-capacity-planning) – Gerald Schneider Nov 30 '21 at 12:56
  • Thanks for the response, but I'm not looking for capacity planning; I have those details. I need to know the best infrastructure (mostly on AWS) to run my scripts uninterrupted for a few days to migrate the SQL data to Mongo, so that they finish on time. – J Bourne Nov 30 '21 at 14:21

1 Answer

2

Can you not run `mongoexport` locally, upload the dump to S3 (or copy it onto a physical AWS Snowcone device), use an EC2 instance to run `mongoimport`, and then run your script to apply any updates made since the dump?
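Sketching that export-and-upload step (the URI, database/collection names, and bucket are placeholders, and it assumes the MongoDB database tools and boto3 are installed):

```python
# Rough sketch: dump one collection with mongoexport, then push the file to S3.
# URI, database/collection names and bucket are placeholders.
import subprocess
import boto3

DUMP_FILE = "orders.json"

subprocess.run(
    [
        "mongoexport",
        "--uri=mongodb://localhost:27017/new_db",
        "--collection=orders",
        f"--out={DUMP_FILE}",
    ],
    check=True,
)

boto3.client("s3").upload_file(DUMP_FILE, "my-migration-bucket", f"dumps/{DUMP_FILE}")

# On the EC2 side you would download the file from S3 and run mongoimport
# with the same kind of subprocess call against the target cluster.
```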

As for how to run it, you would probably get away with using a spot EC2 instance, particularly if you use it outside peak hours for the region - perhaps a weekend. If your job can't be interrupted, use on-demand EC2. An m5.xlarge with 4 vCPUs / 16 GB RAM is about $0.20 per hour, so a couple of days of that is roughly $10.
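If you go the spot route, the launch itself can be scripted; a sketch with boto3 (the AMI, key pair, and subnet IDs are placeholders):

```python
# Sketch: launch an m5.xlarge as a one-off spot instance with boto3.
# AMI ID, key name and subnet ID are placeholders for your own values.
import boto3

ec2 = boto3.client("ec2", region_name="eu-west-1")

response = ec2.run_instances(
    ImageId="ami-0123456789abcdef0",      # placeholder AMI
    InstanceType="m5.xlarge",
    KeyName="migration-key",              # placeholder key pair
    SubnetId="subnet-0123456789abcdef0",  # placeholder subnet
    MinCount=1,
    MaxCount=1,
    InstanceMarketOptions={
        "MarketType": "spot",
        "SpotOptions": {
            # One-time request: the instance is terminated (not restarted)
            # if EC2 reclaims the capacity, so only use this if the job
            # can be resumed or restarted safely.
            "SpotInstanceType": "one-time",
            "InstanceInterruptionBehavior": "terminate",
        },
    },
)
print(response["Instances"][0]["InstanceId"])
```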

I'll also point out that, say, 3 TB at 100 Mbps will take roughly 2.8 days to send, but at 800 Mbps about 8 hours - and sustaining that bandwidth may be difficult without Direct Connect. You might be best off using an AWS Snowcone, which is a physical device you copy data to and then ship to AWS.
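The arithmetic behind those estimates (decimal terabytes, no protocol overhead):

```python
# Back-of-the-envelope transfer times for 3 TB at different sustained rates.
data_bits = 3 * 10**12 * 8  # 3 TB expressed in bits

for mbps in (100, 800):
    seconds = data_bits / (mbps * 10**6)
    print(f"{mbps} Mbps: {seconds / 3600:.1f} hours ({seconds / 86400:.1f} days)")

# 100 Mbps: 66.7 hours (2.8 days); 800 Mbps: 8.3 hours (0.3 days)
```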

I would suggest using AWS Database Migration Service (DMS) to migrate from MongoDB to Amazon DocumentDB, which is essentially AWS's MongoDB-compatible service. DMS will migrate the data; then you just point your application at the new instance and turn the old one off.
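If you want to drive DMS from code, the full-load task can be created and started with boto3 once the replication instance and the two endpoints exist (the ARNs below are placeholders; this is a sketch, not a complete setup):

```python
# Sketch: create and start a DMS full-load task; ARNs are placeholders and the
# replication instance and endpoints are assumed to exist already.
import boto3

dms = boto3.client("dms", region_name="eu-west-1")

task = dms.create_replication_task(
    ReplicationTaskIdentifier="mongo-to-docdb-full-load",
    SourceEndpointArn="arn:aws:dms:eu-west-1:123456789012:endpoint:SOURCE",
    TargetEndpointArn="arn:aws:dms:eu-west-1:123456789012:endpoint:TARGET",
    ReplicationInstanceArn="arn:aws:dms:eu-west-1:123456789012:rep:INSTANCE",
    MigrationType="full-load",
    # Copy everything; tighten the selection rules if you only need a subset.
    TableMappings='{"rules":[{"rule-type":"selection","rule-id":"1",'
                  '"rule-name":"1","object-locator":{"schema-name":"%",'
                  '"table-name":"%"},"rule-action":"include"}]}',
)
task_arn = task["ReplicationTask"]["ReplicationTaskArn"]

# Wait for the task to reach the "ready" state before starting it.
dms.get_waiter("replication_task_ready").wait(
    Filters=[{"Name": "replication-task-arn", "Values": [task_arn]}]
)
dms.start_replication_task(
    ReplicationTaskArn=task_arn,
    StartReplicationTaskType="start-replication",
)
```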

Tim
  • 30,383
  • 6
  • 47
  • 77
  • For "only" 3 TB a Snowball maybe overkill. Perhaps consider a Snowcone? – Oscar De León Nov 30 '21 at 22:32
  • 1
    Yeah, I was thinking the Snow family, I'll edit to be more precise. – Tim Dec 01 '21 at 00:08
  • I'm already using an m5.xlarge; it's not helping much. But thanks for the Snowcone and Snow family pointers, I will take a look at them. – J Bourne Dec 01 '21 at 17:36
  • You're best placed to consider instance requirements since you have access to metrics - do you need more CPU, more RAM, more or faster storage, or something else? I do think a snow family device will be a good choice for this. – Tim Dec 01 '21 at 19:10
  • Yes, I need more CPU and RAM, and higher speed. I'm not doing an offline data migration; I'm migrating from one live DB server to another live DB server. – J Bourne Dec 03 '21 at 18:01
  • The cost of a larger instance for a day or two to migrate a huge database shouldn't be much. I would look at using AWS Database Migration Service to migrate to AWS DocumentDB, which is basically MongoDB with a different label, instead of using custom scripts and your own EC2 instance. Answer updated. – Tim Dec 03 '21 at 19:21