I've been having trouble uploading a large (800MB) file to S3 using the AWS command-line tool. The first attempt completed (after many hours) but the file was not visible, and I was advised (here) that it had been eaten by goblins and I needed to start again.
I did a test with a 16MB file; it did a 3-part upload and completed with no problems. I can see the file there with aws s3 ls s3://mybucket.
So then I tried aws s3 cp bigfile.tgz s3://mybucket. But 26 minutes in, I noticed I had three upload failures, each looking like this:
upload failed: ./bigfile.tgz to s3://mybucket/bigfile.tgz
HTTPSConnectionPool(host='s3-eu-west-1.amazonaws.com', port=443): Max retries exceeded with url: /mybucket/bigfile.tgz?partNumber=8&uploadId=m_jMF.[elided]UPz (Caused by <class 'ConnectionResetError'>: [Errno 104] Connection reset by peer)
Actually the 3rd message says: "Caused by : [Errno 32] Broken pipe)", rather than "Caused by : [Errno 104] Connection reset by peer".
At this point it is still running, and says:
Completed 16 of 120 part(s) with -2 file(s) remaining
This happened before, and I just ignored it, assuming that if it were a fatal error the command would have stopped. Now I'm wondering: is it going to spend another 3 hours chugging away and then give me an invisible file again, because some parts failed to upload?
If this is the case, my question is: how do I upload a large file to S3 over an internet connection that sometimes has these issues? Is there a way to tell the CLI not to give up so easily?
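For what it's worth, the only knobs I've found so far are the CLI's S3 transfer settings. I'm guessing at values here, and I'm not sure the two retry-related keys are even honoured by my CLI version, so treat this as a sketch rather than something I've verified:

# larger parts mean fewer requests; fewer concurrent uploads gives a flaky link less to juggle
aws configure set default.s3.multipart_chunksize 16MB
aws configure set default.s3.max_concurrent_requests 2
# these two may only be recognised by newer CLI/botocore versions
aws configure set default.max_attempts 10
aws configure set default.retry_mode standard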
UPDATE: I tried the free Wi-Fi at a different location, and the upload completed quickly, with none of those failure messages. So there's nothing wrong with the file or my S3 setup. I'm still hoping to find some configuration option that tells the CLI to keep retrying each part forever.
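In the meantime, the crude workaround I'm considering is just re-running the copy in a shell loop until it exits cleanly, though I assume each failed attempt restarts the whole transfer rather than resuming the parts that already went up:

# keep retrying the whole upload until the CLI exits with status 0
until aws s3 cp bigfile.tgz s3://mybucket; do
    echo "upload failed, retrying in 30 seconds..." >&2
    sleep 30
done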