We were performing a test deployment of an application, that utilizes DynamoDB for persistency. A number of tables was created in the us-east region. Then we ran some tests against the application, that resulted in a significant number of writes and reads of those tables, exceeding the throughput thresholds. All of a sudden, though, the requests to the DynamoDB stopped coming through at all from that particular machine. We recreated the tables in the eu-west region and ran the tests again. It worked for some time, but in the morning it was discovered, that the same thing happened to the eu-west installation, but at the same time, the requests against the us-west one started coming through.
There's more, after a bit of investigation, it was discovered, that if, at the time, when all requests against some region failed, we could not even open a connection to the DynamoDB endpoint for that region (basically, "wget https://dynamodb.us-west-1.amazonaws.com" failed with a timeout).
Even more, at the time, when we could not connect to a particular DynamoDB endpoint, all other machines could do that just fine. Even the ones, that were in the same subnet with the affected machine and behind the same NAT (therefore, sharing its source IP address!).
All the machines, that I am talking about are actually EC2 instances so there's no real hardware involved on our side.
Any idea, what could be wrong?
We didn't touch the network configuration for the duration of the tests. Could it be some form of throttling that we were experiencing?