5

We were doing a test deployment of an application that uses DynamoDB for persistence. A number of tables were created in the us-east region. We then ran some tests against the application, which resulted in a significant number of reads and writes against those tables, exceeding the throughput thresholds. All of a sudden, though, requests to DynamoDB stopped getting through at all from that particular machine. We recreated the tables in the eu-west region and ran the tests again. It worked for some time, but in the morning we discovered that the same thing had happened to the eu-west installation, while at the same time requests against the us-west one had started getting through again.

There's more: after a bit of investigation, we discovered that while all requests against a region were failing, we could not even open a connection to the DynamoDB endpoint for that region (basically, "wget https://dynamodb.us-west-1.amazonaws.com" failed with a timeout).

What's more, while we could not connect to a particular DynamoDB endpoint from that machine, all other machines could connect just fine, even the ones that were in the same subnet as the affected machine and behind the same NAT (and therefore sharing its source IP address!).
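The wget check can be turned into a small probe that is run side by side on the affected machine and on a healthy one. This is a minimal sketch in Python, assuming the usual regional endpoint naming (`dynamodb.<region>.amazonaws.com`); the function names and the 5-second timeout are illustrative choices, not part of our actual setup:

```python
import socket

def endpoint_host(region):
    """Build the DynamoDB endpoint hostname for a region (assumed naming scheme)."""
    return "dynamodb.%s.amazonaws.com" % region

def can_connect(host, port=443, timeout=5.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections and DNS failures alike.
        return False

def probe(regions, timeout=5.0):
    """Map each region name to True/False depending on whether its endpoint is reachable."""
    return {r: can_connect(endpoint_host(r), timeout=timeout) for r in regions}
```

Calling `probe(["us-west-1", "eu-west-1"])` on both machines at the same moment shows whether the failure is specific to one instance or to the shared NAT address.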

All the machines I am talking about are EC2 instances, so there's no real hardware involved on our side.

Any idea what could be wrong?

We didn't touch the network configuration for the duration of the tests. Could it be some form of throttling that we were experiencing?

shylent

2 Answers

1

Have you tried rebooting your router? The fact that some of the servers behind your NAT gateway work but others do not leads me to believe that the problem is on your end, not Amazon's.

If it's a consumer-grade device, try updating the firmware. What brand/model is it?

jamieb
  • All of the machines (including the one where NAT is performed) are EC2 instances, so there's no "device" to reboot or update the firmware on. – shylent Jul 23 '12 at 03:25
  • When DynamoDB throttles, it returns HTTP 400 errors; it doesn't just drop packets. How reliably are you able to reproduce the problem? – jamieb Jul 23 '12 at 23:39
  • Very reliably indeed. I've just made a couple hundred requests against the eu-west-1 endpoint and there it is: I am timing out while trying to connect to it from that machine. – shylent Jul 24 '12 at 06:53
  • 1
    Have you taken a look at your CloudWatch metrics for DynamoDB? Are you running into the limits of your provisioned capacity? Also, what library/SDK are you using to connect to DynamoDB? – jamieb Jul 25 '12 at 14:59
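To tell the two failure modes in these comments apart in client code: a throttled request still gets an HTTP response (status 400 with a `ProvisionedThroughputExceededException` error type in the JSON body), whereas a connect timeout never gets as far as a response. A rough sketch in Python; `classify_failure` and its return labels are made up for illustration and are not part of any SDK:

```python
import json

THROTTLE_SUFFIX = "ProvisionedThroughputExceededException"

def classify_failure(http_status, body):
    """Label a failed DynamoDB call.

    http_status: int status code, or None if no HTTP response arrived
                 (connect timeout, dropped packets).
    body:        raw response body as text, or None.
    """
    if http_status is None:
        return "network"  # the request never reached the service
    if http_status == 400 and body:
        try:
            error_type = str(json.loads(body).get("__type", ""))
        except (ValueError, AttributeError):
            # Body was not JSON, or not a JSON object.
            error_type = ""
        if error_type.endswith(THROTTLE_SUFFIX):
            return "throttled"  # provisioned capacity exceeded
    return "other"
```

A result of "network" from only one machine, while others get normal responses, points away from table-level throttling.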
0

Have you checked the read/write capacity of your DynamoDB tables? Every table has a provisioned read/write capacity associated with it. If you reach the maximum capacity, it stops accepting connections. There is also a limit on how many times this read/write capacity can be updated in one day. Check that as well. I hope this helps.
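One way to check a table's provisioned capacity is the `DescribeTable` call, whose response also reports how many capacity decreases have been made today. A minimal sketch in Python, assuming the boto3 SDK is installed and credentials are configured (the question does not say which SDK is in use); `capacity_summary` and `fetch_capacity` are illustrative helper names:

```python
def capacity_summary(describe_table_response):
    """Extract provisioned throughput figures from a DescribeTable response dict."""
    throughput = describe_table_response["Table"]["ProvisionedThroughput"]
    return {
        "read_units": throughput["ReadCapacityUnits"],
        "write_units": throughput["WriteCapacityUnits"],
        "decreases_today": throughput.get("NumberOfDecreasesToday", 0),
    }

def fetch_capacity(table_name, region):
    """Call DescribeTable via boto3 (assumed SDK) and summarise the result."""
    import boto3  # imported lazily so capacity_summary works without the SDK
    client = boto3.client("dynamodb", region_name=region)
    return capacity_summary(client.describe_table(TableName=table_name))
```

Comparing these figures with the consumed-capacity metrics in CloudWatch shows whether the tests actually hit the provisioned limits.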

Shailesh Sutar