
I'm wondering about the rationale for using ElastiCache/SimpleQueue vs just having "Cache" and "Queue" tables inside of DynamoDB respectively.

It seems that the network latency to the Cache/Queue services would outweigh a lot of the performance gains, and that having EC2 treat DynamoDB as its cache/queue service would offer similar latency and throughput (since DynamoDB promises consistently low latency under any load).

Is it mainly about the price of DynamoDB vs. the other services under load?

Does anyone have any rough latency numbers comparing Dynamo with ElastiCache/SQS?

Are there other more important considerations that I'm missing which justify the additional complexity?

Thanks.

Scott Klarenbach

3 Answers


We're using DynamoDB and ElastiCache Redis for different reasons.

DynamoDB:

  • Has a query language that can express more complex conditions (greater than, between, etc.) - see the boto3 sketch after this list
  • Is reachable via an external internet-facing API (other regions are reachable without any changes or infrastructure of your own)
  • Permissions based on tables or even rows are possible
  • Scales in terms of data size to infinity
  • You pay per request -> a low request volume means a smaller bill, a high request volume means a higher bill
  • Reads and writes are priced differently
  • Data is stored redundantly by AWS across multiple facilities
  • DynamoDB is highly available out-of-the-box
  • Autoscaling is available in the service itself
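
To make the query-language point concrete, here's a minimal boto3 sketch of the kind of conditional read DynamoDB can do server-side; the table name, key names and values are made up for illustration:

    import boto3
    from boto3.dynamodb.conditions import Key

    dynamodb = boto3.resource('dynamodb', region_name='us-east-1')
    table = dynamodb.Table('events')  # hypothetical table: hash key 'user_id', range key 'ts'

    # "between" on the sort key - something a plain key/value GET cannot express
    response = table.query(
        KeyConditionExpression=Key('user_id').eq('u-123') & Key('ts').between(1600000000, 1600003600)
    )
    items = response['Items']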

ElastiCache Redis:

  • Simple query language - no complex features
  • Is not reachable from other regions out-of-the-box.
  • You're always limited by the amount of memory (or the sum across all primary instances in a cluster)
  • Sharding over multiple instances is only possible within your application - Redis doesn't do anything here (Redis Cluster helps, but the sharding logic still lives in the driver/SDK your application uses - see the short client sketch after this list) - scale-in and scale-out is not possible without downtime at the moment
  • You pay per instance regardless of load or request volume.
  • If you want data redundancy you need to set up replication yourself (not possible between different regions)
  • You need to use replicas for high availability
  • No autoscaling is available (see the point about scaling above)
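
To illustrate the sharding point: with Redis Cluster the key-to-slot routing happens in the client library. A rough sketch with a recent redis-py (the hostname is a placeholder; the same idea applies to the older redis-py-cluster package):

    from redis.cluster import RedisCluster

    # The client computes the hash slot for each key and routes the command to the
    # right node - the "sharding logic inside the driver" mentioned above.
    rc = RedisCluster(host='my-cluster.example.com', port=6379, decode_responses=True)
    rc.set('user:42:profile', 'cached-json')
    print(rc.get('user:42:profile'))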

So our setup most of the time is: simple, high-request-volume caches in Redis, backed by DynamoDB as the permanent, durable storage. This keeps costs down, because the pay-per-instance model of Redis effectively gives us a discount on our reads, while we still get the redundancy of DynamoDB and can even use the DynamoDB query language for more complex stuff (if we need it).
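
A minimal sketch of that pattern, assuming a Redis endpoint, a DynamoDB table named 'items' with hash key 'id', and an illustrative TTL (all of these names are placeholders, not part of the answer above):

    import json
    import boto3
    import redis

    cache = redis.Redis(host='my-redis.example.com', port=6379, decode_responses=True)
    table = boto3.resource('dynamodb').Table('items')

    def get_item(item_id, ttl=300):
        cached = cache.get(f'item:{item_id}')
        if cached is not None:
            return json.loads(cached)                     # cache hit: served from memory
        resp = table.get_item(Key={'id': item_id})        # cache miss: read the durable copy
        item = resp.get('Item')
        if item is not None:
            # default=str sidesteps DynamoDB's Decimal types for this sketch
            cache.setex(f'item:{item_id}', ttl, json.dumps(item, default=str))
        return item

    def put_item(item, ttl=300):
        table.put_item(Item=item)                         # DynamoDB is always written (durable)
        cache.setex(f"item:{item['id']}", ttl, json.dumps(item, default=str))  # keep the cache warm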

Hope that helps!

Update: With the announcement of Amazon DynamoDB Accelerator (https://aws.amazon.com/de/dynamodb/dax/) we're switching over to DAX, as it is (in the end) exactly what we were doing with the combination of DynamoDB and Redis. DAX is fully managed by AWS, lets us keep using the DynamoDB query language in our application, and still gives us the benefits of a write-through cache like Redis.
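
For a rough idea of what that switch looks like in application code: the amazondax client exposes a boto3-style resource, so reads and writes stay unchanged. The endpoint URL and table name below are placeholders - check the DAX client documentation for the exact usage in your language:

    import boto3
    from amazondax import AmazonDaxClient

    # before: table = boto3.resource('dynamodb').Table('items')
    dax = AmazonDaxClient.resource(endpoint_url='dax://my-cluster.abc123.dax-clusters.us-east-1.amazonaws.com')
    table = dax.Table('items')

    table.get_item(Key={'id': 'u-123'})   # served from the DAX cache when warm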

Osterjour
  • Very helpful, thanks. What I do not understand is how you back up Redis with DynamoDB: Is this a feature of AWS? Or when and how do you create the backup? Thanks, if you can clarify that! – badera Aug 23 '17 at 19:43
  • My "backed by" isn't meant as a backup in the traditional sense. We're actually writing to DynamoDB all the time and reading from Redis first. So even if Redis loses data, we still have it available in DynamoDB. With the use of DAX (https://aws.amazon.com/de/dynamodb/dax/) this use case has gotten a lot easier now! – Osterjour Aug 25 '17 at 09:26

The main reason we use ElastiCache rather than DynamoDB is the speed - you get sub-1 ms round-trip latency for small objects. The box is really close to your EC2 machine, and memory is that much faster than disk, even SSD.
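
If you want rough numbers for your own setup, it's easy to measure from an EC2 instance in the same VPC; results vary a lot with instance type, region and item size, so treat this as a sketch (endpoints, table and key names are placeholders):

    import time
    import boto3
    import redis

    r = redis.Redis(host='my-redis.example.com', port=6379)
    table = boto3.resource('dynamodb').Table('items')

    def time_it(label, fn, n=100):
        start = time.perf_counter()
        for _ in range(n):
            fn()
        print(f'{label}: {(time.perf_counter() - start) / n * 1000:.2f} ms avg')

    time_it('redis GET', lambda: r.get('bench:key'))
    time_it('dynamodb GetItem', lambda: table.get_item(Key={'id': 'bench-key'}))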

There could also be a cost advantage given the different pricing models, although I haven't looked into that in much detail.

Maximilian
  • Is it still relevant? Didn't DAX change that? – dmigo Jun 13 '19 at 14:42
  • 1
    DAX will solve much of the latency issues, yes. I'm not sure on the exact speed differences - for small objects network will be the major contributor to latency. – Maximilian Jun 13 '19 at 17:52

Redis/memcached are in-memory stores and should generally be faster than DynamoDB for cache/queue-type data. They also have handy additional features like expiring keys, Pub/Sub in Redis, etc. that DynamoDB may not have.
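
A small redis-py sketch of those extras - automatic key expiry and Pub/Sub; the hostname and key/channel names are placeholders:

    import redis

    r = redis.Redis(host='my-redis.example.com', port=6379, decode_responses=True)

    r.setex('session:abc', 60, 'payload')   # key expires automatically after 60 seconds
    print(r.ttl('session:abc'))             # remaining time to live

    p = r.pubsub()
    p.subscribe('jobs')                     # simple fan-out messaging, no extra service needed
    r.publish('jobs', 'resize-image-42')
    print(p.get_message(timeout=1))         # first message is the subscribe confirmation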

ceejayoz
  • 2
    Thanks. I realize the cache is in memory but I've read that one can expect at least a ~10ms roundtrip to hit the cache and come back, which makes the performance characteristics the same as Dynamo. I guess you're right that you might want specific features in memcached not avail in dynamo. But for me, the main advantage of a RAM cache is the orders of magnitude performance increase over a durable store, which doesn't seem to apply here. – Scott Klarenbach Apr 02 '15 at 17:04