How do you decouple storage and compute resources without losing the benefits of locally attached storage?

Question

Services like DynamoDB (just the first example that came to mind) dynamically scale read/write capacity (i.e. compute) as well as storage capacity.

This means that you can have a DynamoDB table terabytes in size, with 0 provisioned capacity on reads or writes. Importantly, you are also only paying for the storage, as no reads/writes are being done.

If DynamoDB nodes use locally attached storage (presumably they need to for latency reasons), what do they do with the idle CPUs of those nodes?

The motivation for this question is that I am currently running a data store on AWS EC2 instances, already on the instance types with the highest SSD capacity (i3 class). Storage capacity needs dramatically exceed compute/memory/network needs, so most of the nodes have idle CPUs, i.e. wasted money.
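To make the mismatch concrete, here is a rough sizing sketch. It assumes a fleet of identical nodes where the node count is set by whichever resource runs out first. The function name and all input numbers are hypothetical placeholders for illustration, not real workload figures or exact instance specs.

```python
import math

def fleet_utilization(data_tb, peak_vcpus, node_storage_tb, node_vcpus):
    """Size a fleet of identical nodes and report the idle CPU fraction.

    All parameters are illustrative assumptions:
      data_tb          total data to store, in TB
      peak_vcpus       vCPUs the workload actually needs at peak
      node_storage_tb  local SSD capacity per node, in TB
      node_vcpus       vCPUs per node
    """
    nodes_for_storage = math.ceil(data_tb / node_storage_tb)
    nodes_for_compute = math.ceil(peak_vcpus / node_vcpus)
    # With locally attached storage the two are coupled: you must buy
    # enough nodes to satisfy the larger of the two requirements.
    nodes = max(nodes_for_storage, nodes_for_compute)
    total_vcpus = nodes * node_vcpus
    idle_fraction = 1 - peak_vcpus / total_vcpus
    return nodes, idle_fraction

# Hypothetical storage-heavy workload: 300 TB of data, 128 vCPUs at peak,
# on nodes with 15 TB of local SSD and 64 vCPUs each.
nodes, idle = fleet_utilization(
    data_tb=300, peak_vcpus=128, node_storage_tb=15, node_vcpus=64
)
print(f"{nodes} nodes, {idle:.0%} of vCPUs idle")
```

With these made-up inputs, storage alone forces 20 nodes while compute would need only 2, which is the kind of gap described above.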

How do you provision storage and compute resources efficiently without losing the benefits of locally-attached storage? How do established systems like AWS DynamoDB do it?

DynamoDB runs on shared resources, so someone else will be using the CPUs. AWS controls the hardware, provisioning, and load balancing, so they can do whatever they like. If you need a lot of fast storage but not much CPU, then in AWS you tend to have to pay for everything. I think Google lets you mix and match more, or you can buy hardware. — Tim, Feb 24 '19 at 23:54
@Tim Hey, thanks for the comment. Do you know of any literature describing how multi-tenancy is used to utilize idle CPUs in this scenario? Would it be something like Kubernetes? — cozos, Feb 25 '19 at 19:13
Nope, sorry. It just seems logical to ensure your resources are fully utilised. — Tim, Feb 25 '19 at 19:18
In addition to the possibility of using multi-tenancy, DynamoDB could also run on some non-public instance type that is optimized for DynamoDB. That being said, in some circumstances AWS EBS can perform better than locally attached storage, and it's possible that DynamoDB has figured out some way to take advantage of that to completely decouple their storage and compute. You might also be interested in how AWS Aurora works, since they say that they have separated their storage and compute. https://aws.amazon.com/blogs/database/introducing-the-aurora-storage-engine/ — Matthew Pope, Mar 27 '19 at 17:26


0 Answers