2

I'm using an AWS i3.2xlarge EC2 instance and facing a limit of 10K IOPS. I wonder why that is?

I only write to the NVMe instance storage. No matter what I do, I can't get past that limit. I thought I3-class instances could go way above that?

Is there something I am missing? Do I need a larger instance to reach higher IOPS? Before I try going higher, I would like to understand whether this is the typical limit for those instances or something in my setup.

Was anyone able to achieve higher throughput with that type of instance? Why is the limit 10K IOPS? What's the reason for it? And how to go above it?

Note: I'm running a database application and making many update requests on it.

I-P-X
MaatDeamon
  • I dunno... This feels like a "call Amazon" or check with Amazon support thing, but maybe that's not how people approach systems these days. – ewwhite Nov 17 '18 at 04:30
  • Yeah, this is not something that random strangers on the internet can help you with. This is a matter of knowing your platform better. Search documentation and reach out to official support channels for canonical answers – Wesley Nov 17 '18 at 04:38

3 Answers

6

Are you 100% sure you're writing to the local SSD storage?

It sounds like you may be accidentally using the EBS volume instead; a 10K IOPS limit would suggest that.

How to check: In Amazon Linux 2 on i3.2xlarge the NVMe instance storage is /dev/nvme0n1 while EBS volumes are /dev/xvd*. Check what device is mounted at the directory you're using / benchmarking:

[root@ip-172-31-41-210 ~]# mount
/dev/xvda1 on / type xfs (rw,noatime,attr2,inode64,noquota)
/dev/xvdba1 on /ebs-storage type ext4 (rw,relatime,data=ordered)
/dev/nvme0n1 on /local-storage type ext4 (rw,relatime,data=ordered)

Here I've got a second EBS volume mounted as /ebs-storage and the NVMe instance storage as /local-storage.
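The check above can be sketched as a pair of small shell helpers. This is only a sketch: `classify_device` and `backing_device` are hypothetical names, and the device-name mapping is the one described in this answer for Amazon Linux 2 on i3, so it may not hold on other AMIs or instance families.

```shell
# Classify a block device path using the naming described in this answer
# (Amazon Linux 2 on i3: NVMe instance store = /dev/nvme*, EBS = /dev/xvd*).
# Assumption: other AMIs or instance families may name devices differently.
classify_device() {
  case "$1" in
    /dev/nvme*) echo "nvme-instance-store" ;;
    /dev/xvd*)  echo "ebs" ;;
    *)          echo "unknown" ;;
  esac
}

# Print the device that backs a given directory.
# findmnt ships with util-linux; --target finds the mount containing the path.
backing_device() {
  findmnt -n -o SOURCE --target "$1"
}
```

For example, `classify_device "$(backing_device /local-storage)"` should report the NVMe disk if your data directory really lives on the instance storage.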

Note that the NVMe disk must be explicitly formatted (mkfs) and mounted before it can be used! By default the instance starts with just an EBS-backed root disk and the fast NVMe disk is not used!
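As a rough sketch of the format-and-mount step, the helper below only prints the commands so you can review them first. `print_nvme_setup` is a hypothetical name; the default device and mount point are taken from the mount output above, and ext4 is assumed because that is the filesystem shown there.

```shell
# Print (do not run) the commands to format and mount an NVMe instance-store disk.
# Arguments: $1 = device, $2 = mount point. The defaults come from the mount
# output shown in this answer; ext4 matches the filesystem used there.
# WARNING: mkfs destroys any data on the device, so review before running.
print_nvme_setup() {
  dev="${1:-/dev/nvme0n1}"
  mnt="${2:-/local-storage}"
  echo "mkfs.ext4 $dev"
  echo "mkdir -p $mnt"
  echo "mount $dev $mnt"
}
```

Run `print_nvme_setup` to see the three commands, then execute them as root once you are sure the device name is right.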

Hope that helps :)

MLu
  • The machine was mounted by my ops. Will have to check that out – MaatDeamon Nov 17 '18 at 10:12
  • @MaatDeamon added instructions about checking the storage type. – MLu Nov 17 '18 at 11:08
  • @MaatDeamon how did you go with the checks? If the response helped resolve your problem please upvote and accept it. It's the way to say thanks to people who spent their free time helping others on ServerFault. Thanks! :) – MLu Nov 21 '18 at 00:27
  • Did the test and all but still not able to figure out the bottleneck – MaatDeamon Nov 21 '18 at 12:39
2

The AWS documentation says the i3.2xlarge can do 412,500 random read IOPS and 180,000 write IOPS. In comparison, the i3.16xlarge can do 3.3 million random read IOPS and 1.4 million write IOPS.

It also says:

As you fill the SSD-based instance store volumes for your instance, the number of write IOPS that you can achieve decreases. This is due to the extra work the SSD controller must do to find available space, rewrite existing data, and erase unused space so that it can be rewritten. This process of garbage collection results in internal write amplification to the SSD, expressed as the ratio of SSD write operations to user write operations. This decrease in performance is even larger if the write operations are not in multiples of 4,096 bytes or not aligned to a 4,096-byte boundary. If you write a smaller amount of bytes or bytes that are not aligned, the SSD controller must read the surrounding data and store the result in a new location. This pattern results in significantly increased write amplification, increased latency, and dramatically reduced I/O performance.
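To make the 4,096-byte rule from that passage concrete, here is a minimal sketch of the arithmetic. `is_4k_aligned` and `round_up_4k` are hypothetical helper names, not part of any AWS tooling.

```shell
# Succeeds (exit status 0) when both the byte offset and the write size are
# multiples of 4,096, the boundary named in the AWS passage quoted above.
is_4k_aligned() {
  offset="$1"
  size="$2"
  [ $(( offset % 4096 )) -eq 0 ] && [ $(( size % 4096 )) -eq 0 ]
}

# Round a size in bytes up to the next multiple of 4,096.
round_up_4k() {
  echo $(( ( $1 + 4095 ) / 4096 * 4096 ))
}
```

A write of 4,096 bytes at offset 8,192 passes the check, while the same write at offset 512 does not; a database page size that is not a multiple of 4,096 would fall into the slow case the passage describes.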

I suggest running a benchmarking tool to see how many IOPS you can reach (I can't recommend a specific tool, sorry). After that, it's probably a case of optimising your database for the instance. There are probably articles on optimising your software for the i3 instances somewhere on the internet.
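This answer does not name a benchmarking tool; fio is one widely used option, so the sketch below assumes it is installed. `fio_randwrite_cmd` is a hypothetical helper that only builds the command line, `/local-storage/fio.test` is a hypothetical test file, and the queue depth, job count, size, and runtime are arbitrary starting values rather than tuned recommendations.

```shell
# Build (and print) an fio command line for a 4 KiB random-write test.
# Assumptions: fio is installed, and the target file lives on the filesystem
# you want to measure; /local-storage/fio.test is a hypothetical path.
fio_randwrite_cmd() {
  target="${1:-/local-storage/fio.test}"
  echo "fio --name=randwrite --filename=$target --rw=randwrite --bs=4k" \
       "--direct=1 --ioengine=libaio --iodepth=32 --numjobs=4 --size=1G" \
       "--runtime=60 --time_based --group_reporting"
}
```

Print the command with `fio_randwrite_cmd`, run it, and compare the reported write IOPS with what your database achieves. If fio goes far beyond 10K on the same path, the bottleneck is more likely in the database configuration than in the disk.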

Otherwise, the advice to contact AWS support is sound. Their support is excellent. You have to pay for it, though the developer level isn't that expensive.

Tim
2

On AWS the IOPS rate is limited per-volume.

If you need higher throughput, it's best to use several smaller volumes rather than one huge volume. You can then combine the volumes, e.g. with software RAID, or configure your software to use multiple data stores.
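A minimal sketch of the software RAID option, assuming mdadm and RAID 0 striping (this answer does not specify a RAID level). `print_raid0_cmd` is a hypothetical helper that only prints the command, and the device names you pass in are placeholders for your own volumes.

```shell
# Print (do not run) an mdadm command that stripes the given devices as RAID 0.
# The device names passed in are hypothetical; substitute your own volumes.
# WARNING: creating an array overwrites data on the member devices, and
# RAID 0 has no redundancy.
print_raid0_cmd() {
  echo "mdadm --create /dev/md0 --level=0 --raid-devices=$# $*"
}
```

For example, `print_raid0_cmd /dev/xvdf /dev/xvdg` prints the command for a two-volume stripe, which you would then format and mount as `/dev/md0`.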

Fer Dah