5

How can I know how many IOPS I need my storage to deliver for my overloaded Linux server?

I have a server and I know that storage is its bottleneck. I would like the bottleneck not to be storage, so I need to size the storage array's performance, that is, buy an array that delivers more IOPS than I need.

How can I know, given some system IO statistics or other information, how to size my storage performance (what to buy) so that it delivers more than I need, taking the worst case scenario (heavy IO contention) as the reference?

For example, the iostat utility can give some interesting statistics about IO usage. Can I use that information to know what hardware performance I need? How?

This is a general question: the actual workload type or software does not matter (it could be a database, for example). I just need to be able to make a decision based on current IO statistics and usage.

Totor
  • 3
    I'm not sure how we can even hope to answer without knowing what services you're trying to run, how many users, etc... – derobert Feb 27 '14 at 18:29
  • 1
    @derobert I edited my question to clarify it. Is that better for you? – Totor Feb 27 '14 at 23:16

3 Answers

4

The iostat command will show you the information you want. Just run:

iostat 1

The output will be something like this:

Device:            tps    kB_read/s    kB_wrtn/s    kB_read    kB_wrtn
sda              42.00       128.00        84.00        128         84

The tps value is transactions per second, which is the same as IOPS.

The trailing 1 makes it refresh the statistics every second.

You usually need to have the sysstat package installed on your Linux distribution for iostat to be available.
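
If you want read and write IOPS separately, rather than the single aggregate tps figure, one option is to sample /proc/diskstats yourself. Below is a minimal Python sketch, not a polished tool: the device name sda and the one-second interval are placeholder assumptions, and it relies only on the documented /proc/diskstats field layout (sectors there are always 512 bytes).

import time

DEVICE = "sda"      # placeholder device name, adjust to your system
INTERVAL = 1.0      # sampling interval in seconds

def read_counters(device):
    # /proc/diskstats fields after (major, minor, name):
    # 0: reads completed, 2: sectors read, 4: writes completed, 6: sectors written
    with open("/proc/diskstats") as f:
        for line in f:
            fields = line.split()
            if fields[2] == device:
                v = [int(x) for x in fields[3:]]
                return v[0], v[2], v[4], v[6]
    raise ValueError("device %s not found" % device)

r1, rs1, w1, ws1 = read_counters(DEVICE)
time.sleep(INTERVAL)
r2, rs2, w2, ws2 = read_counters(DEVICE)

print("read IOPS:  %.0f" % ((r2 - r1) / INTERVAL))
print("write IOPS: %.0f" % ((w2 - w1) / INTERVAL))
print("read kB/s:  %.0f" % ((rs2 - rs1) * 512 / 1024.0 / INTERVAL))
print("write kB/s: %.0f" % ((ws2 - ws1) * 512 / 1024.0 / INTERVAL))

This should give roughly the same per-device read/write rates as iostat's extended report (iostat -x), without depending on the package being installed.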

phoops
  • The problem with `transactions per second` is that one transaction does not always transfer the same amount of data as another, but the IOPS numbers given by hardware vendors are for a given block size (always the same amount of data). So I can't rely on `tps` to size my needs in IOPS. What do you think? – Totor Feb 27 '14 at 23:21
  • Furthermore, even if we take `tps` to be the same thing as IOPS, if storage is my bottleneck (i.e. it's doing the best it can, but that's not enough), I suppose that `iostat` will give me a `tps` number that is less than what I actually need. It would probably give me the maximum `tps` (or IOPS) that my storage is able to deliver, right? – Totor Feb 27 '14 at 23:31
  • @Totor If you want IOPS per second at a certain block size, then you don't want IOPS, you want bandwidth (e.g. bytes per second), a completely different measurement. – phemmer Feb 28 '14 at 00:04
  • @Patrick Let's say *random read bandwidth* and *random write bandwidth* then, if you prefer (and a 4 kB block size). It's the same thing indeed. Also, note that in IOPS, **PS** already means "per second". – Totor Feb 28 '14 at 00:13
  • 1
    The other problem is that this measure caps out at what the current storage can do. – Basil Feb 28 '14 at 04:07
  • *Basil* is right. This gives me **current** statistics, whereas I am trying to answer the question "how many IOPS would have been enough?". It does not answer that question. – Totor Mar 02 '14 at 23:51
4

If you know you're storage bound, then benchmarks on your server won't definitively tell you how much you need. They can only tell you how fast you can go while subject to the limited storage. In order to properly get the answer you're looking for, you need to, if possible, isolate the different ways you can be storage throttled and test them independently.

IOPS is of course the easy limit that everyone talks about, because disks are bad at seeking and databases like to seek. These days, with cache and SSD, small-block random-read IO is a lot easier than it used to be. A small tier of SSD and a large cache will probably ensure that if IOPS (for small-block "seek" type IO) really is your bottleneck, you won't be subject to it any more. Be careful about these benchmarks, though: you'll read all kinds of unrealistic numbers as people measure the number of IOs they can do straight to unmirrored cache. That's not going to help your Linux server.

Another type of storage limit is bandwidth, or throughput. This one is hard to isolate, but if you know how much data you're trying to read or write and you know how long it takes you now, pick a new time target, and that'll be your new number. For example: if you observe your application spending 4 hours to do a large backup or something, and at the end of it, it's moved 9 TB, that tells you your current throughput limit: about 650 MB/s. If you want to move 18 TB in that time, you need 1300 MB/s. For the most part, ethernet, fibre, and SAS can all be configured to go faster than storage hardware. The storage's ability to keep that transfer layer full is usually the real bottleneck. You want to look at the number of front end ports, and the benchmark numbers with cache mirroring turned on (to ensure there's no bottleneck between controllers mirroring cached writes).
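
As a quick sanity check on that arithmetic, here is a throwaway Python sketch using binary units and the 9 TB / 4 hour figures from the example above (the 18 TB target is likewise just the example's number):

# Worked example from the paragraph above.
hours = 4
moved_tb = 9        # data moved today, in TiB
target_tb = 18      # data you want to move in the same window

seconds = hours * 3600
current_mb_s = moved_tb * 1024 * 1024 / seconds    # TiB -> MiB per second
needed_mb_s = target_tb * 1024 * 1024 / seconds

print("current throughput: ~%.0f MB/s" % current_mb_s)   # ~655 MB/s
print("needed throughput:  ~%.0f MB/s" % needed_mb_s)    # ~1310 MB/s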

Lastly, you can be limited by bad storage configuration in terms of SCSI queues. This is not ridiculously common, but it shows up as being unable to push your storage hardware as fast as it should go. If you are seeing 500 ms latency on writes from the host, but your storage reports 3 ms with 100% cache hits, that can be an issue with insufficient SCSI queue depth on the target. Basically, the SCSI initiator is waiting up to 500 ms for a slot to free up in the queue it uses to issue requests. You want to ask your storage vendor for best practices on host queue depth settings and fan-out ratio for this.
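
On Linux you can at least check what the host side is currently configured for. A small sketch follows, assuming ordinary SCSI disks named sd* and that the standard sysfs attributes are exposed; whether the values are sensible is exactly what the vendor's best-practice guide should tell you.

import glob, os

# Per-LUN queue depth (SCSI device attribute) and block-layer request queue size.
for dev in sorted(glob.glob("/sys/block/sd*")):
    name = os.path.basename(dev)
    qd_path = os.path.join(dev, "device", "queue_depth")
    nr_path = os.path.join(dev, "queue", "nr_requests")
    qd = open(qd_path).read().strip() if os.path.exists(qd_path) else "n/a"
    nr = open(nr_path).read().strip() if os.path.exists(nr_path) else "n/a"
    print("%s: queue_depth=%s nr_requests=%s" % (name, qd, nr))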

I hope this helps, I know it's not as simple an answer as you were hoping for.

Basil
  • Thanks. Actually, I was not really hoping for a simple answer (like *edvinas.me*'s), because I know it is not such a simple problem, and in this regard your answer has real *thinking* behind it, which I appreciate. *You're not really answering about IOPS though; saying "SSD will solve it" is a little too easy.* I would have liked to know how to measure my need (i.e. how many IOPS would have been enough). – Totor Mar 02 '14 at 23:45
  • In your situation, it's impossible to accurately measure your need for IOPS, so I don't have much choice. Generally speaking, any workload that is IOPS bound will be best served by SSD, but that's actually a pretty rare circumstance (small transactional databases, for example). – Basil Mar 03 '14 at 02:28
  • Under what situation would it have been possible to "accurately measure my need for IOPS" then? – Totor Mar 05 '14 at 10:09
  • For example: if you're looking at changing storage and you're not currently storage bound, or if you're upgrading or changing the application in a way that you know the percent increase of IOPS it'll require and you're not currently storage bound. In general, though, if you are storage bound, the only quantifiable answer you can give is "something more than we currently get". Everything after that is guesswork. If you're interested, I could go on at *length* about variously accurate ways to estimate your requirements... – Basil Mar 05 '14 at 17:36
  • *"...in a way that you know the percent increase of IOPS it'll require"*: sure, **if** I know how much IOPS I'm going to need, it means I answered my question that is... knowing how much IOPS I need. **if** only... Please tell me rather how I can estimate my requirements: that's what I'm interested in. – Totor Mar 05 '14 at 23:36
  • Estimation always starts with the application sizing. Whoever decided what hardware is needed to do this job might have sizing rules you can use (like for every 100 users of our product, you want 1000 IOPS and 80MB/s). Is that the case? What kind of application is this running? – Basil Mar 06 '14 at 04:32
  • My question is general, but let's consider a virtualization cluster for example, i.e. various types of workloads. – Totor Apr 02 '15 at 22:55
1

If you can vary the load on the application from 1 TPS to well past the point of bottlenecking, you can build a model of the relationship between TPS, I/O operation rate, and bandwidth.

Let's say:

  1 TPS causes   6 IOs and   2 KB of transfer, per second
 10 TPS causes  16 IOs and  11 KB
100 TPS causes 106 IOs and 101 KB
  but
200 TPS causes 107 IOs and 102 KB
300 TPS causes 107 IOs and 102 KB

1) Then you have a bottleneck at 100 TPS offered, plus

2) there is an overhead of 5 IOs and 1 KB, after which each transaction uses 1 IO and 1 KB of transfer

Now:

  1. is the limit of your existing device,
  2. is your budget, which you use for figuring how much to provision for each TPS you want to handle (see the fitting sketch just below)
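
A minimal sketch of how you might extract that budget from measurements (Python with numpy assumed; the sample points are the illustrative numbers from the table above, and in practice you would only fit points taken below the saturation knee):

import numpy as np

# (TPS, IOs per second, KB per second) samples taken below the knee.
samples = [
    (1,   6,   2),
    (10,  16,  11),
    (100, 106, 101),
]
tps  = np.array([s[0] for s in samples], dtype=float)
iops = np.array([s[1] for s in samples], dtype=float)
kbps = np.array([s[2] for s in samples], dtype=float)

# Linear fit: cost = per_tps * TPS + fixed_overhead
io_per_tps, io_overhead = np.polyfit(tps, iops, 1)
kb_per_tps, kb_overhead = np.polyfit(tps, kbps, 1)

target_tps = 300   # hypothetical load you want to be able to serve
print("need roughly %.0f IOPS and %.0f KB/s for %d TPS" % (
    io_per_tps * target_tps + io_overhead,
    kb_per_tps * target_tps + kb_overhead,
    target_tps))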

If a device says it's good for 10,000 IOPS and 100 KB/s, only the latter is meaningful to you. If it says it's good for 100 IOPS and 10,000 KB/s, only the former is meaningful. Sometimes it will bottleneck on IOPS initially, and on bandwidth in large configurations.
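
Continuing the sketch, picking which rating binds for a candidate device is just a min() over the two budgets (the vendor ratings below are made-up numbers, and the per-transaction costs are the ones from the fit above):

# Hypothetical vendor ratings for a candidate array.
rated_iops = 10000
rated_kb_s = 100

# Per-transaction cost from the fit above: about 1 IO and 1 KB each.
io_per_tps = 1.0
kb_per_tps = 1.0

max_tps_by_iops = rated_iops / io_per_tps
max_tps_by_kb_s = rated_kb_s / kb_per_tps

# Whichever rating you exhaust first is the one that matters for this workload.
print("this device tops out around %.0f TPS" % min(max_tps_by_iops, max_tps_by_kb_s))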

To measure this, do lots of individual tests, with repetitions, and plot the results on a graph: your eyes are better at pictures than at tables of numbers.

The throughput graph should start out as a slope, something like /, then abruptly level off and go horizontal, or sometimes bend back down again. If you plot response time, it will look like _/. The bends will line up at around the bottleneck load.

And yes, it will be a scatterplot of dots approximating those curves, not nice straight lines (;-))
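
For what it's worth, here is a rough Python sketch of that picture (made-up sample data; matplotlib assumed to be available):

import matplotlib.pyplot as plt

# Made-up measurements: offered load versus achieved throughput and response
# time, shaped like the "/ then flat" and "_/" curves described above.
offered  = [10, 25, 50, 75, 100, 150, 200, 300]
achieved = [10, 25, 50, 74, 96, 99, 98, 97]     # TPS actually completed
resp_ms  = [5, 5, 6, 8, 15, 60, 140, 300]       # average response time

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.plot(offered, achieved, "o-")
ax1.set_ylabel("achieved TPS")
ax2.plot(offered, resp_ms, "o-")
ax2.set_ylabel("response time (ms)")
ax2.set_xlabel("offered load (TPS)")
plt.savefig("knee.png")   # the bend in both curves marks the bottleneck load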

--dave

davecb
  • I thought TPS was the same thing as IOPS. What is the difference then, and how can I measure IOPS? (I know TPS is available through `iostat`.) – Totor Mar 02 '14 at 23:57
  • Transactions are what the user sees; IOs are the disk operations required to do the work. For example, a bank transaction is a debit followed by a credit, with each requiring a read to get the two accounts, followed by a write of the updates to both accounts. In that case, 1 transaction takes 4 IOs, so 1 transaction per second (TPS) will therefore require 4 IO operations per second (IOPS), and so on. – davecb Mar 04 '14 at 15:08
  • I think you are confusing business transactions and Linux TPS (= transfers per second). Sure, in RAID arrays there is a difference between frontend and backend IOPS (due to things like parity write penalties), but that is not what you're saying. – Totor Mar 04 '14 at 22:43