14

All other things being equal, how would a storage array's IOPS performance change if one used larger disks?

For example, take an array with 10 X 100GB disks.

Measure IOPS for sequential 256 KB block writes (or any IOPS metric).

Let's assume the resulting measured IOPS is 1000 IOPS.

Change the array for one with 10 X 200GB disks. Format with same RAID configuration, same block size, etc.

Would one expect the IOPS to remain the same, increase, or decrease? Would the change be roughly linear, i.e. increase by 2X or decrease by 2X (since I've increased the disk capacity by 2X)?

Repeat these questions with 10 X 50GB disks.

Edit: More Context

This question came out of a conversation among my sysadmin team, which is not well versed in all things storage (comfortable with many aspects of storage, but not the details of managing a SAN or the like). We are receiving a big pile of new NetApp trays that have double the per-disk capacity of our existing trays. The comment came up that the IOPS of the new trays would be lower simply because the disks were larger, and then a car analogy was offered to explain this. Neither comment sat well with me, so I wanted to run it by The Team, i.e. Stack-Exchange-land.

The car analogy was something about two cars with different acceleration but the same top speed running a quarter mile, and then changing the distance to a half mile. Actually, I can't remember the exact analogy, but since I found a similar one on the interwebz I figured it was probably a common IOPS analogy.

In some ways, the actual answer to the question doesn't matter that much to me, as we are not using this information to evaluate a purchase. But we do need to evaluate the best way to attach the trays to an existing head, and the best way to carve out aggregates and volumes.

JDS
  • I/O operations per second aren't going to go up if the disk capacity goes up - they're related to transfer rate end to end and disk I/O rate (and caching). What's the specific problem you're trying to solve? – EightBitTony Jul 01 '14 at 20:07
  • Is this hypothetical (and thus off-topic)? – mfinni Jul 01 '14 at 20:08
  • It doesn't really change... unless you're talking about limiting head movement across the platter via *short-stroking*... Or overprovisioned SSDs... – ewwhite Jul 02 '14 at 00:49
  • On a side note, larger disks usually contain more modern controllers, motors and heads, smaller disks usually just reuse the previous gen ones which are "good enough", so high capacity disks are often faster but not because they are larger but because they are better made. – Vality Jul 02 '14 at 09:40
  • @mfinni: Unfortunately cloudy services do exist that have an artificial restriction on IOPS based on the (virtual) disk size. See my answer for details. I've seen "devops" confused by this before. – dotancohen Jul 02 '14 at 10:29
  • @dotancohen - then that's an artificial limitation imposed by a specific service provider, and has no relevance to anyone that isn't their customer and could be changed at any time by the vendor. Bad example, in my opinion, of a scenario that's relevant to the intent of this site. – mfinni Jul 02 '14 at 12:44
  • The question was about storage arrays and larger disks, so clearly not "a virtual storage container that I'm renting from someone, 3000 miles away from me." – mfinni Jul 02 '14 at 12:45
  • @mfinni: I understand your point and agree with it 100%. I'm letting you know where the conflation between disk size and IOPS comes from. And it is not specific to AWS, I understand that the Rackspace cloud has a similar arrangement. This seems to be the norm for cloudy weather^h computing, and people are 'growing up' with it. – dotancohen Jul 02 '14 at 13:06
  • Sometimes it is surprising which questions I ask become popular(ish). I'm not sure which answer to give a green check to, though... – JDS Jul 02 '14 at 15:36
  • @JDS: I would say that there are a whole bunch of really informative answers and that I've learned a lot from this question. That is why the question became so popular. As to which answer to accept, I would say that there are two answers that directly _answer the question_ and one of them is [Ian's excellent answer](http://serverfault.com/a/609486/91213). – dotancohen Jul 02 '14 at 17:49

7 Answers

11

I know this is probably a hypothetical question... But the IT world really doesn't work that way. There are realistic constraints to consider, plus other things that can influence IOPS...

  • 50GB and 100GB disks don't really exist anymore. Think more: 72, 146, 300, 450, 600, 900, 1200GB in enterprise disks and 500, 1000, 2000, 3000, 4000, 6000GB in nearline/midline bulk-storage media.

  • There's so much abstraction in modern storage (disk caching, controller caching, SSD offload, etc.) that any differences would be difficult to discern.

  • You have different drive form factors, interfaces and rotational speeds to consider. SATA disks have a different performance profile than SAS or nearline SAS. 7,200RPM disks behave differently than 10,000RPM or 15,000RPM. And the availability of the various rotational speeds is limited to certain capacities.

  • Physical controller layout: SAS expanders and RAID/SAS controllers can influence IOPS, depending on disk layout, oversubscription rates, and whether the connectivity is internal to the server or in an external enclosure. Large numbers of SATA disks perform poorly on expanders and during drive error conditions.

  • Some of this can be influenced by fragmentation and the used capacity on the disk array.

  • Ever hear of short-stroking?

  • Software versus hardware RAID, prefetching, adaptive profiling...

What leads you to believe that capacity would have any impact on performance in the first place? Can you provide more context?

Edit:

If the disk type, form factor, interface and used-capacity are the same, then there should be no appreciable difference in IOPS. Let's say you were going from 300GB to 600GB enterprise SAS 10k disks. With the same spindle count, you shouldn't see any performance difference...

However, if the NetApp disk shelves you mention employ 6Gbps or 12Gbps SAS backplanes versus a legacy 3Gbps, you may see a throughput change in going to newer equipment.

ewwhite
  • I edited my original question to add context. The numbers I chose were not real-world, they were just to make hypothetical calculations easier. Also, most of your other comments fall into the "all things being equal" column. Assume that the *only* thing changing is the capacity of the individual disks – JDS Jul 02 '14 at 13:23
  • @JDS See my edit above. – ewwhite Jul 02 '14 at 13:32
10

One place where there is a direct relationship between disk size and IOPS is the Amazon AWS cloud and other "cloudy services". Two types of AWS services (Elastic Block Store and Relational Database Service) provide higher IOPS for larger disk sizes.

Note that this is an artificial restriction placed by Amazon on their services. There is no hardware-bound reason for it. However, I have seen devops types who are unfamiliar with unvirtualized hardware assume that this restriction also applies to desktop systems and the like. The disk size / IOPS relationship is a cloud marketing restriction, not a hardware restriction.
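To make the shape of that policy concrete, here is a minimal sketch of a size-based IOPS quota. The 3-IOPS-per-GB ratio and the floor/ceiling values are illustrative assumptions (loosely modelled on AWS gp2 volumes), not figures from any particular vendor's documentation:

```python
def provisioned_iops(volume_gb, iops_per_gb=3, floor=100, ceiling=10000):
    """Toy model of a cloud provider's size-based IOPS quota.

    The ratio, floor and ceiling are illustrative policy numbers, not
    hardware limits: the quota exists only because the provider says so.
    """
    return max(floor, min(volume_gb * iops_per_gb, ceiling))

for size_gb in (100, 200, 1000, 5000):
    print(f"{size_gb:>5} GB volume -> {provisioned_iops(size_gb):>5} IOPS quota")
```

The point of the sketch is that the quota is a function of the billing unit (GB), with no spindle, seek time or RPM anywhere in it.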

dotancohen
  • That's a good point. We're looking at performance SLAs for delivering performance and capacity to customers, and we're looking at using a tier-based model of 'IOPS per terabyte', the idea being that we can use that to inform our upgrade cycles: buying SSDs if the IOPS:TB ratio is high, and SATA if it's low. Not because of any array limits or constraints, but because we need to get a grip on cost vs. charging models. – Sobrique Jul 02 '14 at 10:38
  • Interesting. I didn't think of the *cloudy* context here. I guess that shows the perspective I'm coming from... – ewwhite Jul 02 '14 at 12:15
8

To answer your question directly: all other things being equal, there is no change whatsoever when the GB changes.

You don't measure IOPS with GB. You use the seek time and the latency.

I could rewrite it all here, but the examples below already cover it and I would simply be repeating them:

https://ryanfrantz.com/posts/calculating-disk-iops.html

http://www.big-data-storage.co.uk/how-to-calculate-iops/

http://www.wmarow.com/strcalc/

http://www.thecloudcalculator.com/calculators/disk-raid-and-iops.html
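For reference, the arithmetic those calculators perform for a single spindle is roughly the following; the seek times below are ballpark figures for each drive class, not measurements of any specific model:

```python
def disk_iops(avg_seek_ms, rpm):
    """Approximate random IOPS for one spindle: 1 / (avg seek + avg rotational latency).

    Note that capacity (GB) does not appear anywhere in this formula.
    """
    rotational_latency_ms = 0.5 * 60_000 / rpm  # on average, half a revolution
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)

# Ballpark seek times per drive class (assumed, not vendor-quoted):
for label, seek_ms, rpm in [("7.2k nearline", 8.5, 7200),
                            ("10k SAS", 4.0, 10000),
                            ("15k SAS", 3.5, 15000)]:
    print(f"{label:>13}: ~{disk_iops(seek_ms, rpm):.0f} IOPS")
```

Only seek time and rotational speed move the result; doubling the platter capacity changes nothing in that calculation.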

Ian Macintosh
  • or http://www.techrepublic.com/blog/the-enterprise-cloud/calculate-iops-in-a-storage-array/ – Ian Macintosh Jul 02 '14 at 08:35
  • But don't the seek time and latency go up if a disk has greater capacity? – JDS Jul 02 '14 at 14:36
  • Not necessarily @JDS. Sometimes they do and sometimes they don't because manufacturers are continuously stuffing more bits onto platters (more GB) and also continuously improving other aspects of their hard drives. When the drive gets bigger it also often gets other hardware upgrades simultaneously which *could* lower your seek time or latency, thereby increasing your IOPS. But it's all a moot point because GB has no direct relationship to IOPS, only seek times and read and write latency affect IOPS. – Ian Macintosh Jul 02 '14 at 14:43
4

I should point out that IOPS are not a great measurement of speed on sequential writes, but let's just go with it.

I suspect the seek and write times of disk heads are pretty consistent regardless of the size of the disks. 20 years ago we were all using 60GB disks with (roughly, and certainly not linearly) the same read/write speeds.

I am making an educated guess, but I don't think that the density of the disk relates linearly to the performance of the disk.

For example, take an array with 10 X 100GB disks.

Measure IOPS for sequential 256kb block writes (or any IOPS metric)

Let's assume the resulting measured IOPS is 1000 IOPS.

OK

Change the array for one with 10 X 200GB disks. Format with same RAID configuration, same block size, etc.

Would one expect the IOPS to remain the same, increase, or decrease?

Probably remain roughly equivalent to one another.

Would the change be roughly linear?

The history of spinning media tells me there is probably no relationship.

Repeat these questions with 10 X 50GB disks

Again, roughly equivalent.

Your speed, in all these cases, comes from the fact that the RAID acts like one single disk with ten write heads, so you can send 1/10th of the work in parallel to each disk.
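As a rough sketch of that arithmetic (the per-spindle IOPS, the read/write mix and the write penalties below are assumed, textbook-style values, not measurements of any particular array):

```python
def array_iops(per_disk_iops, spindles, read_fraction=0.7, write_penalty=2):
    """Rough front-end IOPS for an array of identical spindles.

    write_penalty = back-end IOs generated per front-end write, commonly
    quoted as 1 for RAID 0, 2 for RAID 1/10, 4 for RAID 5, 6 for RAID 6.
    Disk capacity does not enter into it; spindle count and speed do.
    """
    raw = per_disk_iops * spindles
    write_fraction = 1.0 - read_fraction
    return raw / (read_fraction + write_fraction * write_penalty)

# 10 spindles at ~140 IOPS each give the same answer whether they are 100GB or 200GB disks:
print(f"RAID 10: ~{array_iops(140, 10, write_penalty=2):.0f} IOPS")
print(f"RAID 5:  ~{array_iops(140, 10, write_penalty=4):.0f} IOPS")
```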

Whilst I have no hard numbers to show you, my past experience tells me that increasing your disks' performance is not as simple as adding more capacity.

Despite what the marketing people call innovation, until the arrival of cheap(er) solid-state disks there was little significant development in the performance of spinning media over the last 20 years; presumably there's only so much you can get out of rust, and only so fast we can get our current designs of disk heads to go.

Matthew Ife
3

Storage performance scales with each spindle added. The rotational speed of the drive is the biggest factor, so a 10k RPM drive will give more performance (in terms of IO/s for random IO or MB/s for streaming IO) than a 7.2k RPM drive. The size of the drive has virtually no effect.

People say small drives go faster simply because you require more spindles per usable TB. Increasing the drive size of those spindles won't decrease performance, but it will allow you to fit more data on the disks, which may result in an increased workload.

Basil
2

If you assume all else is equal, the performance characteristics of larger-capacity disks don't change very much. A 10K RPM FC drive has very similar characteristics whether it's 300GB or 3TB: the platters rotate at the same rate, and the heads seek at the same speed.

Sustained throughput is likewise not much different. This is the root of a lot of performance problems, though: in many cases, people buy terabytes; they don't buy IOPS or MB/sec.

And it'll take 10x as long to rebuild/copy a 3TB drive as a 300GB drive.

We've actually had to look at substantial overcapacity for storage projects as a result. Drive sizes are still growing, but their performance isn't growing much, so in at least one case we've bought ~400TB of storage to fill a 100TB requirement, because we need the spindles.
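A sketch of that sizing exercise, with invented numbers chosen only to mirror the ratio above (a hypothetical 100TB / 7,500-IOPS requirement met with 4TB drives assumed to deliver ~75 IOPS each):

```python
import math

def spindles_needed(capacity_tb, iops_required, disk_tb, per_disk_iops):
    """Drives required is the larger of the capacity-driven and IOPS-driven counts."""
    for_capacity = math.ceil(capacity_tb / disk_tb)
    for_iops = math.ceil(iops_required / per_disk_iops)
    return max(for_capacity, for_iops), for_capacity, for_iops

buy, for_cap, for_io = spindles_needed(capacity_tb=100, iops_required=7500,
                                       disk_tb=4, per_disk_iops=75)
print(f"capacity needs {for_cap} drives, IOPS needs {for_io} drives -> buy {buy}")
print(f"that is ~{buy * 4}TB of raw capacity against a 100TB requirement")
```

When the IOPS-driven count dominates, bigger drives just mean more unused terabytes per spindle.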

Sobrique
0

With rotating disks (not SSDs), everything else being equal, transfer speed is higher if you use the outer tracks of the disk. That happens automatically if a disk is only partially filled. At the same time, if a disk is only partially filled, your average head movement is shorter, and the number of head movements is lower because there is more data per track.

That's true whether you use a single disk or a RAID drive.
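A crude toy model of that partial-fill (short-stroking) effect; the linear seek scaling and the settle time are simplifying assumptions made purely to show the direction of the change, since real drives are not this tidy:

```python
def partial_fill_iops(avg_seek_full_ms, rpm, used_fraction, settle_ms=1.0):
    """Toy model: average seek time shrinks roughly with the fraction of the
    stroke actually used, on top of a fixed head-settle time. The linearity
    is an assumption for illustration only."""
    rotational_latency_ms = 0.5 * 60_000 / rpm
    avg_seek_ms = settle_ms + (avg_seek_full_ms - settle_ms) * used_fraction
    return 1000.0 / (avg_seek_ms + rotational_latency_ms)

# The same 7.2k drive, whole surface in use vs. only the outer quarter:
print(f"whole disk: ~{partial_fill_iops(8.5, 7200, 1.00):.0f} IOPS")
print(f"outer 25% : ~{partial_fill_iops(8.5, 7200, 0.25):.0f} IOPS")
```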

Now if you are comparing 100GB and 2000GB disks, you can be sure that everything else is not equal. But if the same manufacturer offers 500GB, 1TB, 1.5TB and 2TB drives with one, two, three and four platters, then everything else is likely to be equal, and 10 x 500GB drives will be slower than 10 x 2TB drives at storing 4TB of data (there will be no difference if you store only 100GB, because the 500GB drives will also be almost empty).

But with RAID arrays, you will be limited not so much by transfer speed as by rotational latency, so a higher RPM will be more important, and you'll often find higher RPM paired with lower capacity. On the other hand, if you go with high-RPM/low-capacity drives, you might also look at SSDs.