
This is a follow-up to a previous question that I asked (Two servers with inconsistent disk speed).

I have a PowerEdge R510 server with an integrated PERC H700 RAID controller (call this Server B) that was built using eight disks with 3Gb/s interfaces, which I was comparing with an almost identical server (call this Server A) built using four disks with 6Gb/s interfaces. Server A had much better I/O rates than Server B.

Once I discovered the difference in the disks, I had Server B rebuilt with faster 6Gbps disks. Unfortunately this resulted in no increase in disk performance. Expecting that there must be some other configuration difference between the servers, we then took the 6Gbps disks out of Server A and put them in Server B. This also resulted in no increase in disk performance.

We now have two identically built servers, except that one has six 6Gbps disks and the other eight 3Gbps disks, and the I/O rates of the disks are pretty much identical. This suggests that there is some bottleneck other than the disks, but I cannot understand how Server A originally had better I/O that has subsequently been 'lost'.

Comparative I/O figures, as measured by SQLIO, are below. The same parameters were used for each test; it's not the absolute numbers that are significant but rather the variation between systems. In each case D: is a 2-disk RAID 1 volume and E: is a 4-disk RAID 10 volume (apart from the original Server A, where E: was a 2-disk RAID 0 volume).

Server A (original setup with 6Gbps disks)

D: Read     63 MB/s
D: Write   170 MB/s
E: Read     68 MB/s
E: Write   320 MB/s

Server B (original setup with 3Gbps disks)

D: Read     52 MB/s
D: Write    88 MB/s
E: Read    112 MB/s
E: Write   130 MB/s

Server A (new setup with 3Gbps disks)

D: Read     55 MB/s
D: Write    85 MB/s
E: Read     67 MB/s
E: Write   180 MB/s

Server B (new setup with 6Gbps disks)

D: Read     61 MB/s
D: Write    95 MB/s
E: Read     69 MB/s
E: Write   180 MB/s

Can anybody suggest what is going on here?

The drives in use are as follows:

paulH
  • SATA disks everywhere? – TheCleaner Oct 29 '13 at 17:27
  • That's right yes. – paulH Oct 30 '13 at 14:53
  • The reason I ask is: http://www.dell.com/downloads/global/products/pvaul/en/perc-technical-guidebook.pdf seems to show that 6Gb/s SATA isn't even listed as a supported drive type. – TheCleaner Oct 30 '13 at 16:07
  • Sorry, my bad. All the drives are SAS drives. Both servers have a combination of Seagate and Hitachi drives. I can provide links to the actual models if required. – paulH Oct 31 '13 at 10:45
  • It would help to know what is on each of these disks and what the server is being used for. – Techie Joe Nov 01 '13 at 16:55
  • The intention is to use them as SQL Server database servers, but at the moment they are not being used. They are freshly built with Windows Server 2008 R2, and have the same configuration [as far as I can determine!] after the rebuild as they did before the rebuild (when one of the servers was *much* quicker). – paulH Nov 01 '13 at 17:05
  • How many times did you run each test, and were the results the same (or close enough) each time? Can you run other testing tools (IOMeter, etc.) to see if they show the same discrepancies? To be honest, the numbers aren't that different. The difference in write speed on E: on the original Server A is because you had set it up as RAID0. Are you sure Server A wasn't originally set up as RAID0 for the D: drive as well? – Rex Nov 06 '13 at 04:49
  • I ran each test several times and the results were very consistent. My understanding was that a four-disk RAID10 should have roughly the same write speed as, and double the read speed of, a two-disk RAID 0? In which case I am only getting 50% of the E: drive performance that I had with the original Server A. – paulH Nov 06 '13 at 08:36
  • Yeah that performance is very much below par. Even with the original setup. – hookenz Nov 06 '13 at 23:40
  • What parameters are you using with SQLIO, and how much system RAM and RAID controller memory is in each machine? You may be using settings that are affected by caching. Can you also confirm whether the RAID controller's write cache is enabled? The SQLIO settings will affect how close the test comes to the ideal transfer rates. Also, what data was already on the disks at the time of testing, and how much? Different areas of a disk perform significantly differently. Was this kept constant between tests and machines? – BeowulfNode42 Nov 13 '13 at 06:42
  • Also, what versions of Windows are being used? – BeowulfNode42 Nov 13 '13 at 07:07
  • Each test was done with newly formatted disks with no other data on the disk. Each server had 32GB RAM. RAID Controller has 512MB cache. Each disk has Adaptive Read Ahead read policy and Write Back write policy. SQLIO parameters were 8 threads writing 64k blocks sequentially for 120 seconds to a 26GB file (sqlio -Fparam.txt -kW -fsequential -o8 -b64 -s120 -LSi -BN -t8). All of this was consistent for each test. – paulH Nov 14 '13 at 12:55
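For reference, the -Fparam.txt switch in the quoted command points SQLIO at a parameter file naming the test file. A minimal sketch of what it might contain for this test (the path is an assumption; the format, per the SQLIO readme, is file path, thread count, thread affinity mask, and file size in MB):

  e:\sqlio\testfile.dat 8 0x0 26624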

4 Answers


You need to put less focus on the interface's maximum speed and more on the physical disk's performance characteristics, as these are typically the bottleneck. This is borne out by the figures on this site for the Hitachi HUS153030VLS300 300GB server SAS disk you linked.

In terms of performance, the important figures listed in the Hitachi PDF are:

  • Data buffer (MB) 16
  • Rotational speed (RPM) 15,000
  • Latency average (ms) 2.0
  • Media transfer rate (Mbits/sec, max) 1441
  • Sustained transfer rate (MB/sec, typ.) 123-72 (zone 0-19)
  • Seek time (read, ms, typical) 3.6 / 3.4 / 3.4

Since none of these figures come anywhere near saturating a 3 Gbps channel, there is no point in the disk having a 6 Gbps channel.
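As a rough check (assuming 8b/10b encoding on the link, so 80% of the raw bit rate carries payload):

  3 Gbps link:  3,000,000,000 bit/s x 0.8 / 8 = ~300 MB/s of payload
  6 Gbps link:  ~600 MB/s of payload
  This drive, sustained: 72-123 MB/s

Even in its fastest zone, the drive uses well under half of what a 3 Gbps link can carry.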

I cannot imagine a RAID controller that can utilise each disk's maximum performance in the same array at the same time. Assuming you have a 2-disk RAID 1 where the first disk is capable of 60MB/s sustained sequential reads and writes and the second only 50MB/s, writes to the array will be limited to 50MB/s, while a decent RAID card will be able to run two simultaneous read streams, one at 60MB/s and the other at 50MB/s. The more complex the array, the more complicated these figures become.
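A back-of-the-envelope sketch of that mirrored pair (the 60MB/s and 50MB/s figures are the illustrative numbers above, not measurements):

  RAID 1: disk 1 = 60 MB/s, disk 2 = 50 MB/s (sustained sequential)
  Write: every block must go to both disks -> limited to min(60, 50) = 50 MB/s
  Read:  two independent streams possible  -> up to 60 + 50 = 110 MB/s combined,
         though any single stream still tops out at one disk's speed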

Some other notes:

  • The maximum transfer rate of a disk differs across different areas of the disk; typically it is fastest at the start of the disk.
  • Sequential reads are the fastest sustained operation a disk can do; random reads or writes are significantly slower.
  • A RAID controller will typically disable a disk's onboard write cache, and will only use its own cache for writes if it has a healthy battery or you override its default.
  • I have read of some disk/RAID firmware combinations that falsely detect a bad battery and disable all write caching, so update the firmware on both the disks and the RAID controller.
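On a PERC with OMSA installed, you can verify the effective cache and battery state from within Windows; a sketch (the controller ID is an assumption and will differ per system):

  rem show each virtual disk's Read Policy and Write Policy
  omreport storage vdisk controller=0
  rem show whether the controller believes its battery is healthy
  omreport storage battery controller=0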

There are some disks advertised as 6 Gbps high-performance disks that are in fact not that fast; they just have the 6 Gbps interface, and couldn't even saturate a 3 Gbps link anyway (around 300 MB/s of usable bandwidth, as above).

The main benefit of 6Gbps SAS/SATA is for SSDs and port multipliers (i.e. attaching multiple disks to one SAS/SATA port).

BeowulfNode42
  • I'm not sure where you got that from, but both Hitachi drives I linked to are 15k drives, and the model numbers in the document you linked to do not match the ones I quoted. – paulH Nov 06 '13 at 08:44
  • Weird; Amazon UK made a mistake gathering their data, and I copied the Hitachi series name from there. I will fix up the details in a bit, but the PDF for the [actual drive is this](http://www.hgst.com/tech/techlib.nsf/techdocs/f0954c336af5e24f862572c200563cb3/$file/ultrastar_15k300_final_ds.pdf) and states the max sustained transfer rate is 123-72MB/s (zone 0-19). – BeowulfNode42 Nov 07 '13 at 01:07
  • Ok, that kind of makes sense to me now, and the numbers for the 3Gbps drive do just about fit with those figures. I looked at the version of the document for the 6Gbps drive though, and it suggests a sustained transfer rate of 198 to 119 MB/s, which still makes my server with 6Gbps drives look a bit sick. And it still doesn't explain why the original setup was so much faster. – paulH Nov 08 '13 at 22:28

I'm not very familiar with Windows systems, but here are some points to take into consideration when benchmarking, especially for I/O.

Keep in mind this schema representing the layers between your application and the disks:

Application <=> Filesystem (OS) <=> Disk controller <=> Hard drive

Each part of this chain has its own method of moving information to the layers above and below it, its own cache, its own configuration, and so on.

  • Application (here, your tool): writing large modifications in one big block is better than doing many little writes. Are you waiting for a full flush to the disk? Are you doing sequential or random access?
  • Filesystem: there are many parameters here: caching by the OS, data pre-fetching, data block size...
  • Disk controller: this is the central point before accessing the hard drives, and its configuration will count for a good 30% of your tweaking. The main points are:
    • The cache ratio between reads and writes. Depending on whether your application is read-intensive or write-intensive, configure this ratio accordingly.
    • Battery-backed caching, allowing write-through or write-back modes.
    • RAID level: choose the level according to your fault-tolerance needs: RAID 0 for no tolerance but great performance, RAID 1 for fault tolerance but only 50% of total disk space usable, RAID 5/6 as a compromise...
  • Hard drive: a higher rotational speed lets you reach data located in different drive regions more quickly, and is thus better for random seeks.

Also, look into data alignment: I have seen Windows create misaligned partitions many times. When that happens and the filesystem wants to write one 4kB block, it results in 2 I/Os to the drive, because the FS block sits across 2 device blocks.
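On Windows, the partition offsets can be checked from the command line; a quick sketch:

  rem a StartingOffset that is not a multiple of 4096 indicates a misaligned
  rem partition (the old 32256-byte default from Windows Server 2003 and
  rem earlier is the usual culprit)
  wmic partition get Name, StartingOffset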

More details would help us to find the bottleneck.

Adrien.

Adrien M.

You need to upgrade the firmware of the H700, the HDDs, and the backplane if there is one. If you run Linux, you only need to upgrade the firmware.

Also, before doing this you can install Dell OpenManage Server Administrator (OMSA), version 7.3.0.1 at the time of writing, to check whether it reports any incompatibility issues.
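With OMSA installed, the current firmware levels can be read without a reboot; a sketch (the controller ID is an assumption):

  rem controller model plus firmware and driver versions
  omreport storage controller
  rem each physical disk with its model and firmware revision
  omreport storage pdisk controller=0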

You also need to use the same type of drives in the same array if it's SAS.

So basically, if you have wrong HDD firmware, old SAS firmware, or a mix of SAS drive types (even if they are SATA drives, they can run as SAS), there is no way you will ever get consistent performance across all drives.

In fact, different drive types alone could cause this.

Andrew Smith
  • I'll check next time that I'm in the office, but I'm pretty sure the disks are **not** mixed types within an array. One make of disk is used to build the 4-disk array and another make of disk is used to build the 2-disk array. The other stuff I'm not sure about, though I **am** sure that no firmware has been changed between the drives running quickly before the server rebuild, and them running slowly after the rebuild. – paulH Nov 01 '13 at 22:33

In my experience, I've seen large variation in the performance of 15k SAS drives. You've mentioned a few drive swaps, but it seems like you're focusing on 3Gb/s vs 6Gb/s bus speeds when that will have little bearing on the I/O numbers you've indicated. If I were in your shoes, I'd benchmark the drives individually to see if one of them is slow.
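For example, after configuring one drive as a single-disk volume (F: here is just an assumption), essentially the same SQLIO run from the question could be repeated against each drive in turn:

  rem the question's write test, pointed at a file on a single drive
  sqlio -kW -fsequential -o8 -b64 -s120 -BN -t8 f:\testfile.dat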

What other settings are applied to your RAID setup (write policies, caching, stripe size, etc.)? Were they consistent between benchmarks?

Ryan
  • Is it possible to benchmark the disks individually without removing them from their RAID array? If I have to reconfigure them then I may do that if I get the chance, but I'd be more likely to do it if I can do it as they are currently configured. Oh and as far as I know all the other settings were left as whatever the defaults were. From memory they did seem to be consistent. – paulH Nov 08 '13 at 21:52
  • The drives would have to be removed from the array to test. Also, the default array settings might not be optimal for your use; stripe size and caching can impact I/O performance dramatically. – Ryan Nov 09 '13 at 01:10