
Explanation:

We have a Server:

  • Model: HP ProLiant DL160 G6
  • 4 x 240GB SSD (RAID-10)
  • 72GB DDR3 RAM
  • 2 x L5639
  • HP P410 RAID Controller (256MB, V6.40, Rom version: 8.40.41.00)

SSD drives are 4 brand new 2.5" Intel 530 with 540MB/s read speed and 490MB/s write speed

  • CentOS 6
  • File systems are ext4

but this is the test result for read speed on raid 10:

hdparm -t /dev/sda

/dev/sda:
 Timing buffered disk reads:  824 MB in  3.00 seconds = 274.50 MB/sec
[root@localhost ~]# hdparm -t /dev/mapper/vg_localhost-lv_root

/dev/mapper/vg_localhost-lv_root:
 Timing buffered disk reads:  800 MB in  3.01 seconds = 266.19 MB/sec

and this is for write speed:

dd bs=1M count=512 if=/dev/zero of=test conv=fdatasync
512+0 records in
512+0 records out
536870912 bytes (537 MB) copied, 4.91077 s, 109 MB/s

We were hoping for 1GB/s read speed with RAID 10, but 270MB/s isn't even the speed of a single disk!

Questions:

  1. Why is it so slow?
  2. Is it because of the RAID Controller?

Update 1 - Same Read/Write Speed:

After changing some settings as mentioned in the answers, I have the result below:

(Does anyone know why it shows 4GB/s instead of 400MB/s as the read speed?!)

EDIT: it looks like the command was wrong and we should have used -s144g for this amount of RAM; that's why it shows 4GB/s (as suggested in the comments by ewwhite)

[root@192 ~]# iozone -t1 -i0 -i1 -i2 -r1m -s56g
        Iozone: Performance Test of File I/O
                Version $Revision: 3.408 $
                Compiled for 64 bit mode.
                Build: linux

        Record Size 1024 KB
        File size set to 58720256 KB
        Command line used: iozone -t1 -i0 -i1 -i2 -r1m -s56g
        Output is in Kbytes/sec
        Each process writes a 58720256 Kbyte file in 1024 Kbyte records

        Children see throughput for  1 initial writers  =  135331.80 KB/sec
        Children see throughput for  1 rewriters        =  124085.66 KB/sec
        Children see throughput for  1 readers          = 4732046.50 KB/sec
        Children see throughput for 1 re-readers        = 4741508.00 KB/sec
        Children see throughput for 1 random readers    = 4590884.50 KB/sec
        Children see throughput for 1 random writers    =  124082.41 KB/sec

but the old hdparm -t /dev/sda command still shows:

Timing buffered disk reads: 810 MB in 3.00 seconds = 269.85 MB/sec
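
The sizing rule from the comments (make the test file about twice the installed RAM so reads cannot be served from cache) can be sketched as a small helper. The function name `iozone_size` is hypothetical, and the RAM figure is passed in by hand rather than detected:

```shell
# Suggest an iozone -s value of twice the installed RAM.
# $1 = installed RAM in GB (72 for the server described in the question).
iozone_size() {
    echo "$(( $1 * 2 ))g"
}

iozone_size 72    # prints 144g
echo "iozone -t1 -i0 -i1 -i2 -r1m -s$(iozone_size 72)"
```

With 72GB of RAM this yields the `-s144g` value used in the later run.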

Update 2 (tuned-utils pack) - Read Speed is now 600MB/s:

Finally some hope. We had disabled the cache on the RAID controller and tried some other things earlier with no luck, but because we reloaded the server and installed the OS again, we forgot to install "tuned-utils" as suggested in ewwhite's answer (thank you ewwhite for suggesting this awesome package).

After installing tuned-utils and choosing the enterprise-storage profile, the read speed is now ~600MB/s+, but the write speed is still very slow (~160MB/s) (:
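
ewwhite's answer attributes the effect of the profile to two settings: the I/O elevator and filesystem write barriers. A rough way to inspect both is sketched below. The helper names are made up for illustration, the device name `sda` and the ext4 root filesystem are taken from this question, and treating a missing barrier option as "on" assumes the ext4 default:

```shell
# Extract the active elevator from a sysfs scheduler line,
# where the active entry is the one shown in square brackets.
active_elevator() {
    echo "$1" | sed -n 's/.*\[\([^]]*\)\].*/\1/p'
}

# Classify a comma-separated ext4 mount option string:
# barriers are off when it contains "nobarrier" or "barrier=0".
barrier_state() {
    case ",$1," in
        *,nobarrier,*|*,barrier=0,*) echo off ;;
        *) echo on ;;
    esac
}

active_elevator "noop anticipatory deadline [cfq]"    # prints cfq
barrier_state "rw,relatime,barrier=1,data=ordered"    # prints on

# On a live system (sda and an ext4 root are assumptions from the question):
# active_elevator "$(cat /sys/block/sda/queue/scheduler)"
# barrier_state "$(awk '$2 == "/" && $3 == "ext4" {print $4}' /proc/mounts)"
```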

Here is the result for iozone -t1 -i0 -i1 -i2 -r1m -s144g command:

    Children see throughput for  1 initial writers  =  165331.80 KB/sec
    Children see throughput for  1 rewriters        =  115734.91 KB/sec
    Children see throughput for  1 readers          =  719323.81 KB/sec
    Children see throughput for 1 re-readers        =  732008.56 KB/sec
    Children see throughput for 1 random readers    =  549284.69 KB/sec
    Children see throughput for 1 random writers    =  116389.76 KB/sec

Even with hdparm -t /dev/sda command we have:

Timing buffered disk reads: 1802 MB in 3.00 seconds = 600.37 MB/sec

Any suggestions for the very slow write speed?

Update 3 - Some information requested in comments:

Write speed is still very low (~150MB/s which isn't even 1/3 of a single disk)

Output for df -h and fdisk -l:

[root@192 ~]# df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda1       441G  3.2G  415G   1% /
tmpfs            36G     0   36G   0% /dev/shm


[root@192 ~]# fdisk -l
Disk /dev/sda: 480.0 GB, 480047620096 bytes
255 heads, 63 sectors/track, 58362 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00040c3c

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1   *           1       58363   468795392   83  Linux
Ara
  • For this test, you should probably specify double the amount of RAM installed in the server. – ewwhite Dec 02 '13 at 21:56
  • Can you tell us the firmware version of your Smart Array P410 controller? – ewwhite Dec 02 '13 at 22:03
  • @ewwhite thank you for your comment, firmware version is p410 (256MB,V6.40) and rom version is 8.40.41.00 , was my iozone command wrong? if yes could you please tell me the right command to test with? because everything i try i get the read speed in GB – Ara Dec 02 '13 at 22:18
  • Something is very wrong here. You're only getting ~122MB/s writes. The 4GB/s figures you see are operations from cache, so your command string should look like `iozone -t1 -i0 -i1 -i2 -r1m -s144g`. :( – ewwhite Dec 02 '13 at 22:34
  • @ewwhite Thank you very much, the read speed is now very good (detailed as update 2 in my question), but write speed is very disappointing, do you have any suggestions for that? – Ara Dec 03 '13 at 11:28
  • Where are you writing to? What is `/dev/sda`? Can you show the output of `df -h` and `fdisk -l`? Is LVM still configured? – ewwhite Dec 03 '13 at 14:00
  • Take the filesystem out of the equation and try using `fio` for testing on the raw block device. – MikeyB Dec 03 '13 at 19:10
  • @ewwhite No, LVM is not configured, updated the question with output of commands – Ara Dec 03 '13 at 20:52
  • @MikeyB i installed fio and tried `fio --filename=/dev/fioa --direct=1 --rw=randwrite --bs=1m --size=5G --numjobs=4 --runtime=10 --group_reporting --name=file1` but i got error `fio: looks like your file system does not support direct=1/buffered=0` , `fio: pid=6153, err=22/file:filesetup.c:573, func=open(/dev/fioa), error=Invalid argument` then i tried direct=0 which was ok, but results are not useful i guess? – Ara Dec 03 '13 at 20:52
  • you are not alone :) same disks, generic sata controller, raid1 – GioMac Oct 22 '16 at 01:04

2 Answers


While the other answer here brings up some points, your specific issues are due to platform limitations and OS configuration:

  • You're throughput-limited by the use of consumer SATA SSDs on an HP Smart Array P410 RAID controller. SATA disks run at 3.0Gbps (3G) on these controllers rather than 6.0Gbps (6G). So that's a barrier that impacts the read speeds of your Intel SSDs; 300MB/s or less per drive.

  • The Smart Array P410 controller has specific requirements and best practices when used with SSDs. In short: the controller is capable of about 50,000 IOPS, you should disable the array accelerator for your SSD volume, and performance tops out at ~6 drives.

  • Disk performance is not always about sequential read/write speed. Try benchmarking with a proper tool like iozone or bonnie++. You still get the random I/O benefits of your multiple drives.

  • At the operating system level, install the tuned-utils package and set the profile to enterprise-storage to remove write barriers from your filesystems and set the right I/O elevator for your setup. This is covered in other questions here, too.

  • It looks like you're using LVM. That can have an impact as well...
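
As a back-of-the-envelope check on the 3G limit, the arithmetic can be sketched like this. It assumes SATA's 8b/10b encoding (10 bits on the wire per data byte) and an idealised RAID 10 in which reads are spread over all drives and every write goes to both halves of a mirror. The function names are made up for illustration, and a real controller will land below these ceilings:

```shell
# Usable payload in MB/s for a SATA link of $1 Gbit/s with 8b/10b encoding.
sata_payload_mbs() {
    echo $(( $1 * 1000 / 10 ))
}

# Idealised RAID 10 ceilings for $1 drives at $2 MB/s each.
raid10_read_ceiling()  { echo $(( $1 * $2 )); }
raid10_write_ceiling() { echo $(( $1 * $2 / 2 )); }

per_drive=$(sata_payload_mbs 3)
echo "per drive:     ${per_drive} MB/s"                               # 300
echo "read ceiling:  $(raid10_read_ceiling 4 "$per_drive") MB/s"      # 1200
echo "write ceiling: $(raid10_write_ceiling 4 "$per_drive") MB/s"     # 600
```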

Here's an iozone report for a G7 ProLiant running with four consumer 6G SATA SSDs (downshifted to 3G speeds) on the same HP Smart Array P410 RAID controller.

You should be seeing ~470MB/s writes and 650MB/s+ reads.

[root@mdmarra /data/tmp]# iozone -t1 -i0 -i1 -i2 -r1m -s56g
        Iozone: Performance Test of File I/O
                Version $Revision: 3.394 $
                Compiled for 64 bit mode.
                Build: linux 

        Record Size 1024 KB
        File size set to 58720256 KB
        Command line used: iozone -t1 -i0 -i1 -i2 -r1m -s56g
        Output is in Kbytes/sec
        Each process writes a 58720256 Kbyte file in 1024 Kbyte records

        Children see throughput for  1 initial writers  =  478209.81 KB/sec
        Children see throughput for  1 rewriters        =  478200.84 KB/sec
        Children see throughput for  1 readers          =  677397.69 KB/sec
        Children see throughput for 1 re-readers        =  679523.88 KB/sec
        Children see throughput for 1 random readers    =  437344.78 KB/sec
        Children see throughput for 1 random writers    =  486254.41 KB/sec
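
To read these figures in MB/s, divide by 1024. That assumes iozone's "Kbytes" are 1024 bytes, which matches the file size line above (56g reported as 58720256 KB). A minimal sketch, with `kb_to_mb` as a made-up helper name:

```shell
# Convert an iozone KB/sec figure to MB/s, rounded to a whole number.
kb_to_mb() {
    awk -v kb="$1" 'BEGIN { printf "%.0f\n", kb / 1024 }'
}

kb_to_mb 478209.81    # initial writers: prints 467
kb_to_mb 677397.69    # readers: prints 662
```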
ewwhite
  • Thank you very much, the model is "DL160 G6" , yes there is backplane involved :( i would even be happy with 600MB in here, but 270MB is really slow, what you think i should do, does software raid help? – Ara Nov 26 '13 at 14:02
  • Okay, with a DL160 G6, you should have two cables going from the RAID controller to the drive backplane... 8 drive slots. Your problem here is the drive throughput, your testing methodology and the server's settings. Try the other suggestions I listed in my answer. – ewwhite Nov 26 '13 at 14:15
  • Interesting bit about the P410 doing only 3G on SATA, and the specific best practices. Mod up. (LVM however shouldn't be a heavy hitter in terms of negative performance impact, also noted here: http://unix.stackexchange.com/questions/7122/does-lvm-impact-performance) – Roman Nov 27 '13 at 12:33
  • @ewwhite Thanks again for the rich update, i searched a lot but didn't find any HP 6G controller for SATA drives and RAID-10, i'll try your guide to reach 600MB and i'll update here if i succeeded, i was suggested to think of software raid which can reach 900MB on this situation too but i guess hardware raid is much better – Ara Nov 27 '13 at 12:45
  • @Ara There is no 6G SATA controller for the ProLiant. That's the point. Your SSDs will only run at 3G speed on this platform unless you use *SAS* SSDs. – ewwhite Nov 27 '13 at 13:28
  • @Ara Software RAID would entail connecting to a different controller, like a SAS HBA (e.g. the LSI 9211-8i)... it would allow you to see the full bandwidth of your disks, but there are other facets of SSD performance beyond pure sequential read/write bandwidth. – ewwhite Nov 27 '13 at 14:00

Oh dear, where to begin?

There is so much involved, and you need a good understanding of everything. Just throwing a bunch of disks against a RAID controller won't yield the results you're looking for.

This can't be easily answered. But at least, here is a list of stuff you have to look at:

  • Does the Controller even have the throughput needed? (-> Datasheets)
  • Does the Controller even have enough bandwidth to the Host? (it does, even at PCIe v1.0, since it's x8)
  • Does the system's chipset have enough throughput (CPU-Controller)? (unknown)
  • What write strategy have you instructed the Controller to employ? (this is what most likely has bitten you)
  • Is everything aligned (Partition starts, LVs, PVs)?
  • Are the block sizes harmonized? (RAID stripe size, block size, FS blocks,...)
  • Is the filesystem optimized to the RAID setup? (Level and Block size)
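
For the alignment item in the list above, a quick check is whether a partition's starting byte offset is a multiple of the RAID stripe size. The sketch below is illustrative only: `is_aligned` is a made-up helper, and the 256 KiB stripe size is an assumption, not a value read from this controller:

```shell
# Succeed when the partition start is a whole multiple of the stripe size.
# $1 = start sector, $2 = sector size in bytes, $3 = stripe size in bytes
is_aligned() {
    [ $(( $1 * $2 % $3 )) -eq 0 ]
}

# Compare the legacy start sector 63 with a 1 MiB boundary (sector 2048).
for start in 63 2048; do
    if is_aligned "$start" 512 262144; then
        echo "sector $start: aligned"
    else
        echo "sector $start: misaligned"
    fi
done
```

The real start sector can be read from `fdisk -lu`, which lists partitions in sectors rather than cylinders.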

Since your throughput against the whole RAID (disregarding FS) is significantly lower than a single disk, it's likely you have set up your write strategy wrongly; the Controller is probably waiting for all disks to confirm the write (and unless you've got the RAM on the controller battery backed, this might be in your best interest).

Roman
  • this is really harder than i thought, i thought its as simple as upgrading Raid Controller ram to 512MB ! , thank you Roman, let me check some of what you just said hoping to find the reason – Ara Nov 26 '13 at 13:01
  • Most of the above are valid troubleshooting strategies for edge cases, but don't apply to the specifics of the original poster's configuration. The controller here has the required throughput and bandwidth, the CPU is fine, the RAID controller defaults to 25:75 R/W cache ratio (should be disabled entirely for SSD use), CentOS6 aligns partitions correctly and there's too much potential abstraction to get the block sizes "harmonized". The three issues at hand are the 6G disks are throttled at 3G speeds, filesystem write barriers are probably enabled and the I/O elevator is the default CFQ. – ewwhite Nov 27 '13 at 02:02
  • The server model unfortunately was not noted at the time of my answer. Good to know about the throttling. I think the barriers and CFQ did not cost a lot in this specific "benchmark", but valuable information nonetheless. – Roman Nov 27 '13 at 12:36
  • @Roman Sorry i didn't mention the model earlier, i really appreciate your help, i'm trying both yours and ewwhite's guides to reach 500-600MB, i guess that's the highest speed i can reach with this 3G limit – Ara Nov 27 '13 at 12:39
  • No problem at all. Make sure you follow the links in ewwhite's answers, as there are specific things to do with the P410 in conjunction with SSDs. **Also, make sure that you distribute the four disks evenly across the two connections from the backplane to the controller.** – Roman Nov 27 '13 at 12:46