I'm benchmarking an application on two identical servers, one running CentOS 5.8 and the other CentOS 6.2. My application runs much slower (50% of the throughput or less) on the CentOS 6.2 machine.

While attempting to diagnose the issue, I'm tracking CPU, RAM, and I/O throughout the benchmark run. Disk reads, as measured with iostat, are significantly higher on the CentOS 6.2 box.

Both systems run XFS on the filesystem where the benchmark lives. Both are HP servers with 512MB caching RAID controllers and 8 x 300GB SAS drives in RAID 10.

Here is the output of xfs_info for each:

centos5

meta-data=/dev/cciss/c0d0p5      isize=256    agcount=32, agsize=8034208 blks
         =                       sectsz=512   attr=0
data     =                       bsize=4096   blocks=257094144, imaxpct=25
         =                       sunit=32     swidth=128 blks, unwritten=1
naming   =version 2              bsize=4096 
log      =internal               bsize=4096   blocks=32768, version=1
         =                       sectsz=512   sunit=0 blks, lazy-count=0
realtime =none                   extsz=4096   blocks=0, rtextents=0

centos6

meta-data=/dev/sda5              isize=256    agcount=4, agsize=57873856 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=231495424, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=113034, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
tmcallaghan
  • What exactly is your question? – Tim Brigham May 10 '12 at 13:59
  • Can you show your XFS mount and filesystem creation options? Can you describe the hardware and disk layout in more detail? – ewwhite May 10 '12 at 14:37
  • The question is that, all things being equal (at least those I can see), CentOS 6.2 benchmarks at 50% of CentOS 5.8 for my application. The other difference I can currently measure is that reads are MUCH higher. – tmcallaghan May 10 '12 at 14:54

2 Answers

Thank you for updating the post with more information.

You're running on ProLiant systems, so a certain amount of work is required to optimize the controller and I/O setup. Also, your XFS mounts are using the default options. Remember that you're using a different driver between these operating systems: the EL5 server has cciss, while the EL6 system uses the hpsa module. That difference exists, but the issue you're experiencing is probably related to operating-system differences. Here's what I'd check:

  • Change your XFS mounts to include noatime and to disable write barriers with nobarrier (appropriate here only because the controller has a protected write cache). See the sketch after this list.
  • I/O elevator behavior differs between your Linux versions. Try the deadline or noop I/O elevator on the CentOS 6 server. You can change it on the fly with echo deadline > /sys/block/sda/queue/scheduler (the EL6 box presents the array as sda via hpsa), or persistently by appending elevator=deadline to the kernel line in the grub boot entry.
  • Ensure that your read/write cache is optimal for your workload. I usually go with 75% write and 25% read.
  • Update the firmware on the server components. Each revision of the Smart Array RAID controller firmware tends to bring new functionality. This sounds like an HP Smart Array P410 controller, so make sure you're on version 5.14.
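Here's a rough shell sketch of the first three items together. The /data mount point, controller slot number, and exact hpacucli syntax are assumptions; adjust to your layout and verify against your tool version:

    # /etc/fstab entry for the benchmark filesystem (EL6 example; /dev/sda5 per the xfs_info above)
    # nobarrier is only safe because the controller has a protected write cache
    /dev/sda5  /data  xfs  noatime,nobarrier  0 0

    # Remount to pick up the new options without a reboot
    mount -o remount /data

    # Switch the I/O elevator on the fly (the array appears as sda under the hpsa driver)
    echo deadline > /sys/block/sda/queue/scheduler

    # Set the controller cache ratio to 25% read / 75% write
    hpacucli ctrl slot=0 modify cacheratio=25/75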

Edit:
I'm looking at the xfs_info output for your CentOS 5 and CentOS 6 systems. You formatted the XFS partitions with different parameters!

The EL5 system has 32 XFS allocation groups, while the EL6 system only has 4. Allocation groups allow XFS to parallelize concurrent filesystem operations.

Given the amount of space available and the CPU spec of your server, your existing EL6 setup is constrained by the low agcount. See Red Hat's notes on this. On hardware like this, where the storage is not in the multi-terabyte range, I typically specify an allocation group per 4GB of partition space. At the very least, go to 32 to match your EL5 server. Try reformatting the EL6 partition with those parameters (sketch below) to see if there's a performance difference.
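A minimal sketch of the reformat, assuming the filesystem is mounted at /data (this destroys everything on /dev/sda5, so move your data off first):

    # Recreate the EL6 filesystem with 32 allocation groups to match the EL5 box
    umount /data
    mkfs.xfs -f -d agcount=32 /dev/sda5
    mount /data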

ewwhite
  • Tried all 4 of the above, none made any measurable difference in the performance of my benchmarks. – tmcallaghan May 16 '12 at 13:05
  • See my edit above. The two XFS partitions on the EL5 and EL6 were formatted with vastly different parameters. – ewwhite May 16 '12 at 13:37
  • Thanks for the continued assistance, I'll reformat my XFS and see if that helps. – tmcallaghan May 18 '12 at 12:24
  • Please post the results. – ewwhite May 19 '12 at 04:59
  • None of the file system changes made much of a difference, some boosted things slightly and others made it slightly worse. – tmcallaghan May 21 '12 at 18:02
  • We've found that transparent huge pages were the issue. By turning them off, the performance of our CentOS 6 server is similar to CentOS 5. The command was echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled (note the Red Hat-specific path); other distributions do this via /sys/kernel/mm/transparent_hugepage/enabled. – tmcallaghan May 21 '12 at 18:03
  • Look into the [Red Hat tuned-adm package](http://docs.redhat.com/docs/en-US/Red_Hat_Enterprise_Linux/6/html/Power_Management_Guide/tuned-adm.html). It sets the sysfs parameter you mentioned by default. – ewwhite May 23 '12 at 14:24
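(For reference, a minimal sketch of making the fix from the comments above survive a reboot, assuming the RHEL 6-specific sysfs path; verify the path on your kernel:)

    # Check the current THP setting; the active value is shown in brackets
    cat /sys/kernel/mm/redhat_transparent_hugepage/enabled

    # Disable THP immediately
    echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled

    # Persist across reboots via rc.local
    echo 'echo never > /sys/kernel/mm/redhat_transparent_hugepage/enabled' >> /etc/rc.local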

When you run iotop, what is doing the disk reads on the 6.2 box?

Also, what are the mount options on the device you're reading from? You may want to look into noatime and relatime; a quick sketch follows.
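For instance (the /data mount point is an assumption):

    # Show only processes currently doing I/O, with accumulated totals
    iotop -o -a

    # Check the active mount options for the filesystem in question
    grep /data /proc/mounts

    # Try noatime without editing fstab
    mount -o remount,noatime /data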

ckliborn