I am running two Dell R410 servers in the same rack of a data center (behind a load balancer). Both have the same hardware configuration, run Ubuntu 10.04, have the same packages installed, and run the same Java web servers (no other load), yet I'm seeing a substantial performance difference between the two.
The performance difference is most obvious in the average response times of both servers (measured in the Java app itself, without network latencies): One of them is 20-30% faster than the other, very consistently.
I used dstat to figure out whether there are more context switches, IO, swapping or anything else, but I see no reason for the difference. With the same workload (no swapping, virtually no IO), the CPU usage and load are higher on one server.
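For reference, the kind of dstat run I mean looks roughly like this (the flags are standard dstat options; the 5-second interval and the log file name are just illustrative). The idea is to collect the same counters on both machines and compare them column by column:

    # Sample CPU, memory, swap, disk, network, paging and system counters
    # (interrupts / context switches) every 5 seconds and log them per host.
    dstat -tcmsdngy 5 > dstat-$(hostname).log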
So the difference appears to be mainly CPU bound, but a simple CPU benchmark using sysbench (with all other load turned off) only yielded a difference of about 6%. So maybe it is not only CPU but also memory performance.
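The CPU benchmark was along these lines, and a similar sysbench memory test would be the obvious next step for the memory side (the prime limit and memory size are arbitrary example values, not tuned):

    # CPU benchmark: time how long it takes to compute primes up to the limit
    sysbench --test=cpu --cpu-max-prime=20000 run

    # Memory benchmark: stream 10 GB through memory and report the throughput
    sysbench --test=memory --memory-total-size=10G run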
So far I've checked:
- Firmware revisions on all components (identical)
- BIOS settings (I did a dump using dmidecode, and that showed no differences; see the diff sketch after this list)
- /proc/cpuinfo (no difference)
- The output of cpufreq-info (no difference)
- Java / JVM parameters (same version and parameters on both systems)
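The dumping and diffing amounted to something like this (file names and server names are placeholders; serial numbers and UUIDs in the dmidecode output will of course differ and can be ignored):

    # On each server: dump the DMI/BIOS tables and the cpufreq state
    dmidecode > dmidecode-$(hostname).txt
    cpufreq-info > cpufreq-$(hostname).txt

    # After copying the dumps to one machine, compare them
    # (ignoring expected differences like serial numbers and UUIDs):
    diff dmidecode-serverA.txt dmidecode-serverB.txt
    diff cpufreq-serverA.txt cpufreq-serverB.txt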
Also, I completely replaced the RAM some months ago, without any effect.
I am lost. What can I do to figure out what is going on?
UPDATE: Yay! Both servers perform equally now. It was the "power CRAP" settings, as jim_m_somewhere called them in the comments. The BIOS option for "Power Management" was set to "Maximum Performance" on the fast server and to "Active Power Controller" (Dell's default) on the other one. Obviously I had forgotten that I made that change two years ago, and I didn't apply it to all servers. Thanks to all for your very helpful input!
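For anyone hunting the same problem: if Dell OpenManage Server Administrator (OMSA) happens to be installed, the BIOS power profile can, as far as I know, be read from the running system instead of rebooting into setup, something like:

    # Requires Dell OpenManage Server Administrator; lists the BIOS setup
    # values from the running OS, including the power management profile.
    omreport chassis biossetup | grep -i power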