4

We're diagnosing Ruby performance problems on our application servers which we've managed to reduce to a simple test case.

We compare performance on a machine in our development cluster to a machine in our production data centre.

We used this simple Ruby one-liner:

5000000.times { a = []; a << 1; a.length }

And we benchmarked it as being consistently 55% slower on the production machine.
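The post does not say how the timings were taken. As a minimal sketch, assuming nothing beyond core Ruby, the one-liner could be timed like this; `time_block` is a hypothetical helper, not something from the question. It reports wall-clock time next to user and system CPU time, because a large gap between wall time and CPU time may hint that the process was waiting rather than computing.

```ruby
# Times a block and reports wall-clock, user-CPU and system-CPU seconds.
# Uses only core classes (Time, Process), so no benchmarking library is involved.
def time_block
  cpu_before  = Process.times
  wall_before = Time.now
  yield
  wall = Time.now - wall_before
  cpu_after = Process.times
  {
    :wall   => wall,
    :user   => cpu_after.utime - cpu_before.utime,
    :system => cpu_after.stime - cpu_before.stime
  }
end

# The one-liner from the question, unchanged.
result = time_block do
  5000000.times { a = []; a << 1; a.length }
end

printf("wall %.2fs  user %.2fs  system %.2fs\n",
       result[:wall], result[:user], result[:system])
```

Running this several times on each machine and comparing all three figures, not just the wall time, would show whether the extra time is being spent in user code or elsewhere.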

Things it obviously could be and why we think it's not:

  1. Different software - dev and production machines are installed from the same Ubuntu OS, Ubuntu installation scripts, and package repositories, and we use Puppet to keep the configurations consistent.
  2. Different hardware - possibly, but see below.
  3. Different load - neither dev nor production machines are significantly loaded, and again see below.

Why don't we think it's load or hardware?

First, they have similar loads and hardware configurations. Second, we wrote a python test script:

n = 10000000
while n > 1:
  n = n - 1
  a = []
  a.append(4)
  len(a)

and this is consistently 10% faster on production than development, which is what we would expect. If the problem was load or hardware, wouldn't Python be slower on production as well?

Briefly, both machines are virtualized using ESXi:

  • development VM has 4GB RAM and is hosted on a machine with dual quad-core AMD Opteron 2376 @ 2.294GHz and 32GB RAM, providing one virtual core to the VM

  • production VM has 4GB RAM and is hosted on a machine with dual quad-core AMD Opteron 2354 @ 2.211GHz and 32GB RAM, providing four virtual cores to the VM (update: we have now tried with one virtual core on all the VMs and it made no difference)

The operating system is Ubuntu Hardy 64-bit. Our Ruby interpreter is:

ruby 1.8.6 (2008-08-11 patchlevel 287) [x86_64-linux]

and our Python interpreter is:

Python 2.5.2 (r252:60911, Jul 31 2008, 17:31:22)

NB. We have also tried this with Ruby Enterprise Edition, and the results are the same.

7 Answers

2

I haven't a clue as to what's going on, but I thought I'd share a couple of other hardware differences, based upon the processors, see if it helps somebody else get closer to the answer.

  • The Opteron 2354 has 2MB of L3 cache, where the 2376 has 6MB.
  • The 2354 uses PC2-5300 DDR2 RAM, where the 2376 uses PC2-6400 DDR2 RAM.

I'm a bit rusty with hardware, but I assume this means memory access in general is significantly faster on the development machine? So if Ruby were somehow more "memory intensive" (and I don't really know what I mean by that!), it could show up as a bigger performance difference?

(I'd gone searching to see if there was some virtualization feature in the newer processor that might explain the difference, but came up blank.)
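One way to probe the "memory intensive" idea, offered as a guess rather than a diagnosis: the test loop allocates a new array on every iteration, so the garbage collector, which has to walk the heap, is a plausible place for memory speed to matter. The sketch below times the loop with the collector on and off. The helper names and the reduced iteration count are my own assumptions, not from the question.

```ruby
# Runs the allocation loop n times and returns elapsed wall-clock seconds.
def allocation_loop_seconds(n)
  started = Time.now
  n.times { a = []; a << 1; a.length }
  Time.now - started
end

# Times the same loop with the garbage collector disabled, then re-enables it.
def seconds_without_gc(n)
  GC.start   # begin from a freshly collected heap
  GC.disable
  begin
    allocation_loop_seconds(n)
  ensure
    GC.enable  # always restore the collector, even if the loop raises
  end
end

# Assumption: fewer iterations than the question's 5,000,000,
# so memory use stays bounded while the collector is off.
n = 500000
with_gc    = allocation_loop_seconds(n)
without_gc = seconds_without_gc(n)
printf("GC on: %.3fs  GC off: %.3fs\n", with_gc, without_gc)
```

If the gap between the two machines shrinks noticeably with the collector off, that would point towards memory behaviour; if it stays at roughly the same ratio, the cause is probably elsewhere.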

Couple of questions...

mathie
2

Have you tried REDUCING the number of vCPUs you're giving to the production VM?

I know it sounds counterproductive but, and you may be aware of this already, ESX won't give a VM ANY CPU time if ALL of the vCPU slots allocated to any given VM are not free - i.e. if a VM has 4 vCPUs assigned to it and they're just not all free, then the VM doesn't get any time at all. I know it sounds mental, but seriously, try dropping to 2 or 1 vCPUs and run it again.
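After changing the vCPU count it is worth confirming what the guest actually sees before re-running the test. A small sketch, assuming a Linux guest where `/proc/cpuinfo` is available; `count_processors` is a hypothetical helper of mine:

```ruby
# Counts logical processors in /proc/cpuinfo-style text:
# each one starts a stanza with a line such as "processor : 0".
def count_processors(cpuinfo_text)
  cpuinfo_text.split("\n").select { |line| line =~ /\Aprocessor\s*:/ }.length
end

path = '/proc/cpuinfo' # Linux-specific location
if File.exist?(path)
  puts "vCPUs visible to the guest: #{count_processors(File.read(path))}"
else
  puts "#{path} not found; this check only works on Linux"
end
```

This only reports what the guest has been given. It says nothing about how the hypervisor schedules those vCPUs, which has to be checked on the ESX host itself.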

Best of luck.

Chopper3
1

Actually this is the avenue I was thinking about as well. With the cache being smaller on the production server, it will exhaust this faster and have to resort to using the main memory. If this access is slower in production than development this could account for the problem.

However, this same issue would, or at least should, happen with Python as well. As both Ruby and Python are implemented in C, the integer size would likely be the same (this assumption may be wrong, though).

In short, I'm still looking!

1

Question: how are you doing the benchmarking? Are you using Ruby's benchmark library for the Ruby version, or some external tool? If the former, I'm wondering whether something in the benchmark library is causing the problem.

user13691
1

I had a similar issue last week on our system. It turned out that someone had installed/required activesupport, which was modifying a bunch of stuff in our class stack and thereby causing a slowdown. This was detected only when running against actual traffic. Check the installed gems on the slow system and compare the versions against the development machine. It's possible something is out of whack.

  • The gems are identical. We actually diffed our entire /usr/lib/ruby/ directories and they are precisely the same down to the binaries. – Daniel Lucraft Dec 16 '09 at 15:45
  • Have you considered using Ruby 1.9.1? It's what we use in production and it is crazy how much more performance you get with it. Another option would be to try JRuby or Rubinius in order to get around the global interpreter lock issues with the standard Ruby interpreter. The global interpreter lock is what will prevent you from taking advantage of multiple processors. I recommend you have a look here (http://peepcode.com/products/scaling-ruby) and see if it helps. (I am in no way affiliated with peepcode.com) –  Dec 16 '09 at 22:08
0

I generally mistrust VMware's performance, so I would be very interested to see what would happen if they both had the same VMware config.

Uhh... just to be certain, what happens when you do:

 n = 1000000
 while n > 1 do
   n = n - 1
   a = []
   a.push 4
   a.length
 end

I know it's functionally equivalent to 1000000.times { a = []; a << 4; a.length }, but the times method might be doing something funky behind the scenes? I would also agree with the statement about threading, but that effectively means that the dev box is only 0.083GHz faster, which is not likely to account for a 55% drop. It would be interesting to see what Ruby 1.9 does, but if you've tried REE I doubt that the performance difference is going to be too large. I'm assuming it's exactly the same disk speed, memory caches etc?
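To make that comparison concrete, here is a rough sketch that times both loop forms in one run using the standard `benchmark` library. The method names `times_form` and `while_form` are mine, for illustration only.

```ruby
require 'benchmark'

# The block-based form from the question.
def times_form(n)
  n.times { a = []; a << 4; a.length }
end

# The explicit while-loop form. As written above it runs n - 1 iterations,
# which is negligible at this scale.
def while_form(n)
  while n > 1
    n = n - 1
    a = []
    a.push 4
    a.length
  end
end

n = 1000000
times_seconds = Benchmark.realtime { times_form(n) }
while_seconds = Benchmark.realtime { while_form(n) }
printf("times: %.3fs  while: %.3fs\n", times_seconds, while_seconds)
```

If both forms show the same slowdown on production, the times method is unlikely to be the culprit.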

-1

Is it because Ruby can't take advantage of the extra virtual cores due to being green threaded, while Python with native threading can? (Wild ass guessing here - I don't really know what I am talking about.)

user15164