
I am running a multithreaded Python application with multiple processes that scrapes data from some websites. On my local machine it works great, but on the VPS I am using (CentOS 5.8, 2.6 GHz with 4 cores) it runs very slowly.

According to nethogs, the network usage is very low: around 8 KB/s with 15 threads. On my PC, by contrast, I get around 100-120 KB/s.

I have read about the Python GIL and its threading limitations. It seems the GIL is never released on the VPS, though it should be while doing I/O.

Is there any configuration on the VPS that I need to change for threading to work properly?

UPDATE: Multithreading is actually working; it's the CPU that was causing the problem. 15 threads were too many for it, and it became too busy with thread switching. Though the VPS claims to have a 2.6 GHz CPU, I don't think it actually does. Is there a way to measure the real processor speed on the VPS?

1 Answer


You are almost certainly sharing the CPU with other VPSs on the same host, so you can't expect the same performance as from a dedicated CPU. The GIL doesn't behave differently on different CPUs, so that's not the cause. Use top to check the CPU utilisation on the VPS; you're mainly interested in the Cpu line, which looks like this:

Cpu(s): 30.2%us,  7.8%sy,  0.0%ni, 41.0%id, 20.8%wa,  0.0%hi,  0.2%si,  0.0%st

With a recent hypervisor and OS you should see a non-zero st number -- this is the CPU time "stolen" by other VMs on the same host, from which you can figure out what proportion of the CPU you are getting.
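If you would rather measure the steal figure from a script than read it off top, here is a minimal sketch in Python. It assumes a Linux guest whose kernel exposes the steal counter as the eighth value on the aggregate cpu line of /proc/stat, and a Python new enough for dict comprehensions (2.7 or later). The function names are made up for illustration; they are not part of any library.

```python
import time

# Order of the first eight counters on the "cpu" line of /proc/stat.
FIELDS = ("user", "nice", "system", "idle", "iowait", "irq", "softirq", "steal")


def parse_cpu_line(stat_text):
    """Return the aggregate CPU counters (in clock ticks) from /proc/stat text."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            values = [int(v) for v in parts[1:1 + len(FIELDS)]]
            # Older kernels may omit trailing fields; treat them as zero.
            values += [0] * (len(FIELDS) - len(values))
            return dict(zip(FIELDS, values))
    raise ValueError("no aggregate 'cpu' line found")


def cpu_percentages(before, after):
    """Convert two counter snapshots into percentages over the interval."""
    deltas = {name: after[name] - before[name] for name in FIELDS}
    total = sum(deltas.values())
    if total <= 0:
        raise ValueError("counters did not advance between samples")
    return {name: 100.0 * delta / total for name, delta in deltas.items()}


def sample_proc_stat(interval=1.0, path="/proc/stat"):
    """Read /proc/stat twice, `interval` seconds apart, and return percentages."""
    with open(path) as f:
        before = parse_cpu_line(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_line(f.read())
    return cpu_percentages(before, after)
```

Calling sample_proc_stat()["steal"] on the VPS while the scraper is running gives the percentage of CPU time taken by other guests during that interval. If it stays at zero on a busy shared host, the hypervisor or kernel may simply not be reporting it.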

mgorven