When running a compute-intensive task on a server with an Intel i7 quad-core processor with Hyperthreading, is it ideal to run eight threads (for the eight virtual cores), or only four (for the four physical cores)? Each thread achieves consistent 100% utilization of a virtual core.
3 Answers
8 threads would be ideal, assuming there's no significant additional overhead in result combining or anything like that. With only four threads, any execution units that couldn't be saturated by the single thread per virtual core would be wasted. With eight threads, they can be used.
Note that this holds only under the unrealistic assumption that each thread can saturate a core. It also might not hold if dividing the processor's cache resources between two threads hurts performance. Some tasks have performance that "falls off a cliff" at a certain cache size. If your cliff is between the full cache size of the physical core and half that cache size, then four threads might be better.
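Since the cache "cliff" depends on the workload, one way to settle it is to time both thread counts directly. The following is a minimal sketch, not a definitive benchmark: `burn` is a hypothetical stand-in for the real task, the names `time_workers` and `candidate_counts` are made up for illustration, and two hardware threads per core is an assumption that matches the i7 in the question.

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor


def burn(n):
    """Hypothetical CPU-bound stand-in for the real task: sum of squares below n."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def candidate_counts(logical=None, threads_per_core=2):
    """Worker counts worth comparing: one per physical core and one per logical core.

    threads_per_core=2 is an assumption (Hyper-Threading on an i7).
    """
    logical = logical or os.cpu_count() or 1
    physical = max(1, logical // threads_per_core)
    return sorted({physical, logical})


def time_workers(workers, chunks=32, n=2_000_000):
    """Wall-clock seconds to finish `chunks` identical chunks using `workers` processes."""
    start = time.perf_counter()
    with ProcessPoolExecutor(max_workers=workers) as pool:
        list(pool.map(burn, [n] * chunks))
    return time.perf_counter() - start
```

Calling `time_workers(w)` for each `w` in `candidate_counts()` from a script's `if __name__ == "__main__":` block gives one wall-clock figure per configuration. The result is only meaningful if `burn` is replaced by the actual workload, because a toy loop like this one barely touches the cache.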
The doctrine I was taught when compiling was 1.5x the number of cores. This accounts for any time when a thread/process is waiting on I/O.
If your task has no chance of blocking on slower operations like I/O, then there may be no need to exceed the number of cores. If it can block, you will want more processes than cores.
Look at it this way: if you have four cores and three processes, you can never achieve 100% CPU. The same is true for four processes when one of them is blocking on I/O. If you have six processes with no blocking, you might be slightly less efficient, as the kernel uses up some CPU time swapping the processes on and off the four cores, but no core will ever be idle.
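The arithmetic above can be written down as a small sketch. The function names are hypothetical, and `busy_cores` is a deliberately simplified model that ignores scheduling overhead:

```python
import math


def suggested_jobs(cores, may_block, factor=1.5):
    """Rule of thumb from this answer: 1.5x the cores if the work can block on I/O."""
    if not may_block:
        return cores
    return math.ceil(cores * factor)


def busy_cores(cores, processes, blocked):
    """Simplified model: a core is busy only if a runnable process exists for it."""
    runnable = max(0, processes - blocked)
    return min(cores, runnable)


print(suggested_jobs(4, may_block=True))      # 6
print(busy_cores(4, processes=3, blocked=0))  # 3: one core always idle
print(busy_cores(4, processes=4, blocked=1))  # 3: same while one process waits on I/O
print(busy_cores(4, processes=6, blocked=0))  # 4: no core idle
```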
Unfortunately I have no idea about the physical/virtual aspect of your question.
My guess is that it is optimal to use one task per core and disable hyperthreading.
If I start as many CPU-intensive threads as I have logical cores, context switches will be fast for the CPU-intensive tasks but expensive for the background tasks, since hyperthreading is totally consumed by the CPU-intensive tasks. On the other hand, if I start as many CPU-intensive threads as I have physical cores, there will be no context switches for those tasks and fast context switches for the background tasks. That seems good, but the background tasks will find free logical processors and run almost immediately. It is as if they had real-time priority (nice -20).
I do not know how fast a context switch between two tasks on the same core is. I am also afraid that sharing the cache between two threads on the same core will lower the cache hit rate (unless they are running the same program and it is smaller than 1 MB). I doubt it comes without penalties. My intuition is that a complex CPU-intensive task will run faster with one task per core than with one task per virtual processor. But if you do that, you will leave two virtual processors free, and background tasks will get priority they should not have.
In the first scenario hyperthreading is useless: the background tasks will incur expensive context switches because I maxed out hyperthreading with the normal processing. The second is unacceptable because up to 50% of my CPU power gets prioritized for the background tasks.
I usually disable hyperthreading on my Intel desktops and servers. I show how in https://serverfault.com/a/720471/309821.
But this is based on guesswork. I have the impression that it is better, but it may not be.
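For anyone who wants to test the "one task per physical core" idea without turning hyperthreading off, a Linux-only sketch follows. It is not the method from the linked answer. It assumes the sysfs topology files (`thread_siblings_list`) are readable and that `os.sched_setaffinity` is available; the function names are hypothetical.

```python
import glob
import os


def parse_cpu_list(text):
    """Parse a sysfs CPU list such as '0,4' or '0-1' into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        elif part:
            cpus.append(int(part))
    return cpus


def one_cpu_per_core(sibling_lists):
    """Keep the lowest-numbered logical CPU from each group of hyperthread siblings."""
    return sorted({min(parse_cpu_list(s)) for s in sibling_lists if s.strip()})


def read_sibling_lists():
    """Read each logical CPU's sibling list from sysfs (Linux only)."""
    pattern = "/sys/devices/system/cpu/cpu[0-9]*/topology/thread_siblings_list"
    lists = []
    for path in glob.glob(pattern):
        with open(path) as f:
            lists.append(f.read())
    return lists


def pin_to_physical_cores():
    """Restrict the current process to one logical CPU per physical core."""
    cpus = one_cpu_per_core(read_sibling_lists())
    if cpus and hasattr(os, "sched_setaffinity"):
        os.sched_setaffinity(0, cpus)  # may raise if a CPU is not permitted
    return cpus
```

Pinning this way leaves the sibling logical processors free for background tasks, which is exactly the trade-off described above, so it is a way to measure the effect rather than a fix for it.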