Sorry, but there is no definite answer.
If you have HT enabled, you have two logical processors (LPs) per core. If you have it disabled, you have just one. (Using the term "logical processor" lets us talk about how the scheduler works without constantly qualifying what we mean by a "CPU".) Either way, the OS sees each logical processor as a processor, and except for some attempts at scheduling optimizations* the OS doesn't do anything else by, for, or because of hyperthreading.
From the time an LP context-switches to a thread, to the time it switches to some other thread, the LP is considered to be used 100% by that thread. The OS has no way to know whether a thread in an LP is using 10% of the core, or 90% of the core, or stalled completely because of something the thread in the other LP is doing. The OS just thinks it's running.
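To make the accounting point concrete, here is a minimal toy model in Python. It is only a sketch: the `Interval` record, the `core_share` field, and both helper functions are hypothetical names invented for illustration, and the 0.9/0.1 split is an assumed example, not a measurement. No real OS has access to anything like `core_share`, which is exactly the point.

```python
from dataclasses import dataclass


@dataclass
class Interval:
    """One stretch during which a thread occupied an LP (hypothetical record)."""
    thread: str
    wall_seconds: float  # time between switch-in and switch-out on the LP
    core_share: float    # assumed fraction of the core actually obtained; invisible to the OS


def os_cpu_time(intervals, thread):
    """CPU time as the OS accounts it: the full wall time spent on the LP."""
    return sum(i.wall_seconds for i in intervals if i.thread == thread)


def effective_core_time(intervals, thread):
    """What the thread really got from the core, under the assumed shares."""
    return sum(i.wall_seconds * i.core_share for i in intervals if i.thread == thread)


# Two threads run for the same 10 s, one in each LP of a single core.
intervals = [
    Interval("A", 10.0, 0.9),  # assumed: A got most of the core
    Interval("B", 10.0, 0.1),  # assumed: B was mostly stalled by A
]

for name in ("A", "B"):
    print(name, os_cpu_time(intervals, name), effective_core_time(intervals, name))
```

In this model both threads are charged the same CPU time even though the work they could do differs greatly, and the total charged time is twice the wall time for a single core.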
Nor does HT implement anything like thread priorities. So if two threads are trying to run in the two LPs on one core, and one is set in the OS to a higher priority than the other, the core can't do anything about that - there's no way it can even know. The core will treat the two threads as having the same priority and will assign microarchitecture resources accordingly.
*Optimizations: Modern OSs are aware of the relationship of LPs to cores and will try, for example, to use just one LP out of each core until more than number_of_cores threads want to run; the two LPs of a core are considered equivalent as far as cache investment is concerned; etc.
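The "one LP per core first" preference in the footnote can be sketched as a toy placement routine. This is an illustration only, not how any real scheduler is implemented: `pick_lp` and `place_threads` are hypothetical names, and real schedulers weigh many other factors (affinity, cache state, power, load balancing).

```python
def pick_lp(busy, lps_per_core=2):
    """Return (core, lp) for a new thread, or None if every LP is busy.

    busy[c] is the set of LP indices currently in use on core c.
    """
    # First pass: prefer a core on which no LP is busy.
    for core, used in enumerate(busy):
        if not used:
            return core, 0
    # Second pass: every core already runs a thread, so use a free sibling LP.
    for core, used in enumerate(busy):
        for lp in range(lps_per_core):
            if lp not in used:
                return core, lp
    return None


def place_threads(n_threads, n_cores, lps_per_core=2):
    """Place threads one by one and return the list of (core, lp) slots used."""
    busy = [set() for _ in range(n_cores)]
    placements = []
    for _ in range(n_threads):
        slot = pick_lp(busy, lps_per_core)
        if slot is None:
            break  # all LPs are busy; remaining threads would have to wait
        core, lp = slot
        busy[core].add(lp)
        placements.append(slot)
    return placements


print(place_threads(4, 4))  # one thread per core, second LPs left idle
print(place_threads(5, 4))  # the fifth thread lands on a sibling LP
```

With four cores, the first four threads each get a core to themselves; only the fifth has to share a core with another thread.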
Jamie, thanks for a thorough explanation. As far as I understand, what you are trying to say is that in case I have NTHREADS == NCORES - logical core time will actually count as CPU time disregarding the actual amount of instructions it was able to share. So I will virtually get double the CPU time spent with HT enabled and probably get half the average FLOPS per (LP)core in case of (completely) fair queuing? – grandrew – 2016-07-14T10:15:41.050
Q1: yes, though I don't get your phrase re "instructions it was able to share" - what would you be sharing in that situ.? Q2: Yes, with HT enabled and NTHREADS==2xNCORES you'd see 2x the apparent CPU time used, but not 2x the work done. 3rd: if the core's FP unit was your bottleneck when HT was disabled, then with HT and two threads per core, the total FP work done would be about the same as with one thread per core, but each thread would only get half the FP thruput. However, FP performance also depends on things other than the FP unit (like memory access), so this isn't certain. – Jamie Hanrahan – 2016-07-14T10:32:56.960
As I see it, when two threads compete for time of a single core in HT cpu - the threads can "share" some of the core instead of waiting for core to free up - in case core finds it has anything to share with another(waiting) thread while executing the first thread. – grandrew – 2016-07-18T12:23:33.553
Well... there is no concept of "the first thread", i.e. neither of the LPs has precedence. The core's hardware tries to schedule the core's resources to allow both LPs to make progress. – Jamie Hanrahan – 2016-07-19T13:21:22.427