(This post is asking for speculation and I'm happy to oblige.)
Why not continue to ship fewer but faster cores per chip for the same price?
The problem is that current technology has hit its limits, so only minor
single-core performance improvements are still possible. Gains of 10-20% per
generation just don't sound very convincing.
On the other hand, manufacturers do not wish to fall behind
Moore's law,
or rather its popular corollary, which says that chip performance roughly
doubles every 18 months (with no increase in power consumption); strictly
speaking, Moore's observation concerned transistor counts doubling about
every two years. Doubling means a 100% improvement per generation, and no
single-core technology delivers that.
Solution: double the number of cores and add up their total capacity,
presenting the sum as proof that performance is still doubling fast enough.
In real life, this theoretical doubling of the core count is not guaranteed
to double total performance, since some computer resources are shared and
can become bottlenecks, for example the RAM, the bus and the disk.
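To see the bottleneck effect, here is a minimal benchmark sketch (my illustration, not from the original answer; it assumes numpy is installed). Each worker process scans a large array, so the task is dominated by memory traffic. With one such task per worker, the wall time would ideally stay flat as workers are added; once shared RAM bandwidth saturates, it grows instead.

```python
# Run one memory-heavy task per worker process and time the batch.
# Perfect scaling would keep the wall time roughly constant; contention
# on shared RAM/bus bandwidth makes it climb as workers are added.
import time
from multiprocessing import Pool

import numpy as np  # third-party: pip install numpy

def scan_memory(_):
    # Allocate and sum ~400 MB so memory traffic, not arithmetic,
    # dominates the task.
    data = np.ones(50_000_000, dtype=np.float64)
    return float(data.sum())

if __name__ == "__main__":
    for workers in (1, 2, 4, 8):
        start = time.perf_counter()
        with Pool(workers) as pool:
            pool.map(scan_memory, range(workers))  # one task per worker
        print(f"{workers} workers: {time.perf_counter() - start:.2f} s")
```

The exact numbers depend heavily on the machine; the point is only that the speedup from extra cores is capped by whatever shared resource saturates first.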
What does decreasing per-thread performance for the same micro-architecture bring?
Increasing the number of cores cannot be done indefinitely, mainly because
of power consumption. A core needs more power to run faster, and a chip has
a fixed power and cooling budget. So the more cores you have, the smaller
the share of that budget each core gets, and the slower each core must run.
The workaround here is turbo mode, whereby one core gets most of the
available power budget: you get one fast core while the others are turned
off or slowed down. But since a single core cannot sustain turbo mode
indefinitely without overheating, the CPU rotates turbo mode among the cores.
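This rotation can be observed from user space. Here is a minimal sketch (my illustration, not from the answer), assuming psutil is installed and the OS reports per-core frequencies (Linux with cpufreq does): it spins one busy thread and samples every core's clock, so one core should sit near its turbo frequency while the others stay near base or idle clocks.

```python
# Spin one CPU-bound thread, then sample per-core frequencies once per
# second; the boosted core stands out, and on many CPUs the boost hops
# between cores over time.
import time
import threading

import psutil  # third-party: pip install psutil

def busy(seconds):
    end = time.perf_counter() + seconds
    while time.perf_counter() < end:
        pass  # pure CPU load, at most one core's worth under the GIL

threading.Thread(target=busy, args=(6,), daemon=True).start()

for _ in range(5):
    freqs = psutil.cpu_freq(percpu=True)
    print([round(f.current) for f in freqs])  # current MHz per logical core
    time.sleep(1)
```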
In general, for comparable technology, a CPU with fewer cores may prove faster
than a many-core CPU in a core-to-core comparison. Other factors also influence
speed, but the trade-off between core count and single-core performance is often
the deciding question; whether the workload can exploit turbo mode is another.
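The comments below make this concrete for CPython, whose global interpreter lock (GIL) lets only one thread execute Python bytecode at a time, so a CPU-bound Python program sees only single-core performance. A minimal sketch of the effect (my illustration, not part of the original answer):

```python
# CPU-bound counting, once on one thread and once split across four.
# Under CPython's GIL the threaded run is no faster (often slower, due
# to lock contention), so only single-core speed matters here.
import time
from threading import Thread

def count(n):
    while n:
        n -= 1

N = 40_000_000

start = time.perf_counter()
count(N)
print(f"1 thread : {time.perf_counter() - start:.2f} s")

start = time.perf_counter()
threads = [Thread(target=count, args=(N // 4,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"4 threads: {time.perf_counter() - start:.2f} s (no speedup under the GIL)")
```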
Or rephrased: why is it better to run a huge Python workload on a $117 22 nm Core i3 Ivy Bridge CPU than on a $2000 10 nm Xeon Cannon Lake CPU? (since CPU-bound Python programs can run only one thread at a time) – user2284570 – 2019-10-27T04:27:04.743
The answer is simple: because server CPUs are optimized for multi-core workloads. That is the typical scenario of a server (web/database server, computing cluster, ...). – Robert – 2019-10-27T10:05:52.917
@Robert that’s what I said in my question. But why isn't putting in fewer but faster cores equivalent to putting in more but slower cores? – user2284570 – 2019-10-27T14:33:52.093
This is a question of the software you run. Typical server software can make use of multiple cores; a Python program cannot. Some typical desktop programs can make use of a few cores. – Robert – 2019-10-27T17:11:15.917
@Robert do you know that Python is powering the backend of several Google websites (though not the most used ones)? – user2284570 – 2019-10-27T17:36:37.330
Python web systems use multiple processes instead of multiple threads, so in that case Python can be used on a server without problems. – Robert – 2019-10-28T09:05:04.553
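A minimal sketch of the multi-process pattern Robert describes (my illustration, reusing the counting function from the threading sketch above): each worker process is a separate interpreter with its own GIL, so the same CPU-bound work really does scale across cores.

```python
# The same CPU-bound counting as before, but split across four worker
# processes. Each process has its own interpreter and GIL, so the four
# chunks genuinely run on four cores in parallel.
import time
from multiprocessing import Pool

def count(n):
    while n:
        n -= 1

if __name__ == "__main__":
    N = 40_000_000
    start = time.perf_counter()
    with Pool(4) as pool:
        pool.map(count, [N // 4] * 4)  # one chunk per worker process
    print(f"4 processes: {time.perf_counter() - start:.2f} s")
```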
@Robert except if you have a graph with 2 trillion possibilities (as in my case), where walking one path takes 5 ms, so that creating a process each time or relying on IPC is too much overhead. – user2284570 – 2019-10-28T12:43:10.183