Is a higher core count or higher clock speed more beneficial to a computer's performance?

With falling silicon costs and rising consumer demands, manufacturers seem to be pushing one of two things: clock speed or core count. The way things are going, processor clock speeds no longer seem to be rising; the number of processor cores is rising instead.

I remember that only a few years back I had a nice, fast single-core Pentium 4 processor. Fast-forward to today, and I don't think you can even purchase a single-core processor (not to mention the growing number of multicore processors, even in cellphones). The way things are going, we might find computers with hundreds of cores in a few years (and I know many operating systems already support that).

Is it more beneficial to a system's overall performance to increase the clock speed, or to increase the number of cores? Assume we're getting into hundreds of cores all running together, or clock speeds ten times higher than what we have today (regardless of whether or not that is physically possible).

What are some examples of common processes (e.g. encryption, file compression, image/video editing) that will benefit most from one or the other? Are there some processes which can be, but currently aren't (due to technical reasons), sped up by increasing their parallelism?

Assume the hypothetical processor has the exact same core design (word size, address bit width, memory bus size, cache, etc...), so the only variables here are clock speed and core count. And again, I'm not talking about one, two, or even four cores - imagine tens to hundreds.

Breakthrough

Posted 2011-08-17T19:27:50.770

Reputation: 32 927

It's all going to depend on what you want to do on that computer. Multiple cores are good for some things, higher clock speeds for others. – ChrisF – 2011-08-17T19:30:16.660

@ChrisF I personally know the answer, but I'm asking this for two reasons. The first is to have this information on the website (I've only seen it asked in relation to dual or quad core processors), and the second is to try to give people an idea of what's to come "in the future" and to show what the applications are of both sides of the equation. – Breakthrough – 2011-08-17T19:32:57.457

It would be better to rework the question a bit. At the moment it reads like a "list of X" question where each answer is equally valid (especially 'cos of that last sentence). – ChrisF – 2011-08-17T19:34:18.420

The general answer is "yes". Or perhaps "maybe". Processor speed is really limited by memory access speed -- the effective MIPS rate of a processor is typically 10-30% of the max rate, due to memory delays. Multiple processors can both help and hurt this situation, depending on the memory subsystem design and the type of applications being executed. I recall one case where adding a second processor increased throughput by only about 10% for the average workload, due to memory contention. – Daniel R Hicks – 2011-08-17T19:35:01.057

@ChrisF updated the question to try to direct the flow a bit more. This is a very abstract topic, and again, I want people to try to think "towards the future". Imagine +20 GHz clock speeds versus 128 cores. Obviously we have to take into account Amdahl's law (and I would expect it to show up in at least one answer), but that law also makes some assumptions about the workload.

– Breakthrough – 2011-08-17T19:38:03.937

@DanH then that also shows another implicit problem - processor cache. Obviously some (but not all) memory delay problems can be solved by an increased CPU cache, but what if a computer had multiple memory controllers (all with multiple, segregated amounts of RAM accessible by a single core) that could interface with a central "datastore" of memory (accessible by all cores)? AFAIK, nothing like this exists yet, but this is the kind of thinking I want to see in the answers (solutions to tomorrow's problems, basically). – Breakthrough – 2011-08-17T19:40:55.023

I appreciate the thought here but I'm sort of onboard that this question is way too broad to really be good, although the edits help a lot. This is really getting into computational complexity, though, and might honestly be better at math.SE maybe? The meat here really boils down to the last paragraph - what's the effect of parallelism on certain types of computations? – Shinrai – 2011-08-17T19:42:07.447

@Shinrai I will admit the thought crossed my mind when I was posting this, but felt it had a better fit here at SuperUser. I understand if this should be closed for being too broad, but would it also be worth considering making it part of the community wiki? – Breakthrough – 2011-08-17T19:43:51.760

I would say that while this is a good question for talking over while having a pint, it is pretty much not a good stack-exchange question. – EBGreen – 2011-08-17T19:46:27.920

There are too many variables, what ifs and other parameters + ongoing technology changes to develop a succinct answer that will be relevant for more than a specific period of time. This is an interesting topic for a forum or blog, but not as something to be pinned down as an 'answer'. I have voted to close for this reason so let the flaming begin!!! – Linker3000 – 2011-08-17T20:24:51.520

@Breakthrough -- Ultimately cache is just another layer of memory and another bottleneck -- the MIPS rates I quoted assumed cache. Most MPs have 2-3 layers of cache, in addition to "main store". And all manner of NUMA configurations exist, some with a common backing store for the multiple processors, some where each processor has an independent store but they "steal" from each other, etc. – Daniel R Hicks – 2011-08-17T21:07:10.547

@Linker3000 no flaming here, you have a valid point. Hopefully this question can be further explored in the future (depending on how our technology progresses). Regardless, I think everyone who looked at this question should read the following news article (it's really cool): IBM produces first working chips modeled on the human brain.

– Breakthrough – 2011-08-19T00:26:36.320

Answers

There are two basic situations that are to be considered:

The processor is used with a computer that solely does calculations for a single program
The processor is used for multiple programs running at the same time

The first situation is where processor 'speed' is more important, as the user wants the ability to make calculations quickly and efficiently. These situations typically involve calculation-intensive processing, e.g. calculating prime numbers for encryption/decryption.

The second is where multiple cores come in handy, as each program can be assigned to a separate core, which keeps the programs from bottlenecking each other. In today's world, the average user is going to be using their computer for multiple programs at a time, which makes multi-core processing desirable.

However, multi-core != faster speeds or higher performance in all cases. Since most programs are written for single-core processing*, clock speed is still important to look at. A combination of both must be taken into consideration (along with many other factors as well).

* There are some programs, and hopefully soon more will be created, where multiple cores can be used at the same time. The future of software is found with this "Parallel Programming":

"Software developers can no longer rely on increasing clock speeds alone to speed up single-threaded applications; instead, to gain a competitive advantage, developers must learn how to properly design their applications to run in a threaded environment. Multi-core architectures have a single processor package that contains two or more processor "execution cores," or computational engines, and deliver, with appropriate software, fully parallel execution of multiple software threads."

- Intel
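To make the "Parallel Programming" footnote concrete, here is a minimal sketch of a calculation-heavy job (counting primes, echoing the example above) split across several cores. It uses Python's standard `concurrent.futures.ProcessPoolExecutor`; the helper names (`is_prime`, `count_primes`, `split_range`, `count_primes_parallel`) and the range chosen are illustrative assumptions, not taken from any particular program.

```python
from concurrent.futures import ProcessPoolExecutor
import os


def is_prime(n: int) -> bool:
    """Trial division; deliberately simple and CPU-bound."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    i = 3
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


def count_primes(bounds: tuple[int, int]) -> int:
    """Count primes in the half-open range [lo, hi)."""
    lo, hi = bounds
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(lo: int, hi: int, parts: int) -> list[tuple[int, int]]:
    """Split [lo, hi) into `parts` contiguous chunks of near-equal size."""
    step, extra = divmod(hi - lo, parts)
    chunks = []
    start = lo
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        chunks.append((start, end))
        start = end
    return chunks


def count_primes_parallel(lo: int, hi: int, workers: int) -> int:
    """Each worker process counts one chunk; the partial counts are summed."""
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, split_range(lo, hi, workers)))


if __name__ == "__main__":
    workers = os.cpu_count() or 1
    print(count_primes_parallel(0, 200_000, workers))
```

The work only spreads across cores because the programmer split it into independent chunks. The unmodified `count_primes` runs on a single core no matter how many are available, which is the situation described for most programs. The equal-width chunks are also not equal work, since larger numbers take longer to test, so the speedup would likely fall short of the worker count.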

James Mertz

Posted 2011-08-17T19:27:50.770

Reputation: 24 787

Best answer I've seen thus far, +1. Do you know if it is possible to speed up encryption/decryption with some kind of parallel algorithm (or does such a thing even exist)? – Breakthrough – 2011-08-17T21:43:02.533

I personally think core count is the way to go. Software development has shifted to networked systems so no longer are local resources the only resources available to you. The most important factor in how you work now is what network you are a part of.

Notice the shift to mobile broadband, constant connectivity, remote access, etc. Constant connectivity, in turn, requires battery life. While it is debatable which CPU factors are better for battery life (you've got a classic optimization equation of work value vs. time), if I had to pick one, I'd pick more cores.

Intel now allows you to power cores on demand. While not as optimal as having no cores to sleep, having the option to use more cores gives you the flexibility to run more applications off the same hardware platform.

surfasb

Posted 2011-08-17T19:27:50.770

Reputation: 21 453

First of all, single-core speeds have not really gone down that much. The only reason Intel's current Sandy Bridge lineup does not top single-core Pentium 4s in terms of megahertz is that Intel lacks competition, so they don't have to push that hard.

Second, clock speed is not everything, even on a single core. When looking at application performance, again against the Pentium 4, the current Intel lineup is around 50% faster per clock cycle. The reasons why Sandy Bridge is faster per clock cycle than the Pentium 4 (Prescott being its last incarnation) are many, but an intelligent prefetching memory controller, having the memory controller on the same die as the CPU, and higher Instruction Level Parallelism (ILP) all contribute.

Instruction level parallelism basically means that the processor looks at the instructions and their dependencies, and if two instructions do not depend on each other, the CPU can start loading data for both at the same time and possibly reorder the instructions if the data for one of them arrives before the other.
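As a rough illustration of the dependency idea, the toy model below counts how many cycles a short instruction sequence needs when the processor can start up to `width` independent instructions per cycle. It is a sketch under loud assumptions: every instruction takes exactly one cycle, and the function `cycles_needed` and the five-instruction `program` are made up for this example. Real CPUs are far more complicated.

```python
def cycles_needed(deps: dict[str, set[str]], width: int) -> int:
    """Cycles to run all instructions when up to `width` ready ones start per cycle.

    `deps` maps each instruction name to the names it must wait for.
    Every instruction is assumed to take exactly one cycle (toy model).
    """
    done: set[str] = set()
    pending = list(deps)  # keeps program order
    cycles = 0
    while pending:
        # An instruction is ready once everything it depends on has finished.
        ready = [name for name in pending if deps[name] <= done]
        if not ready:
            raise ValueError("circular or unknown dependency")
        issued = ready[:width]
        done.update(issued)
        pending = [name for name in pending if name not in issued]
        cycles += 1
    return cycles


# Hypothetical program: two loads feed an add, whose result and a third
# load feed a multiply.
program = {
    "load_a": set(),
    "load_b": set(),
    "add": {"load_a", "load_b"},
    "load_c": set(),
    "mul": {"add", "load_c"},
}

if __name__ == "__main__":
    for width in (1, 2, 4):
        print(width, cycles_needed(program, width))
```

In this model, going from one to two instructions per cycle shortens the run, but a width of four gives no further gain because `mul` has to wait for `add`, which has to wait for the loads. The dependency chain, not the hardware width, becomes the limit.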

Third, some applications do benefit very nicely from multiple cores. For example, Photoshop almost always prefers more cores over operating frequency, i.e. even a slow quad-core almost always beats any dual-core chip, and any dual-core beats any single-core chip. Tri-cores are a mixed bag: they often win over dual-cores, but not always.

Generally, applications that perform the same kind of operation on many different sets of data benefit most from parallelism. For example, video compression and photo editing can often be parallelized quite easily. On the other hand, computer games have proved hard to parallelize. Their graphics of course parallelize very well, but that part is executed on the GPU, not the CPU. The remaining physics, game-world bookkeeping and AI parallelize less easily.

Zds

Posted 2011-08-17T19:27:50.770

Reputation: 2 209

As ChrisF mentions in a comment, it depends. But as answers like that aren't really answers, I'll try to lay out some scenarios where one will be more beneficial than the other:

In most of the common processes you mention, the number of cores isn't going to matter very much, since most of the work is done in a single thread which can only execute on a single core (at a time). For such processes, a single but very powerful core will perform better than a couple of slower cores. Both encryption and file compression could be exceptions to this, but it depends a lot on which algorithms are used and whether they can be executed in parallel.

However, you have forgotten one of the most common tasks performed on computers today: browsing. Several popular browsers open each tab in a separate process (Chrome being the only one I'm sure does this, since it's the one I use), meaning that if you have four tabs open on a quad-core system, each tab can (in theory) have a core "to itself" (ignoring OS threads and such), and be as fast as if there were no other browser tabs/windows open. For people who browse with many tabs open at a time, this can be a serious performance improvement without having to build extremely fast CPU cores.

The key to knowing whether a multi-core system with slower cores will be faster than a single-core system with a fast core is knowing if you will do a lot of different things simultaneously or a few, but heavy, things. As this will differ a lot from user to user, so will the answer to your question.
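One way to put rough numbers on this trade-off is Amdahl's law, which the question's comments mention. The sketch below compares the question's two hypothetical machines (128 cores at today's clock vs. one core at ten times the clock) for a single program in which a fraction `parallel_fraction` of the work can be spread across cores. The function name `relative_speed` is an assumption for this example, and the model ignores memory contention, scheduling overhead and everything else that is not core count or clock speed.

```python
def relative_speed(parallel_fraction: float, cores: int, clock_factor: float) -> float:
    """Speedup over one baseline core, per Amdahl's law scaled by clock speed.

    The serial part (1 - p) runs on one core; the parallel part p is split
    evenly across `cores`. A faster clock speeds up both parts equally.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial = 1.0 - parallel_fraction
    return clock_factor / (serial + parallel_fraction / cores)


if __name__ == "__main__":
    for p in (0.0, 0.5, 0.9, 0.95, 0.99, 1.0):
        many_cores = relative_speed(p, cores=128, clock_factor=1.0)
        fast_clock = relative_speed(p, cores=1, clock_factor=10.0)
        print(f"p={p:.2f}  128 cores: {many_cores:6.1f}x   10x clock: {fast_clock:4.1f}x")
```

Under these assumptions the 128-core machine only overtakes the 10x-clock machine once roughly 91% or more of the program's work can run in parallel. Below that, the single fast core wins. Running many independent programs at once is a different case: there the cores are not competing for one program's serial part, so this formula does not describe it.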

The other answers make a couple of important points too:

Processor performance isn't all about clock speed or number of cores anymore - other parts of the processor are becoming bottlenecks as clock speed and core count improve.
For most users, processor performance isn't even the bottleneck to begin with. If you're spending your time in hosted applications like Google Docs, the speed of your network card is going to matter more than the speed of your processor core(s). If you're watching or editing high-res movie material, hard disk performance is going to matter more. Etc...

Tomas Aschan

Posted 2011-08-17T19:27:50.770

Reputation: 2 294

+1 for putting some thought in and actually coming up with explanations, but I would like to point out one thing: yes, some browsers put each tab in a separate process, but that's just done in case the process crashes. Most browsers at the very least run each tab in a separate thread, and operating systems have the ability to run multiple threads (from the same process) on different cores. – Breakthrough – 2011-08-17T20:05:44.237

IE 9 does the multi-process method. However, I believe they use a set number of processes and just share all the tabs between those processes. It results in many fewer processes than Chrome can end up with, while still meaning you'll only lose a few tabs if it all comes down in one Redmond-tinged heap. – music2myear – 2011-08-17T20:19:19.333

Oh, and it also depends on the software. While the OS can generally handle traffic management and send waiting threads to available cores (mental picture of match.com going on inside my grotty silicon), programs that are multi-threaded themselves (most of the Adobe Creative Suite products and other well-supported, modern multimedia development tools) will take much better advantage of the capabilities of a multi-core system. – music2myear – 2011-08-17T20:21:25.413

Actually, today the most important factor isn't the processor's clock speed; a lot of new features have been launched since this "factor of comparison" fell into disuse.

Today you must look at many factors to judge a processor's performance. Things like:

number of cores
number of parallel operation threads
processor family (Dual core, Pentium, Core i / Calpella, Sandy Bridge, etc.)
processor generation (2nd, 6th, etc.) and then
processor's clock speed.

Actually, when I want to compare processor speeds I consult the PassMark or Notebookcheck benchmark tables. Benchmarks, in my opinion, are the best way to measure and compare processor speed and performance.

Diogo

Posted 2011-08-17T19:27:50.770

Reputation: 28 202

Yes, but to simplify things, let's assume all else is equal (same amount of cache per core, same address bus width, same word size, etc...). The cores themselves are the exact same, it's just a) how many, or b) how fast. – Breakthrough – 2011-08-17T19:52:29.500

Um, the processor family and generation have nothing inherently to do with processor speed. After all, the Atom procs are much newer than P4s or the Core and Core2 procs, but nobody would ever argue they're faster. Other things that have a more direct effect on CPU speed are on-die cache, the number of registers, the architecture of the chip, the size of the conductive pathways (nm manufacturing process), floating point operation capabilities, etc. – music2myear – 2011-08-17T20:25:15.677

@music2myear - When referring to generation I'm talking about family ( http://superuser.com/questions/314757/whats-the-main-difference-between-intel-processor-generations ). Actually, the Atom D525 is better than a lot of Core 2 processors ( http://www.cpubenchmark.net/cpu_lookup.php?cpu=Intel+Atom+D525+%40+1.80GHz )

– Diogo – 2011-08-17T20:28:48.863