
I was doing some simple hand benchmarking on our (live) database server during non-peak hours, and I noticed that queries returned somewhat erratic benchmark results.

I had enabled the "Balanced" power saving plan on all our servers a while ago, because I figured they were nowhere near high utilization and this way we could save some energy.

I had assumed this would have no significant, measurable impact on performance. However, if CPU power saving features are impacting typical performance -- particularly on the shared database server -- then I am not sure it's worth it!

I was a little surprised that our web tier, even at 35-40% load, is down-clocking from 2.8 GHz @ 1.25 V to 2.0 GHz @ 1.15 V.

I fully expect the down-clocking to save power, but that load level seems high enough to me that it should be kicking up to full clock speed.

Our 8-CPU database server has a ton of traffic but extremely low CPU utilization, just due to the nature of our SQL queries -- lots of them, but really simple ones. It's usually sitting at 10% or less, so I expect it was down-clocking even more than the web tier shown in the screenshot above. Anyway, when I turned power management to "high performance", I saw my simple SQL query benchmark improve by about 20%, and become very consistent from run to run.
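For context, the "simple SQL query benchmark" here is nothing fancier than timing the same query in a loop, along the lines of this sketch (the connection string and query are placeholders, and pyodbc is just one possible way to drive SQL Server from Python):

```python
# Quick-and-dirty query latency benchmark. The DSN and query below are
# placeholders, not real production values; pyodbc is just one possible
# client library.
import time
import statistics
import pyodbc

conn = pyodbc.connect("DSN=mydb;UID=bench;PWD=secret")  # hypothetical DSN
cur = conn.cursor()

timings = []
for _ in range(50):
    start = time.perf_counter()
    cur.execute("SELECT TOP 100 * FROM Posts WHERE Score > 10")  # placeholder query
    cur.fetchall()
    timings.append((time.perf_counter() - start) * 1000)

# Erratic CPU clocks show up as a high standard deviation between runs.
print(f"mean: {statistics.mean(timings):.1f} ms  "
      f"stdev: {statistics.stdev(timings):.1f} ms")
```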

I guess I was thinking that power management on lightly loaded servers was win-win -- no performance loss, and significant power savings because the CPU is commonly the #1 or #2 consumer of power in most servers. That does not appear to be the case; you will give up some performance with CPU power management enabled, unless your server is always under so much load that the power management has effectively turned itself off. This result surprised me.

Does anyone have any other experience or recommendations to share on CPU power management for servers? Is it something you turn on or off on your servers? Have you measured how much power you are saving? Have you benchmarked with it on and off?

Jeff Atwood
  • I hate to say this, but you provided your own answer. See the three bubbles next to "Performance" for Balanced mode, and six for High Performance mode? There's the difference :) Power saving is implemented largely by downclocking the CPU. You're going to see a performance hit from that, even though it can bring the clock back up under load. – Bill Weiss Dec 14 '09 at 18:15
  • yes, but "balanced" is **on by default** in Windows Server 2008 R2. This implies, to me, that there shouldn't be a significant performance penalty in typical use, e.g., a lightly loaded server. – Jeff Atwood Dec 14 '09 at 18:38
  • Three bubbles vs six! That looks like a huge difference! :) – Bill Weiss Dec 14 '09 at 19:31
  • Windows: Dumbing down humans since 1985. – Michael Graff Dec 16 '09 at 11:07

10 Answers


I'm not sure about servers, but the current thinking in embedded devices is not to bother with steps between low-power and flat-out, because the extra time spent at intermediate speeds will eat your power savings. Basically, they run at low power until they get any real amount of CPU load, at which point they flip over to fastest-possible so they can finish the job and get back to idling at low power.
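A minimal sketch of that "race to idle" policy (the 20% threshold and the sampling interval are made-up illustrative values, not from any real embedded governor):

```python
# Toy "race to idle" policy: no intermediate speed steps, just lowest or
# fastest. The 20% threshold and 1-second sample interval are arbitrary
# illustrative choices.
import psutil  # third-party CPU sampling library (pip install psutil)

LOW, FAST = "powersave", "performance"

def choose_state(load_pct, threshold=20.0):
    """Any real amount of load -> run flat out; otherwise idle at low power."""
    return FAST if load_pct >= threshold else LOW

for _ in range(10):  # sample for ~10 seconds
    load = psutil.cpu_percent(interval=1)  # average load over the last second
    print(f"load {load:5.1f}% -> {choose_state(load)}")
```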

pjz

I have always turned off any type of power management on servers. I am curious about what others have experienced, but I have always assumed that if the server is under-clocking, there will be some delay in 'stepping up' the CPU to 100%, and in a data-center setting any delay like that is unacceptable.

The data you provided seems to support this assumption. So, while I have not done any specific testing, it would seem that you should not use any power-saving technology in Windows or the BIOS. I even turn off the 'shut off network' and PCI card power-saving settings, to be ultra-conservative.

Dave Drager

How much power will this actually save you?
If you do decide that this feature might put the stability of your servers at risk (I have no experience with this), then you might look elsewhere for the energy savings.

I would try to find out just how much energy this would save for the number of servers you have (although perhaps you already did this). Since the graph you posted in your answer is in percentages, the savings for your company might amount to very little actual power. If you don't have many servers, it might not be much at all, and getting motion-activated lights or something like that in your office might save more energy (even though that is not as marketable).
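Back-of-the-envelope math makes this concrete; every number below (server count, watts saved, electricity price) is a placeholder assumption to substitute with your own measurements:

```python
# Rough annual savings estimate. Every input here is a made-up example;
# substitute your own measured numbers (e.g., from a Kill A Watt meter).
servers = 10
watts_saved_per_server = 25     # idle draw difference, Balanced vs High Performance
price_per_kwh = 0.10            # USD

kwh_per_year = servers * watts_saved_per_server * 24 * 365 / 1000
print(f"{kwh_per_year:.0f} kWh/year -> ${kwh_per_year * price_per_kwh:.2f}/year")
```

With these example inputs, that works out to about 2,190 kWh, or roughly $219 a year -- which may or may not justify the performance hit.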

I remember reading a few years back about one of the major American car companies (I forget which) being pressured to change the exhaust emissions of its cars. Instead, the company showed that capping some of its factories would be much cheaper for it and would result in far greater emissions savings.

Don't Forget Disks:
Also, you might want to check that these power-saving features don't spin down the disk(s) when they are not in use. If, for a little while, all the SQL query results were served from RAM, the disk might go unused and get put to sleep (not sure if it works like that, though). If this can happen, there would be a big performance penalty while everything spins up again.
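One rough way to test that theory is to time the same small read after increasingly long idle gaps and look for a sudden latency cliff; a sketch, with placeholder path and durations:

```python
# Probe for disk spin-down: time the same small read after longer and
# longer idle gaps and watch for a latency cliff. The path and idle
# durations are arbitrary placeholders. Note: if the OS still has the
# file cached, the read won't touch the disk, so results are suggestive
# rather than definitive.
import time

PATH = "probe.bin"  # hypothetical file on the disk in question

for idle_seconds in (0, 60, 600, 1800):
    time.sleep(idle_seconds)
    start = time.perf_counter()
    with open(PATH, "rb") as f:
        f.read(4096)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"after {idle_seconds:4d}s idle: first read took {elapsed_ms:.1f} ms")
```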

Kyle Brandt
  • oh yes, disk spin down is quite risky on servers. We had that enabled on our NAS and Brent Ozar (our SQL/DBA expert) advised us that this was a very bad idea.. every spin-up is like a lottery to see if the drive will actually make it back :) – Jeff Atwood Dec 15 '09 at 03:40
  • "Balanced" mode saves me 10 watts at idle compared to "High performace" mode (~15.5W vs ~25.5W on a core i7 4790). But, balanced mode on Windows server 2012 does not seem to ramp up to the max CPU clock speed which severely hurts database query performance. – Ryan Anderson Apr 29 '16 at 14:09

you will give up some performance with CPU power management enabled, unless your server is always under so much load that the power management has effectively turned itself off. This result surprised me.

Preface: I'm making some leaps/generalizations about Intel Xeons and their power-saving performance with SpeedStep. In reading about the Intel Xeon "Yorkfield" 45nm CPUs, Enhanced Intel SpeedStep Technology (EIST) and Enhanced Halt State (C1E) seem to be the real culprits here. I would agree with your assumption that turning on such power management features conserves energy, and that when the CPUs need the power under load, the system should return to normal voltage and clock speed settings. It appears, however, that EIST and C1E have side effects that aren't intuitively implied by either option in the BIOS. After crawling through numerous overclocking websites, it appears that these two settings cause quite a bit of frustration.

From http://www.overclock.net/intel-cpus/376099-speedstep-guide-why-does-my-processor.html:

C1E (Enhanced Halt State): C1E is the simpler of the two components. It can be enabled or disabled in the BIOS, and performs independently of the operating system. C1E has two configurations - idle, and load. When CPU usage is relatively low, this feature lowers your processor's multiplier to its lowest setting (usually 6x) and slightly lowers its vCore. During a CPU-intensive application, it will raise the multiplier to its maximum value, and will provide a small boost in vCore to compensate. In our example, C1E will make your processor run at either 6x or 9x the FSB.

EIST (Enhanced Intel SpeedStep Technology): This is a very robust feature and has a wide variety of power-saving capabilities. Like its simpler cousin, EIST can affect both your CPU's voltage and its multiplier - however, it has many more levels of configuration. Instead of a simple "slow or fast" setting, SpeedStep can utilize all of the available multipliers. In our example case, EIST will allow your processor to run with a multiplier of 6, 7, 8, or 9, and chooses which one to use based on how much demand your CPU is under. EIST is controlled by Windows, and utilizes the different "power schemes" you may have seen in your control panel.

While "High Performance" is probably the best setting for a database server, I'm fairly certain EIST and/or C1E caused the CPUs to underperform even though they should have returned to normal settings when the load increased substantially. The big caveat to me is: what counts as a substantial load? According to the overclock.net post, EIST uses those "power schemes" settings to decide how to manipulate your CPU, but there's no indication of what percentage of load, or for how long, triggers the CPUs to return to normal voltage.
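To make the mechanics in the quote concrete, here is a toy model of multiplier selection. The FSB value, the multiplier range, and the load thresholds are all assumptions for illustration -- the real policy is exactly what's undocumented:

```python
# Toy model of EIST-style multiplier selection. The 333 MHz FSB, the
# 6x-9x multiplier range, and the load thresholds are all assumptions
# for illustration; they are not published Intel policy.
FSB_MHZ = 333
MULTIPLIERS = [6, 7, 8, 9]

def pick_multiplier(load_pct):
    """Map CPU load to a multiplier, the way EIST conceptually does."""
    if load_pct < 25:
        return MULTIPLIERS[0]   # idle: lowest multiplier, lowest vCore
    elif load_pct < 50:
        return MULTIPLIERS[1]
    elif load_pct < 75:
        return MULTIPLIERS[2]
    return MULTIPLIERS[-1]      # heavy load: full clock speed

for load in (10, 40, 60, 90):
    mult = pick_multiplier(load)
    print(f"{load:3d}% load -> {mult}x * {FSB_MHZ} MHz = {mult * FSB_MHZ / 1000:.1f} GHz")
```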

Again, I'm by no means an expert on Intel CPUs, but I would wager that adjusting these two settings might get you both the power savings you want and the performance you should get; still, sticking with the "High Performance" setting is just as effective, without the need to reboot.

osij2is

The fast answer is: Of course power saving will affect performance.

The longer answer is no fun. Basically, try a setting, test performance, and decide what you can live with.

Applications and systems are so very complicated that there is no cut-and-dried answer here, other than "yes, reaction time and other system speeds will be affected." Whether that slowdown even matters next to the latency of the hard drive, or the network -- well, you get the idea. Test in reality.

Michael Graff

I always try to virtualize as many servers as I can, but where I have to run a server on bare metal, it's usually because I need or want totally consistent performance. So for these business-critical machines I NEVER switch on anything power-saving related whatsoever, for exactly the reasons you're experiencing.

*bang-goes-my-green-credentials*

Chopper3

A few things:

  1. Check in the BIOS to make sure that power management is under OS control. It's possible that it's set to be managed by the firmware, and is therefore using dumb, suboptimal processor power management.

  2. Check to see if there are any power management-related hotfixes that you might be missing. There were quite a few notable ones back in the day when Vista/Server 2008 came out.

  3. Check the detailed configuration for Balanced. It's possible that another power-saving feature is causing the reduced performance. In theory, the performance hit from EIST should be negligible; then again, a SQL database has a unique footprint, and it's conceivable that it gets disproportionately affected by processor power management. (A sketch of inspecting these settings from the command line follows this list.)
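As promised above, here is a rough sketch of checking (and optionally tweaking) the Balanced plan's processor settings. It shells out to Windows' standard powercfg utility from Python; SCHEME_BALANCED, SUB_PROCESSOR, and PROCTHROTTLEMIN are standard powercfg aliases, but confirm them on your system with `powercfg /aliases`:

```python
# Sketch: inspect/adjust the Balanced plan's processor settings by
# shelling out to Windows' powercfg utility. Verify the alias names
# with `powercfg /aliases` before relying on them.
import subprocess

def powercfg(*args):
    out = subprocess.run(["powercfg", *args],
                         capture_output=True, text=True, check=True)
    return out.stdout

# Show the Balanced plan's current processor power management values.
print(powercfg("/query", "SCHEME_BALANCED", "SUB_PROCESSOR"))

# One possible middle ground: keep Balanced but raise the minimum
# processor state to 100% on AC power, so the CPU stays at full clock.
powercfg("/setacvalueindex", "SCHEME_BALANCED", "SUB_PROCESSOR",
         "PROCTHROTTLEMIN", "100")
powercfg("/setactive", "SCHEME_BALANCED")  # re-apply so the change takes effect
```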

Bigbio2002

Some information from Microsoft (Word Doc format, unfortunately)

Improve Energy Efficiency and Manage Power Consumption with Windows Server 2008 R2

Windows Server 2008 is more energy efficient overall than its predecessor, Windows Server 2003. By default, Windows Server 2008 runs the “Balanced” power savings plan, which aims to keep performance high while saving power whenever possible. This means that Windows Server 2008 uses less power than does a baseline installation of Windows Server 2003. Because the “Balanced” mode maximizes out-of-the-box (OOB) power efficiency, Microsoft highly recommends leaving the default “Balanced” settings selected in most cases.

Windows Server 2008 includes two additional default modes, “Power Saver” and “High Performance,” which have different power and performance goals and may be appropriate in some situations. The “High Performance” mode may be appropriate for servers that run at very high utilization and need to provide maximum performance, regardless of power cost. The “Power Saver” mode can be used for little-utilized servers that have more performance capability than they really need; using “Power Saver” in this situation may provide incremental power savings.

These particular hardware-level CPU power saving features are the same under any OS, of course; it's just a question of whether or not you turn them on.

The power savings graph of no CPU power management, versus CPU power management:

We're clear (and this graph shows) that at high utilization levels, CPU power management effectively turns itself off. What I'm not clear on, however, is whether at low utilization levels there is an impact on overall server performance, e.g. turnaround time on simple-ish SQL Server queries.

Jeff Atwood
  • It seems a touch naive to believe a white paper written by Microsoft about one of their own products. The aims of the feature may be correct, but the actual figures may and will vary. – gekkz Dec 14 '09 at 18:20
  • don't believe what? that **"balanced" is on by default**? That's a fact.. install Windows Server 2008 R2 if you don't believe me. – Jeff Atwood Dec 14 '09 at 18:26
  • @Jeff Atwood: I think he might have meant the actual power savings results of the graph. It might be in their interest to show the statistics in a way that shows the most possible power savings for marketing reasons. For instance, the percentage of maximum wattage seems a little odd, why not just put actual watts saved? However, as a disclaimer, there could be a very good reason they chose that metric (if that is even the right word). – Kyle Brandt Dec 14 '09 at 18:38
  • the power savings (a modest 10% per the graph) of enabling CPU power management are indisputable.. you can use a Kill A Watt and your own home PC to verify that. It's actually higher on a home PC, since they tend to have fewer drives, memory modules, RAID controllers, etc. sucking down power. – Jeff Atwood Dec 14 '09 at 18:39
  • @KyleBrandt -- I suspect Percent of maximum wattage was used to make the findings more generic. This way you apply the findings to your machine (that draws 400w at full bore, compared to a XXX-watt server in the paper). – EricB Oct 31 '13 at 15:12

When you're talking about performance on a server, there are a few different ways of looking at it. There's the apparent response time (similar to network latency) and the throughput (similar to network bandwidth).

Some versions of Windows Server ship with the Balanced power settings enabled by default; as Jeff pointed out, Windows 2008 R2 is one of them. Very few CPUs these days are single-core, so this explanation applies to almost every Windows server you will run into, with the exception of single-core VMs (more on those later).

When the Balanced power plan is active, the CPU attempts to throttle back how much power it's using. The way it does this is by disabling half of the CPU cores in a process known as "parking". Only half of the CPUs will be available at a time, so the system uses less power during times of low traffic. This isn't a problem in and of itself.

What IS a problem is the fact that when CPUs are unparked, you've doubled the CPU cycles available to the system and suddenly unbalanced its load, taking it from (for example) 70% utilization to 35% utilization. The system looks at that and, after the burst of traffic is processed, thinks, "Hey, I should dial this back a bit to conserve power." And so it does.

Here's the bad part. In order to prevent an uneven distribution of heat and power within the CPU cores, the system has a tendency to park the CPUs that haven't been parked recently. And in order for that to work properly, the CPU needs to flush everything from the cores' caches (L1, L2 & L3) to some other location (most likely main memory).

As a hypothetical example, let's say you have an 8-core CPU with cores C1-C8.

  • Active: C1, C3, C5, C7
  • Parked: C2, C4, C6, C8

When this happens, all of them become active for some period of time, and then the system will park them as follows:

  • Active: C2, C4, C6, C8
  • Parked: C1, C3, C5, C7

But in doing so, there's a good amount of overhead associated with flushing all of the data from the L1-L3 caches, so that programs whose state was flushed from the CPU pipeline don't hit weird errors.

There's likely an official name for it, but I like to explain it as CPU thrashing. Basically the processors are spending more time doing busy work moving data around internally than they are fielding work requests.

If you have any kind of application that needs low latency for its requests, you need to disable the Balanced Power settings. If you're not sure if this is a problem, do the following:

  1. Open up the "Task Manager"
  2. Click the "Performance" tab.
  3. Click "Open Resource Monitor"
  4. Select the "CPU" tab
  5. Look at the right side of the window at the various CPUs.

If any of them are getting parked, you'll notice that half of them are parked at any given time, then they'll all fire up, and then the other half get parked. It alternates back and forth. Thus, the system's CPUs are thrashing.
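If you'd rather watch for this from a script than eyeball Resource Monitor, here's a rough sketch. Parked cores simply show up as near-idle; the 25% threshold is an arbitrary assumption:

```python
# Watch per-core utilization to spot the alternating "half parked,
# half active" pattern described above. Parked cores tend to sit near
# 0% while the active half does the work. psutil is a cross-platform
# library; the 25% threshold is an arbitrary assumption.
import psutil

for _ in range(30):  # sample for ~30 seconds
    per_core = psutil.cpu_percent(interval=1, percpu=True)
    busy = [i for i, pct in enumerate(per_core) if pct > 25]
    quiet = [i for i, pct in enumerate(per_core) if pct <= 25]
    print(f"busy cores: {busy}  near-idle cores: {quiet}")
    # If the two sets swap membership back and forth under a steady
    # workload, you may be watching core parking rotation in action.
```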

Virtual Machines: This problem is even worse when you're running a virtual machine, because there's the additional overhead of the hypervisor. Generally speaking, in order for a VM to run, the hardware needs to have a slot in time available for each of its cores in each timeslice.

If you have a 16-core piece of hardware, you can run VMs using more than 16 total cores, but for each timeslice, only up to 16 virtual CPUs will be eligible, and the hypervisor must fit all of a VM's cores into the same timeslice. They can't be spread out over multiple timeslices. (A timeslice is essentially a set of X CPU cycles. It might be 1,000 or it might be 100k cycles.)

Ex: 16-core hardware with 8 VMs. Six have 4 virtual CPUs (4C) and two have 8 virtual CPUs (8C).

  • Timeslice 1: 4x4C
  • Timeslice 2: 2x8C
  • Timeslice 3: 2x4C + 1x8C
  • Timeslice 4: 1x8C + 2x4C

What the hypervisor cannot do is give half of a timeslice's allotment to the first 4 vCPUs of an 8-vCPU VM and then, on the next timeslice, give the rest to the other 4 vCPUs of that VM. It's all or nothing within a timeslice.
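Here's a small sketch of that all-or-nothing constraint, using the example numbers above. The first-fit packing policy is an illustrative assumption, not how any particular hypervisor actually chooses VMs:

```python
# Sketch of the all-or-nothing ("gang") scheduling constraint described
# above: a VM's vCPUs must all fit in the same timeslice. VM sizes follow
# the example (six 4-vCPU VMs, two 8-vCPU VMs, 16-core host); the greedy
# first-fit policy is an illustrative assumption.

HOST_CORES = 16
vms = [4] * 6 + [8] * 2  # vCPU count per VM

def schedule(vms, cores):
    """Greedily pack whole VMs into timeslices; no VM is ever split."""
    pending = list(vms)
    timeslices = []
    while pending:
        free, slice_vms = cores, []
        for vm in list(pending):
            if vm <= free:          # the whole gang fits, or it waits
                slice_vms.append(vm)
                pending.remove(vm)
                free -= vm
        timeslices.append(slice_vms)
    return timeslices

for i, ts in enumerate(schedule(vms, HOST_CORES), 1):
    print(f"Timeslice {i}: {ts}  (cores used: {sum(ts)}/{HOST_CORES})")
```

Run it and you'll see that every timeslice sums to at most 16 cores, and no VM is ever split across slices.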

If you're using Microsoft's Hyper-V, the power control settings could be enabled in the host OS, meaning they will propagate down to the guest systems, thus impacting them as well.

Once you see how this works, it's easy to see how using the Balanced power control settings causes performance problems and sluggish servers. One of the underlying issues is that an incoming request needs to wait for the CPU parking/unparking process to complete before the server can respond, whether that request is a database query, a web server request, or anything else.

Sometimes, the system will park or unpark CPUs in the middle of a request. In these cases, the request will start into the CPU pipeline, get dumped out of it, and then a different CPU core will pick up the process from there. If it's a hefty enough request, this might happen several times over the course of the request, changing what should have been a 5-second database query into a 15-second database query.

The biggest thing you're going to see from using Balanced Power is that the systems are going to feel slower to respond to just about every request you make.

Mike Taber

You should never, ever resort to using the Windows power settings or the BIOS SpeedStep feature that comes with Intel processors (there's also an AMD equivalent). These can cause issues: I've seen cases where, with SpeedStep enabled, the CPU clock would keep bouncing up and down erratically even though CPU resource usage was consistent.

If you want to be greener and save power, use low-power processors, designated by the L prefix in the model name, such as Intel's L54XX and L55XX series.

EDIT: I'm sorry if I gave the impression that this feature will always fail, I've just been burned by it, and in a mission critical system I can't have this sort of stuff happen, so I just try to stay away from it.

gekkz
  • I don't think it causes issues, or it wouldn't be on by default in almost every modern OS and CPU. You might be thinking of much older variants of CPU power saving technologies, perhaps a few years ago? – Jeff Atwood Dec 14 '09 at 18:03
  • It seems that you assume that everything that is a feature works properly. I'm actually talking about first-hand experience with a dual E5520 setup which had that issue, in two different servers. – gekkz Dec 14 '09 at 18:17
  • @gekkz I don't think anyone is saying that it's impossible for it to cause issues, but your answer suggests that it always, or nearly always, causes issues, which is simply untrue. If it were, thousands (millions?) of servers out there would be having issues caused by this right now. – phoebus Dec 14 '09 at 19:22