7

I've noticed that the CPU usage on our 8-CPU database server, running SQL Server 2008, is not balanced at all.

Here are the one-day averages for a random day a while back; this pattern is typical and consistently asymmetric:

9, 15, 10, 21, 18, 21, 14, 9

(Per-CPU usage graphs omitted here; there was one tall graph per CPU, posted as thumbnails with click-through to full size.)

Compared with our 4-CPU web servers, which are all almost perfectly balanced all the time, that struck me as odd.

Now, this is a dedicated server, so the only thing running on it is SQL Server 2008 (and the built-in full-text indexing, which we use pretty heavily). I'm not sure why the CPU usage would be so asymmetric. Thoughts?
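
For what it's worth, per-scheduler load can also be checked from inside SQL Server via the sys.dm_os_schedulers DMV; a minimal sketch (the scheduler-to-CPU mapping won't necessarily match the graph order above):

    -- Sketch: per-scheduler load as SQLOS sees it (SQL Server 2005+).
    -- scheduler_id < 255 filters out internal/hidden schedulers.
    SELECT scheduler_id,
           cpu_id,
           current_tasks_count,   -- tasks currently assigned to this scheduler
           runnable_tasks_count,  -- tasks waiting for CPU time
           load_factor            -- SQLOS's own estimate of scheduler load
    FROM sys.dm_os_schedulers
    WHERE scheduler_id < 255;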

Jeff Atwood
  • I'm just setting up a SQL Server 2012 environment and am wondering about matters like this. Jeff, could you please advise whether changing your files / filegroups (as per the accepted answer) solved this issue for you? Thanks. – Elliveny Sep 14 '12 at 08:11

5 Answers

9

The scales are different on all of the graphs; apart from a spike on four of them, your averages would all be about 10-25%.

uzbones
  • -1. I listed the averages in the post, and our web-tier servers are ALWAYS within a few percent of each other. For example, right now: web1: 33, 33, 35, 34; web2: 3, 2, 3, 3. – Jeff Atwood Jul 02 '09 at 14:16
  • UZbones - don't you find it odd that some are averaging higher than others? 9% vs 21% is a pretty big variance. My concern is that if the server's load continues to ramp up, we'll hit a bottleneck earlier because the load isn't even. – Brent Ozar Jul 02 '09 at 14:22
  • No, not really. Not all applications are multi-threaded or can use multiple processors, and not all processes are equal or use the same CPU %. Web servers would be more likely to be symmetric, IMO, because most of the worker processes are similar and simpler (probably), but the database has quite a few maintenance processes that run on only one CPU at a time, and not all SQL queries use the same amount of CPU. – uzbones Jul 02 '09 at 15:06
  • I would like to point out, though, that I'm not a DBA. It's certainly possible that you could have an issue if you increase the overall load, given the difference in the averages. However, at the low percentages you're currently at, I don't see a problem with the current variance. – uzbones Jul 02 '09 at 15:09
  • I find it extremely irritating that all you did was regurgitate exactly what I put in the post: 9, 15, 10, 21, 18, 21, 14, 9. Did you not read that line? – Jeff Atwood Jul 03 '09 at 01:49
  • If you really want to know, I probably didn't register that line of your question when I read it. I was just trying to point out that looking at the graphs is misleading, which I assume quite a few people did, since this answer has some up-votes. I'm sorry you're irritated, and the points mean nothing to me, so if you really are upset about it I will delete it; my intention was not to cause any more issues than someone already has. – uzbones Jul 03 '09 at 03:59
  • Also you might want to look into the resource governor. http://blogs.msdn.com/psssql/archive/2008/01/10/sql-server-2008-resource-governor-questions.aspx See the section "CPU Scheduling (Fair Scheduling)" – uzbones Jul 03 '09 at 04:16
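
For reference, a minimal Resource Governor sketch along the lines of that article (Enterprise-only in SQL Server 2008; the pool, group, and login names here are all hypothetical, and the classifier function must live in master):

    -- Hypothetical names: cap a reporting workload at half the CPU.
    CREATE RESOURCE POOL ReportPool WITH (MAX_CPU_PERCENT = 50);
    CREATE WORKLOAD GROUP ReportGroup USING ReportPool;
    GO
    -- Classifier: route a hypothetical 'report_user' login into ReportGroup.
    CREATE FUNCTION dbo.rgClassifier() RETURNS sysname WITH SCHEMABINDING
    AS
    BEGIN
        DECLARE @grp sysname;
        SET @grp = CASE WHEN SUSER_SNAME() = 'report_user'
                        THEN 'ReportGroup' ELSE 'default' END;
        RETURN @grp;
    END;
    GO
    ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.rgClassifier);
    ALTER RESOURCE GOVERNOR RECONFIGURE;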
9

How are your files / filegroups set up?

I'll plagiarize myself:

One more thought on I/O: we were careful to set up our biggest, most often used tables on filegroups with multiple files in them. One of the performance benefits is that SQL Server will thread requests to each file in the filegroup. So if BigOverUsedTable is on FileGroup1, FileGroup1 has four files in it, and your DB server has 8 cores, it will actually use four cores to do "select big number crunching nasty query from BigOverUsedTable", whereas otherwise it will only use one CPU. We got this idea from this MSDN article:

http://msdn.microsoft.com/en-us/library/ms944351.aspx

From TFA:

"Filegroups use parallel threads to improve data access. When a table is accessed sequentially, the system creates a separate thread for each file in parallel. When the system performs a table scan for a table in a filegroup with four files, it uses four separate threads to read the data in parallel. In general, using multiple files on separate disks improves performance. Too many files in a filegroup can cause too many parallel threads and create bottlenecks."

We have four files in our filegroup on an 8 core machine due to this advice. It's working out well.

Edit: this has another (possibly) better answer now. The graphs' scales were off; if you look closely, each processor is actually only about 20% loaded, as uzbones points out.

Edit: we can actually tell that using multiple-file filegroups helps, because we didn't put all our tables in the filegroup with four files. Big queries on the "single file" filegroup only use one CPU, but queries on the table in the four-file filegroup hit 4 CPUs.
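
For anyone wanting to try the same layout, a minimal sketch (database name, filegroup name, file paths, and sizes are all hypothetical):

    -- Hypothetical names/paths: a filegroup with four files, so scans of
    -- tables placed on it can be read by parallel threads.
    ALTER DATABASE MyDb ADD FILEGROUP FG1;
    ALTER DATABASE MyDb ADD FILE
        (NAME = FG1_File1, FILENAME = 'D:\Data\FG1_File1.ndf', SIZE = 1GB),
        (NAME = FG1_File2, FILENAME = 'D:\Data\FG1_File2.ndf', SIZE = 1GB),
        (NAME = FG1_File3, FILENAME = 'D:\Data\FG1_File3.ndf', SIZE = 1GB),
        (NAME = FG1_File4, FILENAME = 'D:\Data\FG1_File4.ndf', SIZE = 1GB)
    TO FILEGROUP FG1;
    GO
    -- Create (or rebuild) the big table on the new filegroup.
    CREATE TABLE dbo.BigOverUsedTable
    (
        Id      INT NOT NULL PRIMARY KEY,
        Payload VARCHAR(100) NULL
    ) ON FG1;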

Kyle
  • There's only one data file and one log file per database. Everything's in the primary filegroup. – Brent Ozar Jul 02 '09 at 14:19
  • Then that's definitely something to change. I don't think it's as simple as adding files to the main filegroup; I don't think SQL Server will auto-migrate data from BigOverUsedTable to the new file. I tried that once; we had to re-create the DB when we did this on the new filegroup. – Kyle Jul 02 '09 at 16:59
  • Yeah, what you do is create the extra new files on the filegroup. For example, if you want to end up with 4 files on the primary filegroup, create another 4, empty out the first one into the other 4 using the DBCC SHRINKFILE command with the EMPTYFILE parameter, then delete that first file. – Brent Ozar Jul 03 '09 at 00:58
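
A sketch of the migration Brent describes (database and logical file names are hypothetical; note that the very first file of the primary filegroup can't actually be removed):

    -- After adding the new files, push the data out of the old file, then drop it.
    USE MyDb;
    DBCC SHRINKFILE (OldDataFile, EMPTYFILE);     -- moves pages to the other files
    ALTER DATABASE MyDb REMOVE FILE OldDataFile;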
6

Check this out:

http://blogs.technet.com/mat_stephen/archive/2005/02/02/365325.aspx

SQL Server might only be writing to a handful of files, with each file being serviced by a particular processor.
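
To see how many data files each database actually has, something like this against the sys.master_files catalog view works (a sketch):

    -- One row per file, per database (SQL Server 2005+).
    SELECT DB_NAME(database_id) AS database_name,
           name                 AS logical_name,
           type_desc,                 -- ROWS or LOG
           physical_name
    FROM sys.master_files
    ORDER BY database_name, type_desc;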

MathewC
  • I like this idea, but there are several databases involved, each with its own data file. I would expect the load to be a little flatter given the number of databases. I still like it, though. – Brent Ozar Jul 02 '09 at 14:34
  • @Brent - I think Mathew is on to something... is it possible that differences in the amount of data being read/written across the different databases account for the variance, even if there are as many databases as there are cores? Splitting those DBs out into additional filegroups may even out the CPU load. – Sean Earp Jul 03 '09 at 15:46
  • @Sean - you're saying that one particular large database would stick to one CPU? Seriously? – Brent Ozar Jul 03 '09 at 20:47
3

The first thing I check for stuff like this is drivers. I've had a lot of problems with network teaming and iSCSI MPIO drivers sticking on specific cores. I bet that's not the issue here, though, since it looks like it's happening across 4 cores; I usually only see that with 2 cores. I'll ask around to see if anybody's seen it this wide.

I've also seen it with NUMA boxes where there's a memory mismatch: say, half the cores are hooked up to 16 GB of RAM and the others are hooked up to 8. Google for IBM x460 NUMA if you want to see some funny info about that. The x460 and related models let you daisy-chain several servers together to create big iron, which is kind of relevant to the scale-up vs. scale-out blog entry. They're awesome machines.
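
If you want to see how SQL Server itself views the NUMA layout, sys.dm_os_nodes is one place to look (a sketch; the dedicated admin connection gets its own hidden node, which this filters out):

    -- NUMA nodes as SQLOS sees them (SQL Server 2008+).
    SELECT node_id,
           memory_node_id,
           online_scheduler_count,  -- roughly, the cores bound to this node
           active_worker_count
    FROM sys.dm_os_nodes
    WHERE node_state_desc <> 'ONLINE DAC';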

Brent Ozar
0

Because flushing CPU caches is so incredibly expensive, the kernel tries to avoid it at all costs by keeping each thread on the core it last ran on.

(Note: Linux, at least, does this; I'd be surprised if Windows didn't have the same behaviour.)

David Pashley