What are the baseline perfmon counters for monitoring SQL Server and what do they mean?
1 Answer
Here are the counters I monitor and why (see the Explain button in perfmon for detailed explanations of what each one does). Note that many of these counters are useless unless you have a baseline to measure against: you must monitor applications before they have performance problems.
Process Object: % Processor Time - If you have growth and this counter averages 80%, it's time to upgrade. A 75-80% average is a fully utilized server. Sustained spikes, particularly with few user connections, could mean a poorly written query or could mean it's time for an upgrade; either way, it's worth having someone peek at the code first.
System Object: Processor Queue Length - this value should be less than 2x the number of CPUs in the system. If it is growing and the CPU % (above) is 80%+, you likely need more or faster processors.
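The two CPU rules above can be sketched as a simple check. This is a minimal illustration of the thresholds, not a monitoring tool; the `cpu_pressure` helper name and the idea of combining the two rules into one flag are my own.

```python
def cpu_pressure(avg_cpu_pct, processor_queue_length, cpu_count):
    """Apply the two rules above: a sustained CPU average of 80%+
    combined with a processor queue longer than 2x the CPU count
    suggests the box is genuinely CPU-bound."""
    high_cpu = avg_cpu_pct >= 80
    long_queue = processor_queue_length > 2 * cpu_count
    return high_cpu and long_queue

# e.g. a 4-CPU server averaging 85% CPU with 10 threads queued
print(cpu_pressure(85, 10, 4))  # True
```

Either condition alone is ambiguous (a long queue with idle CPUs usually points elsewhere), which is why the answer reads them together.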
Memory Object: Pages/sec - In general, adding memory (or allocating more to SQL) should lower this counter. This counter is NOT indicative of a performance issue in and of itself; it is subjective. What is telling is when it rises over time while performance goes down - then it's certainly time to allocate more RAM to SQL Server. As a very general rule, assuming SQL Server is the only app on the box, this number should average 0 over a 24-hour period (with spikes). Below 20 should not really be noticeable from a performance standpoint; over 20 and you likely need more RAM.
Memory Object: Available MBytes - look for consistency over time, not a magic number; in a theoretical perfect world this would be as close to 0 as possible.
PhysicalDisk Object: Avg. Disk Queue Length - a good rule of thumb is no higher than the number of spindles x 2, but this number is subjective.
PhysicalDisk Object: % Idle Time - use 100 minus this value for a more accurate view of % Disk Time (this is an old note of mine; under Windows 2008 this may have been fixed, but there were issues with 2000 and 2003). If you were tuning you would also want to know the ratio of reads to writes; this is just base performance.
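The inversion above is trivial but easy to get backwards when eyeballing graphs; a one-line sketch (the `disk_busy_pct` name is mine):

```python
def disk_busy_pct(idle_time_pct):
    """Per the note above: 100 - (% Idle Time) is a more
    trustworthy '% disk time' than the raw % Disk Time counter
    on older Windows versions."""
    return 100.0 - idle_time_pct

print(disk_busy_pct(92.5))  # 7.5
```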
Network Interface Object: Bytes Total/sec - you are looking for network issues here. If it's abnormally low (say in the 20% range of capacity) and you have decent load, take a look at the network config; likewise, if it's over 60% you might have a misconfiguration or a network bottleneck. 60% should be around 3,000 batch requests/sec.
SQL Server Access Methods Object: Full Scans/sec - if you see many of these, SQL Server indexes are not being used; find the SQL developer and bring a bat (use wood, as the aluminum ones tend to dent). Note that this is relative to the baseline, as not all queries can use an index - but it's certainly worth asking the developer to take a look at whether any additional indexing can be done.
SQL Server Database Object: Transactions/sec - use this to tell the average utilization of the server; it will trend up as performance trends down (note that SQLServer:SQL Statistics: Batch Requests/sec is more accurate, however I've found that transactions/sec is more useful).
SQL Server Buffer Manager Object: Buffer Cache Hit Ratio - as you would guess, higher is better, and more RAM should (but will not necessarily) increase this.
SQL Server General Statistics Object: User Connections - used more for trending than for actual performance; Java-based apps will often pump this up into the thousands due to the way they connect to SQL Server.
SQL Server Locks Object: Average Wait Time - again subjective over time, to indicate performance trends/issues; however, this is also great for individual issues (e.g. why does this report run so slowly?). If this spikes, have a developer look at the code behind that particular report. It might just create a lot of locks, or it might need to be tweaked (so leave the bat at your desk this time).
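The answer's central point is that these counters only mean something against a baseline. A minimal sketch of that idea: store averages from a healthy period, then flag counters drifting well past them. The counter names, the 1.5x drift factor, and the small floor are illustrative choices of mine, not part of the original answer.

```python
# Averages captured during a known-good period (illustrative values)
BASELINE = {
    "Pages/sec": 0.0,
    "Full Scans/sec": 5.0,
    "Transactions/sec": 400.0,
}

def drifted(current, baseline, factor=1.5, floor=1.0):
    """Return counters whose current average exceeds baseline * factor.
    A small floor avoids flagging near-zero baselines on noise."""
    return [name for name, value in current.items()
            if value > max(baseline.get(name, 0.0) * factor, floor)]

sample = {"Pages/sec": 30.0, "Full Scans/sec": 4.0, "Transactions/sec": 900.0}
print(drifted(sample, BASELINE))  # ['Pages/sec', 'Transactions/sec']
```

The interesting output is the trend, not any single number - exactly why the answer says to start collecting before there is a problem.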
- Scrap "Disk Queue Length" (WAY too subjective on hardware, etc.) and basically use sec/IO (Avg. Disk sec/Read, sec/Write) - that directly tells you how fast an IO happens on average, which is non-subjective. SANs may be OK with a queue length of 500, but if seconds go higher than 0.01 (10ms) you have a problem. Depending on hardware it should be single digits of milliseconds, low single digits (not all disks are fast enough for that), or 0 - the latter for SSDs. – TomTom Apr 23 '10 at 10:44
- From a performance standpoint, telling me how fast reads happen will not tell me if they are backed up. Sec/IO will tell me if there are slow reads, but not if there are 300 requests waiting. – Jim B Apr 23 '10 at 11:48
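The latency rule from the comment thread above can be sketched the same way; the `io_latency_ok` helper name is mine, and the 0.010s default mirrors the 10ms figure in the comment.

```python
def io_latency_ok(avg_sec_per_read, avg_sec_per_write, threshold=0.010):
    """Check Avg. Disk sec/Read and sec/Write against the ~10ms rule:
    above the threshold signals a disk problem regardless of queue
    depth. Note the counter-comment: latency alone does not show how
    many requests are queued behind each IO."""
    return avg_sec_per_read <= threshold and avg_sec_per_write <= threshold

print(io_latency_ok(0.004, 0.006))  # True  (healthy spinning disk)
print(io_latency_ok(0.015, 0.006))  # False (reads are too slow)
```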