43

I'm having a performance problem in a site we've made, and I'm not exactly sure how to start diagnosing it.

The short description is: we have a very small site (http://hearablog.com) with very little traffic on a crappy dedicated server. CPU is always very high, sometimes staying at 100% for minutes, and w3wp.exe is taking most of it. In a typical scenario, w3wp.exe takes 60% and SQL Server takes about 30%. Our DB is pretty small too.

Long description and more details:

  • The site is hosted in a very crappy server by Cari.Net. From the beginning we had the feeling that the server didn't quite behave correctly, like some things would take just too long, so this could be a configuration problem from the get go. It may also be that we are getting a virtual server while we're supposed to have a dedicated one, although we have no evidence that'd indicate this, except for the fact that the server tends to be quite slow.

  • The server is Windows 2008 Standard 64-bit, with SQL 2008 Express

  • Hardware is a Celeron 2.80 GHz, 1 GB RAM

  • The website is developed in ASP.Net MVC, using Entity Framework for data access.

  • Now, this is pretty crappy hardware, but I've had other servers with these guys, with equivalent (or worse) hardware, and performance is much better than on this one. That said, the other servers have W2003 and SQL2005, and I'm using ASP.Net "WebForms" 2.0 there, with no MVC, no LINQ, no EF; so I'm not sure whether going to 2008 and the other stuff means a big performance penalty is expected.

  • I'm serving MP3 files (5-20 MB) regularly, which is a slightly unusual load. Maybe that is causing some kind of problem?
    Would that cause w3wp to use a lot of CPU?

  • Disk usage seems very low. Memory is usually around 90%, but disk usage seems to indicate it's not paging much.

  • I get tons of e-mails every day about SQL timeouts, for queries taking over 30 seconds, although all our queries are pretty straightforward (or should be, but EF may be screwing it up).

This is what resource monitor looks like in one of these "sprints" of 100% CPU, in case there's anything useful there.

[Screenshot: Resource Monitor during a 100% CPU spike]

And a snapshot of some performance counters: [Screenshot: performance counters]

Now, what confuses me very much is that CPU usage of w3wp is just so high. It shouldn't be doing much really... So my questions are...

  • Is there any way of finding out "what" it is doing? Maybe even profile it?
  • Any performance counters I should be looking at?
  • Is this to be expected given this hardware/software configuration?
  • If this could be caused by some kind of configuration failure, where would you start looking?

Thank you VERY much.
Daniel Magliola


7 Answers

42

You can also use the Worker Processes UI inside IIS Manager to inspect the requests that are currently executing and see where, if anywhere, they are getting stuck. Open IIS Manager -> click the server in the tree -> double-click the Worker Processes icon -> double-click the worker process that is consuming CPU. This shows the currently executing requests in real time, so you can see which module is taking the time.

Also consider using Failed Request Tracing to record the time spent per request and see where the slow ones are taking so long.

  • 2
    This is promising, it actually sounds like EXACTLY what I want to see, but actually those screens show empty. It's apparently showing only requests that are taking longer than a second, according to the big sign on top, and none of our requests are evidently, because the list is empty. Any ideas on how to make it show more requests? How to lower the 1s filter? Thanks! – Daniel Magliola Oct 29 '10 at 10:51
  • 1
    You can type 0 in the filter and click Go, that will set it to 0 seconds. Also, you could run from an elevated command prompt "%windir%\system32\inetsrv\appcmd.exe list requests" – Carlos Aguilar Mares Oct 29 '10 at 15:11
  • 1
    Thank you very much Carlos! This is what I ended up doing to find the one request (A cron we have) that was killing my server every 5 minutes (it took 3.5 minutes to run, so it was almost constantly at 100% CPU). Thanks!!! – Daniel Magliola Oct 31 '10 at 23:30
  • 1
    This UI told me what URL had been accessed; unfortunately it's a POST to an asmx webservice, and that data isn't available. (headbang) – Ross Presser Jan 22 '16 at 15:49
5

Ok, to start - the server is REALLY crappy. But it SHOULD be enough.

  • For virtualization, check your drivers. I know of no virtualization platform that hides the CPU (and I doubt someone puts up Hyper-V or ESX on a Celeron), but the drivers for disk etc. are an indication.

  • CPU should not be that high. Sadly, with this RAM, you are pretty much toast - if you start adding a profiler you pretty much will blow the memory you have.

I would:

  • Check the logs for stuff executing at this moment.
  • Upgrade the OS to 2008 R2 - a LOT more information is available there.
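One place to start with "check the logs" is the IIS W3C log files, which can record how long each request took. Below is a minimal sketch in Python, assuming the logs use the W3C extended format with the cs-uri-stem and time-taken fields enabled (time-taken is in milliseconds). The function name `slowest_urls` and the sample URLs are made up for illustration; they are not taken from the site in question.

```python
from collections import defaultdict


def slowest_urls(log_lines, top=5):
    """Rank URLs by total time-taken (ms) across W3C extended log lines."""
    fields = []
    totals = defaultdict(lambda: [0, 0])  # url -> [hits, total_ms]
    for line in log_lines:
        line = line.strip()
        if line.startswith("#Fields:"):
            # The header names the columns of the data lines that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if not line or line.startswith("#") or not fields:
            continue
        row = dict(zip(fields, line.split()))
        if "cs-uri-stem" not in row or "time-taken" not in row:
            continue
        try:
            ms = int(row["time-taken"])
        except ValueError:
            continue  # skip malformed lines rather than abort
        entry = totals[row["cs-uri-stem"]]
        entry[0] += 1
        entry[1] += ms
    ranked = sorted(totals.items(), key=lambda kv: kv[1][1], reverse=True)
    return [(url, hits, total_ms) for url, (hits, total_ms) in ranked[:top]]


# Hypothetical sample data, only to show the expected input shape.
SAMPLE_LOG = [
    "#Fields: date time cs-method cs-uri-stem sc-status time-taken",
    "2010-10-27 13:00:01 GET /episodes/1.mp3 200 900",
    "2010-10-27 13:00:05 GET /cron/refresh 200 210000",
    "2010-10-27 13:01:10 GET / 200 120",
    "2010-10-27 13:05:05 GET /cron/refresh 200 205000",
]

for url, hits, total_ms in slowest_urls(SAMPLE_LOG):
    print(f"{total_ms:>8} ms  {hits:>3} hits  {url}")
```

Keep in mind that a long time-taken on a large download (such as an MP3) may simply reflect a slow client connection rather than CPU work, so the ranking is a list of suspects, not proof.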

For testing:

  • In your dev environment make a copy of the site and run some performance tests.
  • Do profiling there.
  • Use Failed Request Tracing to find out which requests fail.

http://learn.iis.net/page.aspx/266/troubleshooting-failed-requests-using-tracing-in-iis-7/

is a starting point. This may give you a hint in case the problems are more - hm - "categorizable".

I would also keep longer-term performance logs. Watch your IO: the seconds-per-read and seconds-per-write counters are pretty much the only relevant ones, and everything else is too vague IO-wise. Once your IO starts taking longer than it should, the disks are falling behind.

I would rule out a configuration issue at this point, at least as the primary cause. Something is using up your w3wp resources; now you need to find out what it is.

In general, this is not a server I would love having as a physical machine - it is so small that it makes no sense IMHO to have a machine there. Virtual would be better ;)

TomTom
  • thank you very much for your answer. some questions: Which logs would you check for stuff executing at this moment? (sorry if this is a newbie question) - Upgrade OS: We might try that, but I'm afraid it may break stuff, maybe, how safe is this? - Dev environment: The problem is that in my dev environment it works fine. CPU is negligible, requests don't fail, etc. – Daniel Magliola Oct 27 '10 at 13:26
  • As for I/O logs: I just added the counters you mention, and they are all at 0 while the CPU is high. I just added a screenshot of some performance counters I'm looking at. I know a snapshot doesn't tell the whole story, but those values tend to be pretty stable. Do you think the number of current connections (which I have no explanation for) could be a problem? Any ideas on how to figure out what those connections are requesting/doing? Any other counters you think may be useful for diagnosing something like this? – Daniel Magliola Oct 27 '10 at 13:29
  • Well, R2 is quite safe. I upgraded everything and never got a problem. Anyhow... this is a CPU issue and nailing it can be terrible, especially given that you don't have enough RAM to install a profiler. I would actually attempt a complete reinstall. Yes, sucks, but it means you could install R2 fresh and see whether the problem persists. Bad thing is that you don't have a reserve system, so you cannot identify whether the problem is "local" or more general. Alternatively: stop IIS, wipe all temp folders in use, also – TomTom Oct 27 '10 at 13:32
  • for compilation and see what happens when you restart. With R2 you could see if / what files are kept open by IIS. Is this local to one web application, or is it also there if all websites are stopped? Next thing to try - turn off all sites and find out which one breaks things. – TomTom Oct 27 '10 at 13:33
  • Finally, the problem with virtual servers is that, as far as I've found, we end up paying more or the same for the same hardware, plus, the bandwidth bill is killer (keep in mind we serve audio files). We will be moving into a bigger server if we have to, but honestly, with the traffic we have, there must be some problem, we shouldn't be using 100% CPU ever.. – Daniel Magliola Oct 27 '10 at 13:40
  • Just deleted the "Temporary ASP.Net files", is that what you meant? I'm giving it a bit of time to stabilize and see whether that made any change, but I wouldn't put too much hope on that, this has been happening ever since we had our server a year ago (and getting worse over time) - As for which site breaks things, we only have one site in the server, with one database, that's it, so it's definitely this one. – Daniel Magliola Oct 27 '10 at 13:56
  • Me neither. I am more inclined to say that something is broken in the operating system. Someone serving lots of stuff should have 2-3 servers anyway (where larger hardware + virtualization possibly comes in handy), allowing you to update web servers without downtime... and for cases like that. – TomTom Oct 27 '10 at 14:03
  • And virtual servers allow you to assign some more GB to it for some time (profiling, just to see what happens). I run all my operations from a number of larger boxes using Hyper-V and I really never want to go back to physical servers for anything but dedicated high performance databases ;) – TomTom Oct 27 '10 at 14:04
4

You could try using a program called Process Explorer to monitor individual threads running under the w3wp process. It should allow you to see what thread is causing all the damage.

Joe Phillips
3

I had really great luck using Microsoft's Debug Diagnostic Tool to dump my w3wp process and then check out the threads and stack traces for things that were locking up. It'll even tell you the requested page that spawned the thread which is SUPER nice.

http://www.microsoft.com/en-us/download/details.aspx?id=26798

jocull
1

I agree with TomTom down the line, especially about getting better mileage from a Virtual at this point. Debugging/profiling locally to narrow down the problem is the right thing to do.

I am going to put on my Carnac the Magnificent hat and cape and ask for the first envelope. Ram Rebellion. What do you get when you put the OS, ASP.NET, and a greedy SQL Server Express into 1 GB?

I believe that your issue is that SQL Server Express is pulling in all available RAM for a Buffer Pool and being slow to release it. See http://support.microsoft.com/kb/321363 for more information. Also, IIS has a default cache of 256MB which you may need to tweak (https://stackoverflow.com/questions/2853135/controlling-asp-net-output-cache-memory-usage). Debug Diagnostics is a great tool for troubleshooting this (ok, probably a sledgehammer).

http://technet.microsoft.com/en-us/library/bb742546.aspx is a pretty decent article to look at. http://social.technet.microsoft.com/forums/en-US/sharepointadmin/thread/706c653a-16b0-4696-85ee-9ae3552a582e points to app pool recycling gone mad as another possible issue.

Larry Smithmier
1

Use Perfmon's "Process" counters to see the individual attributes of the w3wp.exe process. How much of the CPU time for the worker process is kernel time? High kernel times could be indicative of paging, but you say you're not convinced. Another possibility is duff drivers. The worker process has 23 threads active, which is good, but what are they doing? Try Sysinternals' Process Explorer to dig around a bit more; you can also see what TCP/IP connections are in play. I haven't used SQL Express, but does it have memory tuning parameters, like its big brother? Is SQL starving IIS of memory, causing excessive paging?

Simon Catlin
  • Let's see if I'm doing this right... I added the %processor time, and %user time counters, both for w3wp process, and they both match each other perfectly all the time. Does that mean there's no kernel time, or am I looking at this the wrong way? (sorry, i'm a newbie at this) – Daniel Magliola Oct 29 '10 at 11:10
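On the question in the comment above: for a process, % Processor Time is the sum of % User Time and % Privileged Time (kernel time), so if the first two match, the kernel share is roughly zero. Here is a minimal sketch of that arithmetic in Python; the function name `privileged_share` and the sample values are hypothetical, not readings from this server. Perfmon also exposes a "% Privileged Time" counter on the Process object that can be added directly instead.

```python
def privileged_share(processor_pct, user_pct):
    """Estimate kernel (privileged) time per sample as processor minus user."""
    if len(processor_pct) != len(user_pct):
        raise ValueError("sample lists must have the same length")
    # Clamp at zero: sampling jitter can make user slightly exceed total.
    return [max(0.0, p - u) for p, u in zip(processor_pct, user_pct)]


# Hypothetical samples where both counters track each other exactly.
total_samples = [62.0, 58.5, 97.0]
user_samples = [62.0, 58.5, 97.0]

print(privileged_share(total_samples, user_samples))
```

If the result stays near zero, the worker process is burning CPU in user-mode code (the application itself) rather than in paging or drivers.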
-1

It might not be totally related, but check whether you are using NOLOCK in your queries. It might help with the SQL timeouts.