29

Sometimes I have to answer support calls about PCs crashing with blue screens. How can I effectively narrow down the problem given the information on that screen? What are the most important questions to ask the user?

Edit: By "diagnose" I mean, how can I interpret the information on the blue screen in order to narrow down the cause of the problem?

splattne

7 Answers

23

When the computer bluescreens, it will most likely create a memory dump. The contents of memory are written to the pagefile as the system goes down. The pagefile serves as a placeholder for the data because creating a new file on disk at that point is too dangerous.

When the machine starts up again, it detects the dump and moves the data into a separate dump file (typically C:\Windows\Memory.dmp or C:\Windows\Minidump\*.dmp).
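To check quickly whether a machine actually has dump files in those default locations, something like the following could be used. It is only a sketch: the function name `find_dump_files` is made up for illustration, and it assumes the default paths mentioned above (the dump location can be changed in the system's startup and recovery settings).

```python
from pathlib import Path


def find_dump_files(windows_dir):
    """Return crash dump files under a Windows directory, newest first.

    Looks for the full dump (Memory.dmp, matched case-insensitively) and
    for *.dmp files in the Minidump subfolder. Both are default locations
    and may differ on a given machine.
    """
    root = Path(windows_dir)
    if not root.is_dir():
        return []

    candidates = [
        p for p in root.iterdir()
        if p.is_file() and p.name.lower() == "memory.dmp"
    ]

    minidump_dir = root / "Minidump"
    if minidump_dir.is_dir():
        candidates.extend(
            p for p in minidump_dir.iterdir()
            if p.is_file() and p.suffix.lower() == ".dmp"
        )

    # Newest first, so the most recent crash is at the top of the list.
    return sorted(candidates, key=lambda p: p.stat().st_mtime, reverse=True)


if __name__ == "__main__":
    for dump in find_dump_files(r"C:\Windows"):
        print(dump, dump.stat().st_size, "bytes")
```

Reading C:\Windows\Minidump usually requires administrator rights, so the script may need to run from an elevated prompt.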

Install WinDbg and open the .dmp file. Click the !analyze -v link. WinDbg then shows the stack of the thread that crashed Windows and the files that were involved. Often it will point you directly at a specific driver file. You can find step-by-step instructions here.
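If you have many dumps to go through, the same analysis can probably be run without the GUI. The sketch below assumes that the command-line kernel debugger `kd.exe` from the Debugging Tools for Windows is installed and on the PATH, and that a symbol path has already been configured (for example through the `_NT_SYMBOL_PATH` environment variable). The helper names are made up for illustration.

```python
import subprocess


def build_analyze_command(dump_path, debugger="kd.exe"):
    """Build a debugger command line that analyzes one dump and exits.

    -z opens a crash dump file; -c runs the given debugger commands at
    startup. "!analyze -v" prints the verbose crash analysis and "q"
    quits the debugger afterwards.
    """
    return [debugger, "-z", str(dump_path), "-c", "!analyze -v; q"]


def analyze_dump(dump_path, debugger="kd.exe"):
    """Run the debugger on a dump file and return its text output.

    Assumes the debugger executable can be found; raises
    FileNotFoundError otherwise.
    """
    result = subprocess.run(
        build_analyze_command(dump_path, debugger),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

In the output, the lines that name a probable cause (typically a driver file) are usually the quickest pointer to the culprit.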

I can recommend reading Mark Russinovich's blog and books. You can download WinDbg from Microsoft.

So the question to the user is: "Can you e-mail me your dump file?"

Fidel
Frode Lillerud
9

Mark Russinovich (of SysInternals fame) has an excellent blog entry where he describes how one can use the debugging tools to track down the module name and even the stack frame (i.e. function call) during which the blue screen occurred.

It's illustrated, well written, and has helped me get my feet under me when I started learning how to debug Blue Screen messages.

Shalom Craimer
6

Look at the error code in the top left. By googling it, you can often tell whether it's a hardware or software issue. Proceed from there (the Google results).

Mark S. Rasmussen
3

If they still have the bluescreen open: the actual message near the top (e.g. DRIVER_IRQL_NOT_LESS_OR_EQUAL) and the error code at the bottom (0x.......), along with the module that crashed (e.g. nv4_disp.dll).

There are some common approaches here. In my example, the bluescreen was caused by the nVidia graphics driver. Once you have analyzed a few bluescreens, you will notice that certain messages, codes and modules pop up regularly, so over time you should be able to narrow down issues more easily simply through experience.
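When the user reads the code out over the phone or pastes it into an e-mail, a small helper can turn it into something easier to search for. This is a rough sketch that assumes the classic `*** STOP: 0x... (param, param, param, param)` layout of older bluescreens. The lookup table contains only a handful of well-known bug check codes and is far from complete, and the function name is made up for illustration.

```python
import re

# A few well-known bug check codes; deliberately incomplete.
KNOWN_BUG_CHECKS = {
    0x0000000A: "IRQL_NOT_LESS_OR_EQUAL",
    0x0000001A: "MEMORY_MANAGEMENT",
    0x00000050: "PAGE_FAULT_IN_NONPAGED_AREA",
    0x0000007B: "INACCESSIBLE_BOOT_DEVICE",
    0x0000007E: "SYSTEM_THREAD_EXCEPTION_NOT_HANDLED",
    0x000000D1: "DRIVER_IRQL_NOT_LESS_OR_EQUAL",
}

# Matches "STOP: 0x000000D1" optionally followed by "(p1, p2, p3, p4)".
STOP_RE = re.compile(r"STOP:\s*(0x[0-9A-Fa-f]+)\s*(?:\(([^)]*)\))?")


def parse_stop_line(text):
    """Extract the stop code and its parameters from bluescreen text.

    Returns None when no "STOP: 0x..." pattern is found.
    """
    match = STOP_RE.search(text)
    if match is None:
        return None
    code = int(match.group(1), 16)
    raw_params = match.group(2) or ""
    params = [int(p, 16) for p in re.findall(r"0x[0-9A-Fa-f]+", raw_params)]
    return {
        "code": code,
        "name": KNOWN_BUG_CHECKS.get(code, "unknown"),
        "parameters": params,
    }


if __name__ == "__main__":
    line = "*** STOP: 0x000000D1 (0x0000000C, 0x00000002, 0x00000000, 0xF86B5A89)"
    print(parse_stop_line(line))
```

The symbolic name together with the module name is usually a better search term than the raw hex code.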

Michael Stum
1

These are the things I look for, since (1) the PC that bluescreens is normally my internet connection, and (2) bluescreens flash by too fast to read, even for an experienced user like me. So I rely heavily on questions.

  1. First, have you changed any hardware lately?
  2. Have you installed any new software?
  3. Importantly, can you get in via safe mode?

It goes without saying that if the answer to question three is yes, you should undo whichever of one and two is yes. If both one and two are yes, handle the undo stepwise: undo one, then test before undoing the other.
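The rule above can be written down as a small decision helper. This is only a sketch: the function name and step wording are made up, and the order (software before hardware) is an arbitrary assumption, since the only requirement is to undo one change and test before undoing the other. The case where safe mode does not work is not covered above, so the helper returns an empty plan for it.

```python
def plan_rollback(changed_hardware, installed_software, safe_mode_works):
    """Return an ordered list of steps based on the three questions.

    Each undo step is followed by a test step, so that two recent
    changes are never reverted at the same time.
    """
    if not safe_mode_works:
        # Not covered by the rule above; other diagnostics are needed.
        return []

    undo_steps = []
    if installed_software:
        undo_steps.append("Uninstall the recently installed software")
    if changed_hardware:
        undo_steps.append("Revert the recent hardware change")

    plan = []
    for step in undo_steps:
        plan.append(step)
        plan.append("Reboot normally and check whether the bluescreen returns")
    return plan


if __name__ == "__main__":
    for number, step in enumerate(plan_rollback(True, True, True), start=1):
        print(number, step)
```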

splattne
jake
1

Try checking the Event Viewer. If you don't see anything obvious there (which wouldn't surprise me), give the MS debugging tools a try:

http://www.microsoft.com/whdc/DevTools/Debugging/default.mspx
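For the Event Viewer part, the System log can also be queried from the command line with `wevtutil`, which ships with Windows Vista and later. The sketch below rests on an assumption: that bugcheck records appear in the System log with event ID 1001, as they do on recent Windows versions. Older systems log them differently, so adjust the ID as needed. The helper names are made up for illustration.

```python
import subprocess


def build_bugcheck_query(event_id=1001, max_events=5):
    """Build a wevtutil command listing recent System log events.

    qe       query events from a log
    /q:      XPath filter (here: by event ID, an assumption, see above)
    /c:      maximum number of events to return
    /rd:true newest events first
    /f:text  human-readable output instead of XML
    """
    return [
        "wevtutil", "qe", "System",
        f"/q:*[System[(EventID={event_id})]]",
        f"/c:{max_events}",
        "/rd:true",
        "/f:text",
    ]


def recent_bugcheck_events(event_id=1001, max_events=5):
    """Run the query on the local machine and return the text output."""
    result = subprocess.run(
        build_bugcheck_query(event_id, max_events),
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout
```

If a record is found, its text normally includes the stop code, which gives you something to search for even when no dump file was written.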

l0c0b0x
0

Try running a memory test: intermittent BSODs are often caused by faulty RAM.

Richard Gadsden