Hardware interrupts and system unresponsiveness

5

Very occasionally, about once a week, my Windows Vista Business machine will completely lock up for anything between a minute and several minutes. Once this happens, it recurs more frequently until I reboot.

Process Explorer reveals that during this event, the system is busy with "Hardware interrupts & DPCs". The HDD activity LED on my machine also remains lit until the system becomes responsive again, although I cannot hear any of the disks actually scratching.

[Image: Interrupts CPU usage]
In the image above, you can see a lockup event as a spike of the red (interrupt) line. It appears to be short, but this is due to Process Explorer not being able to update the graph while the machine is not responding.

Here's a screenshot of the overall CPU usage; there appear to be a large number of interrupts in general.

I get the impression that my machine is experiencing a higher-than-normal number of interrupts. This leads me to suspect that some piece of hardware or a driver is misbehaving. Or could it be an IRQ conflict?

How can I diagnose this?


Edit #1: A look at the system log reveals several warning messages such as:

An error was detected on device \Device\Harddisk1\DR1 during a paging operation.

And:

Reset to device, \Device\RaidPort0, was issued.

I do not, however, have a RAID configuration set up, and all disks are connected directly to my motherboard's SATA ports.


Edit #2: Following the advice given here, I've made some changes to my rig to try to resolve the problem. I haven't experienced any freezes yet, but will return to either accept an answer or keep diagnosing.

  1. I replaced the SATA cable for my system disk;
  2. I plugged the SATA cable into a different SATA port on my Asus M2N-SLI Deluxe motherboard;
  3. I updated my nForce 570 SLI AMD drivers to nVidia's latest version.

I'm making the assumption here that \Device\RaidPort0 is my system disk. If the problem persists, the next step is to detach my other three disks one by one until the problem disappears. If that doesn't resolve it, I'll get rid of nForce altogether. And after that, it seems it can only be the system disk or my motherboard itself.


Edit #3: After swapping the system disk's SATA port with a different disk's port, I found the following entries in the Event Log after several days:

Reset to device, \Device\RaidPort1, was issued.

And:

A request to this device has been cancelled.

Device: \Device\RaidPort1
Model: ST3160812AS
Firmware Version: 3.AA
Serial Number: 5LS34HQ1
Port: 1

It seems fairly clear to me that the problem is neither the disk nor the SATA cable, as the errors have entirely shifted to a different port. I will consider this SATA port to be broken and exclusively use the other five.
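To make the "which port is it?" comparison less manual, the reset warnings can be tallied per port. The following is only a minimal sketch: it assumes the relevant Event Log entries have been exported to plain text with one message per line (the export step and the sample list are hypothetical), and the function name `count_port_resets` is made up for illustration. It matches only the "Reset to device" message quoted above.

```python
import re
from collections import Counter

# Matches the port-reset warning quoted in the event log excerpts above.
RESET_RE = re.compile(r"Reset to device, \\Device\\RaidPort(\d+), was issued")


def count_port_resets(lines):
    """Return a Counter mapping RaidPort number -> number of reset warnings."""
    counts = Counter()
    for line in lines:
        match = RESET_RE.search(line)
        if match:
            counts[int(match.group(1))] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample input, built from the messages quoted in this post.
    sample = [
        r"Reset to device, \Device\RaidPort0, was issued.",
        r"An error was detected on device \Device\Harddisk1\DR1 during a paging operation.",
        r"Reset to device, \Device\RaidPort1, was issued.",
        r"Reset to device, \Device\RaidPort1, was issued.",
    ]
    for port, n in sorted(count_port_resets(sample).items()):
        print(f"RaidPort{port}: {n} reset(s)")
```

Running the tally before and after swapping cables or ports should show whether the resets follow the disk or stay with the port.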

Paul Lammertsma

Posted 2011-04-08T10:57:32.127

Reputation: 3 508

3

Best place to diagnose this type of problem: http://www.msfn.org/board/topic/140263-how-to-get-the-cause-of-high-cpu-usage-by-dpc-interrupt/

– Moab – 2011-04-08T21:11:40.713

@Moab Thanks, I'm working on getting this set up to do some diagnosing. In the meantime, I've found a related issue on Microsoft Support that appears to be caused by having multiple NICs. I've disabled LMHOSTS lookup, and will see if this resolves anything.

– Paul Lammertsma – 2011-04-08T23:10:09.877

Answers

2

The lit HDD LED is a sign of HDD data transfer. If your disk is set to "silent" mode, you may not hear its activity. It could also be a communication error on the SATA (or IDE) cable.

The Windows Event logs might have something if there are disk errors.

Update:

An error was detected on device \Device\Harddisk1\DR1 during a paging operation.

This indicates a SATA CRC error or timeout. Paging operations are unlikely to be preemptible, so the system hangs for a while.

Reset to device, \Device\RaidPort0, was issued.

The disk did not respond for a while, and Windows reset the SATA port. Since your system resumes operation afterwards, the error condition seems to be temporary.

Have you tried changing SATA cables (take a look at the contacts for corrosion)? If that does not help, I'd try changing the disk.

Turbo J

Posted 2011-04-08T10:57:32.127

Reputation: 1 919

I'll edit the OP with some messages from the Windows Event log. The disk is not set to any sort of silent mode, and I can usually hear it when it seeks. – Paul Lammertsma – 2011-04-08T22:35:52.933

Thanks, I will take a look into this when I get around to actually switching this thing off. – Paul Lammertsma – 2011-04-08T23:12:08.300

You may not hear any seeking noise from the HDD if it is reading linearly or is busy in a way that does not involve seeking. – bwDraco – 2011-04-08T23:37:06.093

@DragonLord You're right: it appears that an HDD is doing something even while Windows is locked up. After this most recent lock up, I've made some changes (see the OP) to my rig. – Paul Lammertsma – 2011-04-11T23:56:37.093

After swapping the SATA ports with another disk, I'm getting warnings on a different port. Both warning messages are still appearing, but now it reports "Reset to device, \Device\RaidPort1, was issued." It appears that one of the SATA ports is not working. At least now I know which one! – Paul Lammertsma – 2011-04-17T14:57:17.410