I'm having a problem with an Adaptec 5805 RAID card
http://www.adaptec.com/en-us/support/raid/sas_raid/sas-5805/
(with two SAS disks in RAID) and a Gigabyte GA-H67A-D3H-B3 motherboard
http://www.gigabyte.com/products/product-page.aspx?pid=3866#sp
running CentOS 6 as a web server.
Short story: when I boot the server, the RAID card runs at full speed, with a transfer rate of over 250 Mb/s. Within 60 minutes at most, I get an IRQ error, IRQ 16 is disabled, and from then on the card manages no more than 2.5 Mb/s (but it still works). I need to fix this so that the card runs at full speed all the time.
Long story:
1] The motherboard doesn't have a PCIe x8 slot to fit the RAID card. I tried the x16 slot, but in that slot the card is not detected at all and the system boots without it. So I used the x4 slot, where the card (to my surprise) works great. Except for the IRQ ...
2] There are two SATA disks connected to the motherboard, each as primary on its channel:
SAMSUNG HD502HJ SAMSUNG HD103UJ
Then there is an additional network card in the first of the normal PCI slots (in the picture at the link above, it's the right-most white PCI slot, next to the "DUAL BOOT" label on the board).
The RAID card is in the PCIe x4 slot (next to those three white PCI slots).
Nothing else is connected: I do not use any USB devices, just two SATA disks, two network ports (onboard and add-in card), and the RAID card with two SAS disks attached.
3] The system is, as I said, CentOS 6:
uname -a
Linux 2.6.32-71.29.1.el6.x86_64 #1 SMP Mon Jun 27 19:49:27 BST 2011 x86_64 x86_64 x86_64 GNU/Linux
CPU is
Intel(R) Core(TM) i7-2600 CPU @ 3.40GHz
lspci -v
00:00.0 Host bridge: Intel Corporation Sandy Bridge DRAM Controller (rev 09)
Flags: bus master, fast devsel, latency 0
Capabilities: [e0] Vendor Specific Information <?>
00:02.0 VGA compatible controller: Intel Corporation Sandy Bridge Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
Subsystem: Giga-byte Technology Device d000
Flags: bus master, fast devsel, latency 0, IRQ 10
Memory at fb400000 (64-bit, non-prefetchable) [size=4M]
Memory at e0000000 (64-bit, prefetchable) [size=256M]
I/O ports at ff00 [size=64]
Expansion ROM at <unassigned> [disabled]
Capabilities: [90] MSI: Enable- Count=1/1 Maskable- 64bit-
Capabilities: [d0] Power Management version 2
Capabilities: [a4] PCI Advanced Features
00:16.0 Communication controller: Intel Corporation Cougar Point HECI Controller #1 (rev 04)
Subsystem: Giga-byte Technology Device 1c3a
Flags: bus master, fast devsel, latency 0, IRQ 10
Memory at fbfff000 (64-bit, non-prefetchable) [size=16]
Capabilities: [50] Power Management version 3
Capabilities: [8c] MSI: Enable- Count=1/1 Maskable- 64bit+
00:1a.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #2 (rev 05) (prog-if 20 [EHCI])
Subsystem: Giga-byte Technology Device 5006
Flags: bus master, medium devsel, latency 0, IRQ 18
Memory at fbffe000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port: BAR=1 offset=00a0
Capabilities: [98] PCI Advanced Features
Kernel driver in use: ehci_hcd
00:1c.0 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 1 (rev b5) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=01, subordinate=01, sec-latency=0
Memory behind bridge: fb800000-fbbfffff
Prefetchable memory behind bridge: 00000000dc000000-00000000dc0fffff
Capabilities: [40] Express Root Port (Slot+), MSI 00
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
Capabilities: [a0] Power Management version 2
Kernel driver in use: pcieport
Kernel modules: shpchp
00:1c.5 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 6 (rev b5) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=02, subordinate=02, sec-latency=0
I/O behind bridge: 0000d000-0000dfff
Prefetchable memory behind bridge: 00000000fbd00000-00000000fbdfffff
Capabilities: [40] Express Root Port (Slot+), MSI 00
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
Capabilities: [a0] Power Management version 2
Kernel driver in use: pcieport
Kernel modules: shpchp
00:1c.6 PCI bridge: Intel Corporation 82801 PCI Bridge (rev b5) (prog-if 01 [Subtractive decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=03, subordinate=04, sec-latency=0
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: fbc00000-fbcfffff
Prefetchable memory behind bridge: 00000000dc100000-00000000dc1fffff
Capabilities: [40] Express Root Port (Slot+), MSI 00
Capabilities: [80] MSI: Enable- Count=1/1 Maskable- 64bit-
Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
Capabilities: [a0] Power Management version 2
00:1c.7 PCI bridge: Intel Corporation Cougar Point PCI Express Root Port 8 (rev b5) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=00, secondary=05, subordinate=05, sec-latency=0
Memory behind bridge: fbe00000-fbefffff
Capabilities: [40] Express Root Port (Slot+), MSI 00
Capabilities: [80] MSI: Enable+ Count=1/1 Maskable- 64bit-
Capabilities: [90] Subsystem: Giga-byte Technology Device 5001
Capabilities: [a0] Power Management version 2
Kernel driver in use: pcieport
Kernel modules: shpchp
00:1d.0 USB Controller: Intel Corporation Cougar Point USB Enhanced Host Controller #1 (rev 05) (prog-if 20 [EHCI])
Subsystem: Giga-byte Technology Device 5006
Flags: bus master, medium devsel, latency 0, IRQ 23
Memory at fbffd000 (32-bit, non-prefetchable) [size=1K]
Capabilities: [50] Power Management version 2
Capabilities: [58] Debug port: BAR=1 offset=00a0
Capabilities: [98] PCI Advanced Features
Kernel driver in use: ehci_hcd
00:1f.0 ISA bridge: Intel Corporation Cougar Point LPC Controller (rev 05)
Subsystem: Giga-byte Technology Device 5001
Flags: bus master, medium devsel, latency 0
Capabilities: [e0] Vendor Specific Information <?>
Kernel modules: iTCO_wdt
00:1f.2 IDE interface: Intel Corporation Cougar Point 4 port SATA IDE Controller (rev 05) (prog-if 8f [Master SecP SecO PriP PriO])
Subsystem: Giga-byte Technology Device b002
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 19
I/O ports at fe00 [size=8]
I/O ports at fd00 [size=4]
I/O ports at fc00 [size=8]
I/O ports at fb00 [size=4]
I/O ports at fa00 [size=16]
I/O ports at f900 [size=16]
Capabilities: [70] Power Management version 3
Capabilities: [b0] PCI Advanced Features
Kernel driver in use: ata_piix
Kernel modules: ata_generic, pata_acpi, ata_piix
00:1f.3 SMBus: Intel Corporation Cougar Point SMBus Controller (rev 05)
Subsystem: Giga-byte Technology Device 5001
Flags: medium devsel, IRQ 18
Memory at fbffc000 (64-bit, non-prefetchable) [size=256]
I/O ports at 0500 [size=32]
Kernel driver in use: i801_smbus
Kernel modules: i2c-i801
00:1f.5 IDE interface: Intel Corporation Cougar Point 2 port SATA IDE Controller (rev 05) (prog-if 85 [Master SecO PriO])
Subsystem: Giga-byte Technology Device b002
Flags: bus master, 66MHz, medium devsel, latency 0, IRQ 19
I/O ports at f700 [size=8]
I/O ports at f600 [size=4]
I/O ports at f500 [size=8]
I/O ports at f400 [size=4]
I/O ports at f300 [size=16]
I/O ports at f200 [size=16]
Capabilities: [70] Power Management version 3
Capabilities: [b0] PCI Advanced Features
Kernel driver in use: ata_piix
Kernel modules: ata_generic, pata_acpi, ata_piix
01:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
Subsystem: Adaptec ASR5805
Flags: bus master, fast devsel, latency 0, IRQ 16
Memory at fb800000 (64-bit, non-prefetchable) [size=2M]
[virtual] Expansion ROM at dc000000 [disabled] [size=512K]
Capabilities: [98] Power Management version 2
Capabilities: [a0] MSI: Enable- Count=1/2 Maskable- 64bit+
Capabilities: [d0] Express Endpoint, MSI 00
Capabilities: [90] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Kernel driver in use: aacraid
Kernel modules: aacraid
02:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL8111/8168B PCI Express Gigabit Ethernet controller (rev 06)
Subsystem: Giga-byte Technology GA-EP45-DS5 Motherboard
Flags: bus master, fast devsel, latency 0, IRQ 32
I/O ports at de00 [size=256]
Memory at fbdff000 (64-bit, prefetchable) [size=4K]
Memory at fbdf8000 (64-bit, prefetchable) [size=16K]
Capabilities: [40] Power Management version 3
Capabilities: [50] MSI: Enable+ Count=1/1 Maskable- 64bit+
Capabilities: [70] Express Endpoint, MSI 01
Capabilities: [b0] MSI-X: Enable- Count=4 Masked-
Capabilities: [d0] Vital Product Data
Capabilities: [100] Advanced Error Reporting
Capabilities: [140] Virtual Channel <?>
Capabilities: [160] Device Serial Number 01-00-00-00-68-4c-e0-00
Kernel driver in use: r8169
Kernel modules: r8169
03:00.0 PCI bridge: Integrated Technology Express, Inc. Device 8892 (rev 30) (prog-if 01 [Subtractive decode])
Flags: bus master, fast devsel, latency 0
Bus: primary=03, secondary=04, subordinate=04, sec-latency=32
I/O behind bridge: 0000e000-0000efff
Memory behind bridge: fbc00000-fbcfffff
Prefetchable memory behind bridge: 00000000dc100000-00000000dc1fffff
Capabilities: [90] Power Management version 2
Capabilities: [a0] Subsystem: Giga-byte Technology Device 5000
04:00.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
Subsystem: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 18
I/O ports at ee00 [size=256]
Memory at fbcff000 (32-bit, non-prefetchable) [size=256]
[virtual] Expansion ROM at dc100000 [disabled] [size=64K]
Capabilities: [dc] Power Management version 2
Kernel driver in use: r8169
Kernel modules: r8169
05:00.0 USB Controller: Device 1b6f:7023 (rev 01) (prog-if 30)
Subsystem: Device 1b6f:7023
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at fbef8000 (64-bit, non-prefetchable) [size=32K]
Capabilities: [50] Power Management version 3
Capabilities: [70] MSI: Enable- Count=1/4 Maskable+ 64bit+
Capabilities: [a0] Express Endpoint, MSI 00
Capabilities: [100] Advanced Error Reporting
Capabilities: [190] Device Serial Number 01-01-01-01-01-01-01-01
lspci -vv
01:00.0 RAID bus controller: Adaptec AAC-RAID (rev 09)
Subsystem: Adaptec ASR5805
Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
Latency: 0, Cache Line Size: 4 bytes
Interrupt: pin A routed to IRQ 16
Region 0: Memory at fb800000 (64-bit, non-prefetchable) [size=2M]
[virtual] Expansion ROM at dc000000 [disabled] [size=512K]
Capabilities: [98] Power Management version 2
Flags: PMEClk- DSI- D1+ D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
Capabilities: [a0] MSI: Enable- Count=1/2 Maskable- 64bit+
Address: 0000000000000000 Data: 0000
Capabilities: [d0] Express (v1) Endpoint, MSI 00
DevCap: MaxPayload 512 bytes, PhantFunc 0, Latency L0s unlimited, L1 <1us
ExtTag- AttnBtn- AttnInd- PwrInd- RBE+ FLReset-
DevCtl: Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
MaxPayload 128 bytes, MaxReadReq 512 bytes
DevSta: CorrErr- UncorrErr- FatalErr- UnsuppReq- AuxPwr- TransPend-
LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s, Latency L0 <128ns, L1 unlimited
ClockPM- Surprise- LLActRep- BwNot-
LnkCtl: ASPM Disabled; RCB 64 bytes Disabled- Retrain- CommClk+
ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
Capabilities: [90] Vital Product Data
Unknown small resource type 00, will not decode more.
Capabilities: [100] Advanced Error Reporting
UESta: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UEMsk: DLP- SDES- TLP- FCP- CmpltTO- CmpltAbrt- UnxCmplt- RxOF- MalfTLP- ECRC- UnsupReq- ACSViol-
UESvrt: DLP+ SDES- TLP- FCP+ CmpltTO- CmpltAbrt- UnxCmplt- RxOF+ MalfTLP+ ECRC- UnsupReq- ACSViol-
CESta: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr-
CEMsk: RxErr- BadTLP- BadDLLP- Rollover- Timeout- NonFatalErr+
AERCap: First Error Pointer: 00, GenCap+ CGenEn- ChkCap+ ChkEn-
Kernel driver in use: aacraid
Kernel modules: aacraid
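To make the relevant bits of that `lspci -vv` dump easier to spot, here is a minimal Python sketch that pulls out the link capability, the negotiated link, and the MSI state. The function name `summarize_pcie` and the `SAMPLE` excerpt are just illustrative names I chose, and the bandwidth figure is a rough theoretical number that assumes 250 MB/s per lane at 2.5 GT/s (8b/10b encoding), not a measurement.

```python
import re

# Excerpt of the `lspci -vv` output for the RAID card shown above.
SAMPLE = """\
Capabilities: [a0] MSI: Enable- Count=1/2 Maskable- 64bit+
Capabilities: [d0] Express (v1) Endpoint, MSI 00
LnkCap: Port #0, Speed 2.5GT/s, Width x8, ASPM L0s, Latency L0 <128ns, L1 unlimited
LnkSta: Speed 2.5GT/s, Width x4, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
"""


def summarize_pcie(text):
    """Return link capability, link status and MSI state from lspci -vv text."""
    cap = re.search(r"LnkCap:.*?Speed ([\d.]+)GT/s, Width x(\d+)", text)
    sta = re.search(r"LnkSta:\s*Speed ([\d.]+)GT/s, Width x(\d+)", text)
    msi = re.search(r"MSI: Enable([+-])", text)
    summary = {
        "cap_speed": cap.group(1) if cap else None,
        "cap_width": int(cap.group(2)) if cap else None,
        "sta_speed": sta.group(1) if sta else None,
        "sta_width": int(sta.group(2)) if sta else None,
        "msi_enabled": (msi.group(1) == "+") if msi else None,
    }
    # Assumption: 2.5 GT/s with 8b/10b encoding gives about 250 MB/s per lane.
    if summary["sta_speed"] == "2.5" and summary["sta_width"]:
        summary["approx_mb_per_s"] = summary["sta_width"] * 250
    else:
        summary["approx_mb_per_s"] = None
    return summary


if __name__ == "__main__":
    for key, value in summarize_pcie(SAMPLE).items():
        print(f"{key}: {value}")
```

For the output above this reports a card capable of x8 that has negotiated x4, with MSI disabled, so the card is using the legacy INTx line (IRQ 16) rather than a message-signalled interrupt.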
cat /proc/interrupts
CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
0: 128 0 0 0 0 0 0 0 IO-APIC-edge timer
1: 105 0 606 4366 0 0 0 0 IO-APIC-edge i8042
8: 1 0 0 0 0 0 0 0 IO-APIC-edge rtc0
9: 0 0 0 0 0 0 0 0 IO-APIC-fasteoi acpi
16: 1381 0 197881 730 0 0 0 9 IO-APIC-fasteoi aacraid
18: 1695 0 0 0 13372 60347990 0 0 IO-APIC-fasteoi ehci_hcd:usb1, eth1
19: 4637 0 14949 6352494 0 0 0 106473 IO-APIC-fasteoi ata_piix, ata_piix
23: 33 0 27 12 0 0 0 0 IO-APIC-fasteoi ehci_hcd:usb2
24: 291 0 0 0 0 0 0 0 HPET_MSI-edge hpet2
25: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet3
26: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet4
27: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet5
28: 0 0 0 0 0 0 0 0 HPET_MSI-edge hpet6
32: 1275 0 0 0 0 1905 21317086 0 PCI-MSI-edge eth0
NMI: 1873 10150 1974 1672 702 3046 1825 780 Non-maskable interrupts
LOC: 17501877 13611350 13868117 3612581 1520650 1850972 8633075 1486682 Local timer interrupts
SPU: 0 0 0 0 0 0 0 0 Spurious interrupts
PMI: 0 0 0 0 0 0 0 0 Performance monitoring interrupts
PND: 0 0 0 0 0 0 0 0 Performance pending work
RES: 5238 34250 12858 4299 1555 4833 5663 2485 Rescheduling interrupts
CAL: 334 302 429 414 421 464 465 468 Function call interrupts
TLB: 7863 154723 12147 11152 14099 33766 42580 11065 TLB shootdowns
TRM: 0 0 0 0 0 0 0 0 Thermal event interrupts
THR: 0 0 0 0 0 0 0 0 Threshold APIC interrupts
MCE: 0 0 0 0 0 0 0 0 Machine check exceptions
MCP: 293 293 293 293 293 293 293 293 Machine check polls
ERR: 7
MIS: 0
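To check whether IRQ 16 is still being serviced after the error, the per-CPU counters can be summed and compared between two readings. The following is only a sketch: `parse_interrupts` and `irq_delta` are hypothetical helper names, and `SAMPLE` is a shortened copy of the output above. On the live system the text would come from reading `/proc/interrupts` twice, a few seconds apart.

```python
# Shortened copy of the /proc/interrupts output shown above.
SAMPLE = """\
 CPU0 CPU1 CPU2 CPU3 CPU4 CPU5 CPU6 CPU7
16: 1381 0 197881 730 0 0 0 9 IO-APIC-fasteoi aacraid
18: 1695 0 0 0 13372 60347990 0 0 IO-APIC-fasteoi ehci_hcd:usb1, eth1
ERR: 7
"""


def parse_interrupts(text):
    """Map each IRQ label to (total count across CPUs, description)."""
    lines = text.strip().splitlines()
    ncpu = len(lines[0].split())  # header row lists one column per CPU
    totals = {}
    for line in lines[1:]:
        label, sep, rest = line.partition(":")
        if not sep:
            continue
        fields = rest.split()
        counts = []
        # Leading numeric fields (at most one per CPU) are the counters.
        for field in fields[:ncpu]:
            if not field.isdigit():
                break
            counts.append(int(field))
        description = " ".join(fields[len(counts):])
        totals[label.strip()] = (sum(counts), description)
    return totals


def irq_delta(before, after, irq):
    """Number of interrupts counted for `irq` between two readings."""
    return parse_interrupts(after)[irq][0] - parse_interrupts(before)[irq][0]


if __name__ == "__main__":
    for irq, (total, description) in parse_interrupts(SAMPLE).items():
        print(f"{irq:>4} {total:>12} {description}")
```

If `irq_delta` for `"16"` stays at zero while the array is under load, the line is no longer delivering interrupts, which would match the "Disabling IRQ #16" message.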
The driver in use is the kmod-aacraid kernel module from ELRepo for CentOS 6:
Installed Packages
Name : kmod-aacraid
Arch : x86_64
Version : 1.1.7
Release : 1.el6.elrepo
Size : 340 k
Repo : installed
From repo : elrepo
Summary : aacraid kernel module(s)
URL : http://www.adaptec.com/
License : GPLv2
Description: This package provides the aacraid kernel module(s) built
: for the Linux kernel using the x86_64 family of processors.
And this is the error from the log:
Dec 15 14:02:33 kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 15 14:02:33 kernel: Pid: 0, comm: swapper Not tainted 2.6.32-71.29.1.el6.x86_64 #1
Dec 15 14:02:33 kernel: Call Trace:
Dec 15 14:02:33 kernel: <IRQ> [<ffffffff810da96b>] __report_bad_irq+0x2b/0xa0
Dec 15 14:02:33 kernel: [<ffffffff810dab6c>] note_interrupt+0x18c/0x1d0
Dec 15 14:02:33 kernel: [<ffffffff810db255>] handle_fasteoi_irq+0xc5/0xf0
Dec 15 14:02:33 kernel: [<ffffffff81015fb9>] handle_irq+0x49/0xa0
Dec 15 14:02:33 kernel: [<ffffffff814d093c>] do_IRQ+0x6c/0xf0
Dec 15 14:02:33 kernel: [<ffffffff81013ad3>] ret_from_intr+0x0/0x11
Dec 15 14:02:33 kernel: <EOI> [<ffffffff812da962>] ? acpi_idle_enter_c1+0xa3/0xc1
Dec 15 14:02:33 kernel: [<ffffffff812da941>] ? acpi_idle_enter_c1+0x82/0xc1
Dec 15 14:02:33 kernel: [<ffffffff813df687>] cpuidle_idle_call+0xa7/0x140
Dec 15 14:02:33 kernel: [<ffffffff81011e96>] cpu_idle+0xb6/0x110
Dec 15 14:02:33 kernel: [<ffffffff814c27d8>] start_secondary+0x1fc/0x23f
Dec 15 14:02:33 kernel: handlers:
Dec 15 14:02:33 kernel: [<ffffffffa002a590>] (aac_rx_intr_message+0x0/0xc0 [aacraid])
Dec 15 14:02:33 kernel: Disabling IRQ #16
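Since the error can take up to an hour to appear, it may help to scan the log for it automatically. Below is a small sketch that looks for the "nobody cared" report, the handlers registered on that IRQ, and the "Disabling IRQ" line. The function name `find_disabled_irqs` is made up for illustration, and the regular expressions are only matched against the message format shown above; other kernels may word these messages differently.

```python
import re

# Shortened copy of the kernel log shown above.
SAMPLE_LOG = """\
Dec 15 14:02:33 kernel: irq 16: nobody cared (try booting with the "irqpoll" option)
Dec 15 14:02:33 kernel: [<ffffffff810da96b>] __report_bad_irq+0x2b/0xa0
Dec 15 14:02:33 kernel: handlers:
Dec 15 14:02:33 kernel: [<ffffffffa002a590>] (aac_rx_intr_message+0x0/0xc0 [aacraid])
Dec 15 14:02:33 kernel: Disabling IRQ #16
"""

NOBODY_CARED = re.compile(r"irq (\d+): nobody cared")
HANDLER = re.compile(r"\((\w+)\+0x[0-9a-f]+/0x[0-9a-f]+ \[(\w+)\]\)")
DISABLING = re.compile(r"Disabling IRQ #(\d+)")


def find_disabled_irqs(log):
    """Return {irq: {"handlers": [(function, module)], "disabled": bool}}."""
    result = {}
    current = None  # IRQ whose report is currently being read
    for line in log.splitlines():
        match = NOBODY_CARED.search(line)
        if match:
            current = int(match.group(1))
            result[current] = {"handlers": [], "disabled": False}
            continue
        match = HANDLER.search(line)
        if match and current is not None:
            result[current]["handlers"].append((match.group(1), match.group(2)))
            continue
        match = DISABLING.search(line)
        if match:
            irq = int(match.group(1))
            entry = result.setdefault(irq, {"handlers": [], "disabled": False})
            entry["disabled"] = True
            current = None
    return result


if __name__ == "__main__":
    for irq, info in find_disabled_irqs(SAMPLE_LOG).items():
        print(irq, info)
```

For the log above, the only handler registered on IRQ 16 is `aac_rx_intr_message` from `aacraid`, which agrees with `/proc/interrupts` showing no other device sharing that line.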
I do not see any IRQ 16 conflict, and the suggested irqpoll option doesn't change a thing. I do not need USB, so I can disable it, but this is a production system, so I want to know where the problem is before I start messing with the BIOS or anything else (and I also need to keep downtime to a minimum).
Can anyone help me with diagnosing the problem here?