
My current configuration is an HP ProLiant BL460c G6 with a Smart Array P711m controller, running Ubuntu. There are 35 HDDs configured in RAID 1+0 with 1 spare disk. I usually monitor the RAID status with the default command:

hpacucli ctrl all show config

and if it reports a failed disk, I replace it. By chance I noticed that the LEDs on the storage enclosure were signaling two failed HDDs. At the same time, hpacucli reported that all HDDs were normal. After googling the problem I found another form of the hpacucli syntax:

hpacucli ctrl slot=2 ld 1 show

Running it confirmed that there was a problem HDD. After replacing one HDD I kept monitoring the situation. The RAID rebuild proceeded normally; however, the HDDs in the list are numbered incorrectly, with drive numbers duplicated.

The replaced HDD is in slot 2.

hpacucli ctrl all show config

  Smart Array P711m in Slot 2
  array A (SATA, Unused Space: 0  MB)
  logicaldrive 1 (61.9 TB, RAID 1+0, Recovering, 75% complete)
  physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Rebuilding)
  physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Rebuilding)
  physicaldrive 1E:1:3 (port 1E:box 1:bay 3, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:3 (port 1E:box 1:bay 3, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:4 (port 1E:box 1:bay 4, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:4 (port 1E:box 1:bay 4, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:5 (port 1E:box 1:bay 5, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:5 (port 1E:box 1:bay 5, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:6 (port 1E:box 1:bay 6, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:6 (port 1E:box 1:bay 6, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:7 (port 1E:box 1:bay 7, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:7 (port 1E:box 1:bay 7, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:8 (port 1E:box 1:bay 8, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:8 (port 1E:box 1:bay 8, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:9 (port 1E:box 1:bay 9, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:9 (port 1E:box 1:bay 9, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:10 (port 1E:box 1:bay 10, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:10 (port 1E:box 1:bay 10, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:11 (port 1E:box 1:bay 11, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:11 (port 1E:box 1:bay 11, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:12 (port 1E:box 1:bay 12, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:12 (port 1E:box 1:bay 12, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:13 (port 1E:box 1:bay 13, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:13 (port 1E:box 1:bay 13, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:14 (port 1E:box 1:bay 14, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:14 (port 1E:box 1:bay 14, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:15 (port 1E:box 1:bay 15, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:16 (port 1E:box 1:bay 16, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:17 (port 1E:box 1:bay 17, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:18 (port 1E:box 1:bay 18, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:19 (port 1E:box 1:bay 19, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:20 (port 1E:box 1:bay 20, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:21 (port 1E:box 1:bay 21, SATA, 4000.7 GB, OK, active spare)

hpacucli ctrl slot=2 ld 1 show

     Smart Array P711m in Slot 2
     array A
     Logical Drive: 1
     Size: 61.9 TB
     Fault Tolerance: 1+0
     Heads: 255
     Sectors Per Track: 32
     Cylinders: 65535
     Strip Size: 256 KB
     Full Stripe Size: 4352 KB
     Status: Recovering, 78% complete
     MultiDomain Status: OK
     Caching:  Enabled
     Unique Identifier: 
     Disk Name: /dev/sda
     Mount Points: None
     Logical Drive Label: 
     Mirror Group 0:
        physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Rebuilding)
        physicaldrive 1E:1:3 (port 1E:box 1:bay 3, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:4 (port 1E:box 1:bay 4, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:5 (port 1E:box 1:bay 5, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:6 (port 1E:box 1:bay 6, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:7 (port 1E:box 1:bay 7, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:8 (port 1E:box 1:bay 8, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:9 (port 1E:box 1:bay 9, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:10 (port 1E:box 1:bay 10, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:11 (port 1E:box 1:bay 11, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:12 (port 1E:box 1:bay 12, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:13 (port 1E:box 1:bay 13, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:14 (port 1E:box 1:bay 14, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 4000.7 GB, OK)
        physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Failed)
        physicaldrive 1E:1:3 (port 1E:box 1:bay 3, SATA, 4000.7 GB, OK)

What went wrong, how can I fix it, and why do different hpacucli commands return different HDD statuses?

Nikita Dow

1 Answer


You may have dual-domain cabling in place (multipath SAS).

The P711m is a SAS RAID controller for blade servers. It is meant to connect to the blade chassis expansion ports (SAS switch) and link to a larger enclosure (like a D6000 35- or 70-bay SAS JBOD).


That's likely what you have. Wait for the disk to rebuild.
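
To make the symptom easier to spot while the rebuild runs, here is a minimal sketch that scans saved hpacucli output for drive IDs listed more than once with different statuses. It assumes the line layout shown in the question (status as the fourth comma-separated field); `parse_drives`, `conflicting_duplicates` and `sample` are hypothetical names for illustration, not part of any HP tool, and the sketch only reports the duplication without explaining its cause.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Failed)
DRIVE_RE = re.compile(r"physicaldrive\s+(?P<id>\S+)\s+\((?P<fields>[^)]*)\)")


def parse_drives(text):
    """Return a list of (drive_id, status) tuples in listing order."""
    drives = []
    for match in DRIVE_RE.finditer(text):
        fields = [f.strip() for f in match.group("fields").split(",")]
        # Assumed layout: port/box/bay, interface, size, status[, role]
        status = fields[3] if len(fields) > 3 else "unknown"
        drives.append((match.group("id"), status))
    return drives


def conflicting_duplicates(drives):
    """Map each drive ID listed more than once to its statuses, if they differ."""
    seen = defaultdict(list)
    for drive_id, status in drives:
        seen[drive_id].append(status)
    return {d: s for d, s in seen.items() if len(set(s)) > 1}


# Excerpt of the "Mirror Group 0" listing from the question.
sample = """
   physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 4000.7 GB, OK)
   physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Rebuilding)
   physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 4000.7 GB, OK)
   physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Failed)
"""

print(conflicting_duplicates(parse_drives(sample)))
```

On the excerpt above, only bay 2 is reported, because it appears once as Rebuilding and once as Failed; bay 1 is duplicated but consistent, so it is left out.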

Also, you shouldn't monitor the RAID status the way you're doing it. You can just install the HP management agents and the system will email or send an SNMP trap with health status changes.

See: Monitoring an HP ProLiant DL380 G7 without the bloat

Using hplog -v will show all system alerts too.
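
If the management agents are not an option, a stopgap is to parse the CLI output yourself. The sketch below is only an illustration under assumptions: it relies on the `hpacucli ctrl all show config` line layout shown in the question, treats anything other than `OK` as a problem, and uses hypothetical helper names (`find_problems`, `read_config`). As the question shows, this listing can disagree with the per-logical-drive view, so it does not replace proper health monitoring.

```python
import re
import subprocess

# Statuses treated as healthy; anything else is reported. This set is an
# assumption based on the output shown in the question.
HEALTHY = {"OK"}

LINE_RE = re.compile(
    r"^\s*(?P<kind>physicaldrive|logicaldrive)\s+(?P<id>\S+)\s+\((?P<fields>[^)]*)\)",
    re.MULTILINE,
)


def find_problems(config_text):
    """Return (kind, id, status) for every drive whose status is not healthy."""
    problems = []
    for m in LINE_RE.finditer(config_text):
        fields = [f.strip() for f in m.group("fields").split(",")]
        # Assumed layouts:
        #   physicaldrive: port/box/bay, interface, size, status[, role]
        #   logicaldrive:  size, RAID level, status[, progress]
        idx = 3 if m.group("kind") == "physicaldrive" else 2
        status = fields[idx] if len(fields) > idx else "unknown"
        if status not in HEALTHY:
            problems.append((m.group("kind"), m.group("id"), status))
    return problems


def read_config():
    """Run the command from the question and return its text output."""
    result = subprocess.run(
        ["hpacucli", "ctrl", "all", "show", "config"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Demonstration on an excerpt of the question's output (no hpacucli needed).
sample = """
  logicaldrive 1 (61.9 TB, RAID 1+0, Recovering, 75% complete)
  physicaldrive 1E:1:1 (port 1E:box 1:bay 1, SATA, 4000.7 GB, OK)
  physicaldrive 1E:1:2 (port 1E:box 1:bay 2, SATA, 4000.7 GB, Rebuilding)
  physicaldrive 1E:1:21 (port 1E:box 1:bay 21, SATA, 4000.7 GB, OK, active spare)
"""

for kind, ident, status in find_problems(sample):
    print(f"{kind} {ident}: {status}")
```

On a real host you would pass `read_config()` to `find_problems` from a cron job and alert when the returned list is not empty.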

ewwhite

No, I don't use dual domain. It is a c7000 full of BL460c G6 blades with Smart Array P700m/P711m controllers, connected through a PTM module to MDS JBODs. The chain is: one server, one controller, PTM, one mini-SAS (SFF-8088) cable, half of a JBOD (35 of 70 HDDs). All the other servers with the same connection method work without errors or buggy reports from hpacucli. – Nikita Dow May 18 '17 at 12:48