Tonight I received a message generated by mdadm on my server:
This is an automatically generated mail message from mdadm
A DegradedArray event had been detected on md device /dev/md3.
Faithfully yours, etc.
P.S. The /proc/mdstat file currently contains the following:
Personalities : [raid1]
md4 : active raid1 sdb4[0] sda4[1]
474335104 blocks [2/2] [UU]
md3 : active raid1 sdb3[2](F) sda3[1]
10000384 blocks [2/1] [_U]
md2 : active (auto-read-only) raid1 sdb2[0] sda2[1]
4000064 blocks [2/2] [UU]
md1 : active raid1 sdb1[0] sda1[1]
48064 blocks [2/2] [UU]
I removed /dev/sdb3 from /dev/md3 and re-added it. It rebuilt for a while and then became a spare device, so now I have the following status:
cat /proc/mdstat
Personalities : [raid1]
md4 : active raid1 sdb4[0] sda4[1]
474335104 blocks [2/2] [UU]
md3 : active raid1 sdb3[2](S) sda3[1]
10000384 blocks [2/1] [_U]
md2 : active (auto-read-only) raid1 sdb2[0] sda2[1]
4000064 blocks [2/2] [UU]
md1 : active raid1 sdb1[0] sda1[1]
48064 blocks [2/2] [UU]
and the array details:
[CODE]
mdadm -D /dev/md3
/dev/md3:
Version : 0.90
Creation Time : Sat Jun 28 14:47:58 2008
Raid Level : raid1
Array Size : 10000384 (9.54 GiB 10.24 GB)
Used Dev Size : 10000384 (9.54 GiB 10.24 GB)
Raid Devices : 2
Total Devices : 2
Preferred Minor : 3
Persistence : Superblock is persistent
Update Time : Sun Sep 4 16:30:46 2011
State : clean, degraded
Active Devices : 1
Working Devices : 2
Failed Devices : 0
Spare Devices : 1
UUID : 1c32c34a:52d09232:fc218793:7801d094
Events : 0.7172118
Number Major Minor RaidDevice State
0 0 0 0 removed
1 8 3 1 active sync /dev/sda3
2 8 19 - spare /dev/sdb3
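For anyone comparing the two /proc/mdstat listings above, here is a minimal sketch of how the member flags ((F) for faulty, (S) for spare) and the [2/1] [_U] status can be read programmatically. It assumes the simple two-line-per-array layout shown in this post; the function name parse_mdstat and the SAMPLE string are only illustrative, and other kernel versions may print extra lines (bitmap, resync progress) that this sketch simply ignores.

```python
import re

# Member entries look like "sdb3[2](S)": device, slot number, optional flag letter.
MEMBER = re.compile(r"(\w+)\[(\d+)\](?:\(([A-Z])\))?")
# Status looks like "[2/1] [_U]": expected devices / active devices, then a slot map.
STATUS = re.compile(r"\[(\d+)/(\d+)\]\s+\[([U_]+)\]")

# Excerpt copied from the second /proc/mdstat output in this post.
SAMPLE = """Personalities : [raid1]
md4 : active raid1 sdb4[0] sda4[1]
474335104 blocks [2/2] [UU]
md3 : active raid1 sdb3[2](S) sda3[1]
10000384 blocks [2/1] [_U]
"""

def parse_mdstat(text):
    """Return {array: {"members": {dev: flag}, "degraded": bool, "slots": str}}."""
    arrays = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        head = re.match(r"(md\d+)\s*:", line)
        if head:
            name = head.group(1)
            members = {dev: flag for dev, _slot, flag in MEMBER.findall(line)}
            arrays[name] = {"members": members, "degraded": False, "slots": ""}
            continue
        status = STATUS.search(line)
        if status and name:
            expected, active = int(status.group(1)), int(status.group(2))
            arrays[name]["degraded"] = active < expected
            arrays[name]["slots"] = status.group(3)
    return arrays

if __name__ == "__main__":
    for array, info in parse_mdstat(SAMPLE).items():
        print(array, info)
```

On a live system the same function could be fed the contents of /proc/mdstat instead of SAMPLE.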
Here are the latest entries from /var/log/messages:
Sep 4 16:15:45 ogw2 kernel: [1314646.950806] md: unbind<sdb3>
Sep 4 16:15:45 ogw2 kernel: [1314646.950820] md: export_rdev(sdb3)
Sep 4 16:17:00 ogw2 kernel: [1314721.977950] md: bind<sdb3>
Sep 4 16:17:00 ogw2 kernel: [1314722.011058] RAID1 conf printout:
Sep 4 16:17:00 ogw2 kernel: [1314722.011064] --- wd:1 rd:2
Sep 4 16:17:00 ogw2 kernel: [1314722.011070] disk 0, wo:1, o:1, dev:sdb3
Sep 4 16:17:00 ogw2 kernel: [1314722.011073] disk 1, wo:0, o:1, dev:sda3
Sep 4 16:17:00 ogw2 kernel: [1314722.012667] md: recovery of RAID array md3
Sep 4 16:17:00 ogw2 kernel: [1314722.012673] md: minimum _guaranteed_ speed: 1000 KB/sec/disk.
Sep 4 16:17:00 ogw2 kernel: [1314722.012677] md: using maximum available idle IO bandwidth (but not more than 200000 KB/sec) for recovery.
Sep 4 16:17:00 ogw2 kernel: [1314722.012684] md: using 128k window, over a total of 10000384 blocks.
Sep 4 16:20:25 ogw2 kernel: [1314927.480582] md: md3: recovery done.
Sep 4 16:20:27 ogw2 kernel: [1314929.252395] ata2.00: configured for UDMA/133
Sep 4 16:20:27 ogw2 kernel: [1314929.260419] ata2.01: configured for UDMA/133
Sep 4 16:20:27 ogw2 kernel: [1314929.260437] ata2: EH complete
Sep 4 16:20:29 ogw2 kernel: [1314931.068402] ata2.00: configured for UDMA/133
Sep 4 16:20:29 ogw2 kernel: [1314931.076418] ata2.01: configured for UDMA/133
Sep 4 16:20:29 ogw2 kernel: [1314931.076436] ata2: EH complete
Sep 4 16:20:30 ogw2 kernel: [1314932.884390] ata2.00: configured for UDMA/133
Sep 4 16:20:30 ogw2 kernel: [1314932.892419] ata2.01: configured for UDMA/133
Sep 4 16:20:30 ogw2 kernel: [1314932.892436] ata2: EH complete
Sep 4 16:20:32 ogw2 kernel: [1314934.828390] ata2.00: configured for UDMA/133
Sep 4 16:20:32 ogw2 kernel: [1314934.836397] ata2.01: configured for UDMA/133
Sep 4 16:20:32 ogw2 kernel: [1314934.836413] ata2: EH complete
Sep 4 16:20:34 ogw2 kernel: [1314936.776392] ata2.00: configured for UDMA/133
Sep 4 16:20:34 ogw2 kernel: [1314936.784403] ata2.01: configured for UDMA/133
Sep 4 16:20:34 ogw2 kernel: [1314936.784419] ata2: EH complete
Sep 4 16:20:36 ogw2 kernel: [1314938.760392] ata2.00: configured for UDMA/133
Sep 4 16:20:36 ogw2 kernel: [1314938.768395] ata2.01: configured for UDMA/133
Sep 4 16:20:36 ogw2 kernel: [1314938.768422] sd 1:0:0:0: [sda] Unhandled sense code
Sep 4 16:20:36 ogw2 kernel: [1314938.768426] sd 1:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 4 16:20:36 ogw2 kernel: [1314938.768431] sd 1:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Sep 4 16:20:36 ogw2 kernel: [1314938.768438] Descriptor sense data with sense descriptors (in hex):
Sep 4 16:20:36 ogw2 kernel: [1314938.768441] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Sep 4 16:20:36 ogw2 kernel: [1314938.768454] 01 ac b6 4a
Sep 4 16:20:36 ogw2 kernel: [1314938.768459] sd 1:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Sep 4 16:20:36 ogw2 kernel: [1314938.768468] sd 1:0:0:0: [sda] CDB: Read(10): 28 00 01 ac b5 f8 00 03 80 00
Sep 4 16:20:36 ogw2 kernel: [1314938.768527] ata2: EH complete
Sep 4 16:20:38 ogw2 kernel: [1314940.788406] ata2.00: configured for UDMA/133
Sep 4 16:20:38 ogw2 kernel: [1314940.796394] ata2.01: configured for UDMA/133
Sep 4 16:20:38 ogw2 kernel: [1314940.796415] ata2: EH complete
Sep 4 16:20:40 ogw2 kernel: [1314942.728391] ata2.00: configured for UDMA/133
Sep 4 16:20:40 ogw2 kernel: [1314942.736395] ata2.01: configured for UDMA/133
Sep 4 16:20:40 ogw2 kernel: [1314942.736413] ata2: EH complete
Sep 4 16:20:42 ogw2 kernel: [1314944.548391] ata2.00: configured for UDMA/133
Sep 4 16:20:42 ogw2 kernel: [1314944.556393] ata2.01: configured for UDMA/133
Sep 4 16:20:42 ogw2 kernel: [1314944.556414] ata2: EH complete
Sep 4 16:20:44 ogw2 kernel: [1314946.372392] ata2.00: configured for UDMA/133
Sep 4 16:20:44 ogw2 kernel: [1314946.380392] ata2.01: configured for UDMA/133
Sep 4 16:20:44 ogw2 kernel: [1314946.380411] ata2: EH complete
Sep 4 16:20:46 ogw2 kernel: [1314948.196391] ata2.00: configured for UDMA/133
Sep 4 16:20:46 ogw2 kernel: [1314948.204391] ata2.01: configured for UDMA/133
Sep 4 16:20:46 ogw2 kernel: [1314948.204411] ata2: EH complete
Sep 4 16:20:48 ogw2 kernel: [1314950.144390] ata2.00: configured for UDMA/133
Sep 4 16:20:48 ogw2 kernel: [1314950.152392] ata2.01: configured for UDMA/133
Sep 4 16:20:48 ogw2 kernel: [1314950.152416] sd 1:0:0:0: [sda] Unhandled sense code
Sep 4 16:20:48 ogw2 kernel: [1314950.152419] sd 1:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
Sep 4 16:20:48 ogw2 kernel: [1314950.152424] sd 1:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
Sep 4 16:20:48 ogw2 kernel: [1314950.152431] Descriptor sense data with sense descriptors (in hex):
Sep 4 16:20:48 ogw2 kernel: [1314950.152434] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
Sep 4 16:20:48 ogw2 kernel: [1314950.152447] 01 ac b6 4a
Sep 4 16:20:48 ogw2 kernel: [1314950.152452] sd 1:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
Sep 4 16:20:48 ogw2 kernel: [1314950.152461] sd 1:0:0:0: [sda] CDB: Read(10): 28 00 01 ac b6 48 00 00 08 00
Sep 4 16:20:48 ogw2 kernel: [1314950.152523] ata2: EH complete
Sep 4 16:20:48 ogw2 kernel: [1314950.575325] RAID1 conf printout:
Sep 4 16:20:48 ogw2 kernel: [1314950.575332] --- wd:1 rd:2
Sep 4 16:20:48 ogw2 kernel: [1314950.575337] disk 0, wo:1, o:1, dev:sdb3
Sep 4 16:20:48 ogw2 kernel: [1314950.575341] disk 1, wo:0, o:1, dev:sda3
Sep 4 16:20:48 ogw2 kernel: [1314950.575344] RAID1 conf printout:
Sep 4 16:20:48 ogw2 kernel: [1314950.575347] --- wd:1 rd:2
Sep 4 16:20:48 ogw2 kernel: [1314950.575350] disk 1, wo:0, o:1, dev:sda3
So I can't understand why this device (sdb3) became a spare and the RAID isn't synced.
Can anybody point out what I should do?
UPDATE: I forgot to say that /dev/md3 is mounted as / (the root partition) and holds all system directories except /boot.
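In case it helps with reading the log above: the "Medium Error" entries are reported for [sda], and the hex dumps can be decoded to see which sector the drive could not read. Below is a small sketch that does this, assuming the standard SCSI descriptor sense layout (response code 0x72, information descriptor type 0x00) and the READ(10) CDB layout. The helper names are made up for illustration, and the sector numbers are whole-disk LBAs on sda, not offsets inside sda3.

```python
def parse_hex(text):
    """Turn a space-separated hex dump such as '28 00 01' into bytes."""
    return bytes(int(part, 16) for part in text.split())

def decode_read10(cdb):
    """Return (start LBA, block count) from a 10-byte READ(10) CDB."""
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return int.from_bytes(cdb[2:6], "big"), int.from_bytes(cdb[7:9], "big")

def decode_descriptor_sense(sense):
    """Return (sense key, ASC, ASCQ, information) from descriptor sense data."""
    if (sense[0] & 0x7F) not in (0x72, 0x73):
        raise ValueError("not descriptor-format sense data")
    key = sense[1] & 0x0F
    asc, ascq = sense[2], sense[3]
    info = None
    pos = 8
    end = min(8 + sense[7], len(sense))  # byte 7 is the additional sense length
    while pos + 2 <= end:
        desc_type, desc_len = sense[pos], sense[pos + 1]
        if desc_type == 0x00 and desc_len == 0x0A:
            # Information descriptor: 8-byte value, here the failing LBA.
            info = int.from_bytes(sense[pos + 4:pos + 12], "big")
        pos += 2 + desc_len
    return key, asc, ascq, info

# Values copied from the kernel log in this post.
SENSE = parse_hex("72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00 01 ac b6 4a")
CDBS = [
    parse_hex("28 00 01 ac b5 f8 00 03 80 00"),
    parse_hex("28 00 01 ac b6 48 00 00 08 00"),
]

if __name__ == "__main__":
    key, asc, ascq, bad_lba = decode_descriptor_sense(SENSE)
    print("sense key", key, "asc/ascq", hex(asc), hex(ascq), "failing LBA", bad_lba)
    for cdb in CDBS:
        start, count = decode_read10(cdb)
        print("read", start, "+", count, "covers failing LBA:",
              start <= bad_lba < start + count)
```

With the values from my log, the failing LBA decodes to 28096074 on sda, and it falls inside both of the failed read requests.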