I am trying to understand how btrfs raid1 mode behaves when one disk is taken out and then put back (I am testing on an Odroid HC4, as an example device).

Here is what I have before the test:

Label: none  uuid: f85fb0ab-e643-4266-9f61-0f6e4980b871
    Total devices 2 FS bytes used 2.35GiB
    devid    1 size 111.79GiB used 9.03GiB path /dev/sdb
    devid    2 size 1.82TiB used 9.03GiB path /dev/sda
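
Because the kernel can assign a different name to a disk after a reconnect, it may help to track devices by devid rather than by path. The following is only a sketch: `btrfs_devid_paths` is a hypothetical helper name, and it assumes the `btrfs filesystem show` output format shown above.

```shell
# Hypothetical helper: read `btrfs filesystem show` output on stdin and
# print one "devid path" pair per line.
btrfs_devid_paths() {
    awk '
        $1 == "devid" {
            # The device path is the field that follows the word "path"
            for (i = 1; i <= NF; i++)
                if ($i == "path") print $2, $(i + 1)
        }
    '
}
```

Possible usage, assuming the filesystem is mounted at /mnt: `btrfs filesystem show /mnt | btrfs_devid_paths`.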

Then I detach the /dev/sda disk, and dmesg shows this:

[158763.932162] ata1: SATA link down (SStatus 0 SControl 300)
[158769.474056] ata1: SATA link down (SStatus 0 SControl 300)
[158774.849753] ata1: SATA link down (SStatus 0 SControl 300)
[158774.849775] ata1.00: disabled
[158774.849814] ata1.00: detaching (SCSI 0:0:0:0)
[158774.851067] sd 0:0:0:0: [sda] Synchronizing SCSI cache
[158774.851161] sd 0:0:0:0: [sda] Synchronize Cache(10) failed: Result: hostbyte=0x04 driverbyte=0x00
[158774.851168] sd 0:0:0:0: [sda] Stopping disk
[158774.851190] sd 0:0:0:0: [sda] Start/Stop Unit failed: Result: hostbyte=0x04 driverbyte=0x00
[158784.777808] BTRFS error (device sdb): bdev /dev/sda errs: wr 9, rd 0, flush 1, corrupt 0, gen 0
[158784.778204] BTRFS error (device sdb): bdev /dev/sda errs: wr 10, rd 0, flush 1, corrupt 0, gen 0
[158784.778716] BTRFS error (device sdb): bdev /dev/sda errs: wr 11, rd 0, flush 1, corrupt 0, gen 0
[158784.779232] BTRFS error (device sdb): bdev /dev/sda errs: wr 12, rd 0, flush 1, corrupt 0, gen 0
[158784.779505] BTRFS error (device sdb): bdev /dev/sda errs: wr 13, rd 0, flush 1, corrupt 0, gen 0
[158784.782423] BTRFS error (device sdb): bdev /dev/sda errs: wr 13, rd 0, flush 2, corrupt 0, gen 0
[158784.782651] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158784.782660] BTRFS error (device sdb): bdev /dev/sda errs: wr 14, rd 0, flush 2, corrupt 0, gen 0
[158784.782762] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158784.782767] BTRFS error (device sdb): bdev /dev/sda errs: wr 15, rd 0, flush 2, corrupt 0, gen 0
[158784.782864] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158784.782869] BTRFS error (device sdb): bdev /dev/sda errs: wr 16, rd 0, flush 2, corrupt 0, gen 0
[158784.784112] BTRFS error (device sdb): error writing primary super block to device 2
[158788.810744] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
[158788.814057] ata1.00: ATA-10: CT2000BX500SSD1, M6CR030, max UDMA/133
[158788.814064] ata1.00: 3907029168 sectors, multi 1: LBA48 NCQ (depth 32), AA
[158788.824662] ata1.00: configured for UDMA/133
[158788.824934] scsi 0:0:0:0: Direct-Access     ATA      CT2000BX500SSD1  030  PQ: 0 ANSI: 5
[158788.825550] sd 0:0:0:0: Attached scsi generic sg0 type 0
[158788.825726] sd 0:0:0:0: [sdc] 3907029168 512-byte logical blocks: (2.00 TB/1.82 TiB)
[158788.825760] sd 0:0:0:0: [sdc] Write Protect is off
[158788.825766] sd 0:0:0:0: [sdc] Mode Sense: 00 3a 00 00
[158788.825812] sd 0:0:0:0: [sdc] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
[158788.863902] sd 0:0:0:0: [sdc] Attached SCSI disk
[158788.884759] BTRFS warning: duplicate device /dev/sdc devid 2 generation 48028 scanned by systemd-udevd (5758)
[158789.894806] BTRFS error (device sdb): bdev /dev/sda errs: wr 17, rd 0, flush 2, corrupt 0, gen 0
[158820.615643] BTRFS error (device sdb): bdev /dev/sda errs: wr 18, rd 0, flush 2, corrupt 0, gen 0
[158820.616144] BTRFS error (device sdb): bdev /dev/sda errs: wr 19, rd 0, flush 2, corrupt 0, gen 0
[158820.616585] BTRFS error (device sdb): bdev /dev/sda errs: wr 20, rd 0, flush 2, corrupt 0, gen 0
[158820.617080] BTRFS error (device sdb): bdev /dev/sda errs: wr 21, rd 0, flush 2, corrupt 0, gen 0
[158820.617353] BTRFS error (device sdb): bdev /dev/sda errs: wr 22, rd 0, flush 2, corrupt 0, gen 0
[158820.617558] BTRFS error (device sdb): bdev /dev/sda errs: wr 23, rd 0, flush 2, corrupt 0, gen 0
[158820.620575] BTRFS error (device sdb): bdev /dev/sda errs: wr 23, rd 0, flush 3, corrupt 0, gen 0
[158820.620791] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158820.620799] BTRFS error (device sdb): bdev /dev/sda errs: wr 24, rd 0, flush 3, corrupt 0, gen 0
[158820.620896] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158820.620901] BTRFS error (device sdb): bdev /dev/sda errs: wr 25, rd 0, flush 3, corrupt 0, gen 0
[158820.621128] BTRFS warning (device sdb): lost page write due to IO error on /dev/sda (-5)
[158820.621237] BTRFS error (device sdb): bdev /dev/sda errs: wr 26, rd 0, flush 3, corrupt 0, gen 0
[158820.622271] BTRFS error (device sdb): error writing primary super block to device 2
[158830.852513] BTRFS error (device sdb): bdev /dev/sda errs: wr 27, rd 0, flush 3, corrupt 0, gen 0
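
The error counters in the log above keep growing, so it can be easier to look only at the last reported values per device. This is a minimal sketch, not an official tool: `btrfs_last_errs` is a hypothetical helper name, and it assumes kernel messages in the `bdev <path> errs: ...` format shown above.

```shell
# Hypothetical helper: read kernel log text on stdin and print the most
# recent btrfs error counters reported for each device.
btrfs_last_errs() {
    awk '
        /BTRFS error .*bdev .* errs:/ {
            # Locate the "bdev" field; the device path follows it, then
            # "errs:", then the counters themselves.
            for (i = 1; i <= NF; i++)
                if ($i == "bdev") { dev = $(i + 1); start = i + 3 }
            line = ""
            for (j = start; j <= NF; j++) {
                sep = (j > start) ? " " : ""
                line = line sep $j
            }
            last[dev] = line   # later lines overwrite earlier ones
        }
        END { for (d in last) print d ": " last[d] }
    '
}
```

Possible usage: `dmesg | btrfs_last_errs`. The same five counters (write, read, flush, corruption, generation) are also reported by `btrfs device stats /mnt`.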

Then I attach it back, and this time it appears as /dev/sdc.

If I try to replace it, I get various errors:

# btrfs replace start 2 /dev/sdc /mnt
/dev/sdc appears to contain an existing filesystem (btrfs).
ERROR: use the -f option to force overwrite of /dev/sdc
# btrfs replace start 2 /dev/sdc /mnt -f
ERROR: /dev/sdc is mounted

But if I reboot the device, everything goes back to normal.

  • You need to fail the drive first and then replace it as usual. – djdomi Jun 28 '22 at 17:13
  • How do I fail the device? – Boris Rybalkin Jun 29 '22 at 18:23
  • I found a similar question, [this one](https://unix.stackexchange.com/questions/334228/btrfs-raid1-how-to-replace-a-disk-drive-that-is-physically-no-more-there), on Unix Stack Exchange. – djdomi Jun 30 '22 at 06:14
  • That question is about the read-only state of a new device on an old Linux kernel (4.4); I am on 5.11. They also moved to mdadm in the end. One of the comments mentions that it is better to use the disk UUID to build the raid. I will try that, hoping it will reduce the dependency on device names changing at runtime in my case. – Boris Rybalkin Jul 01 '22 at 07:36
  • The way to fail the drive is the same. – djdomi Jul 02 '22 at 05:15
