I'm in the process of recovering a dying hard drive with ddrescue. The utility works very well on the parts of the drive that have no problems, but on the damaged parts it runs very slowly and seems to be causing a deadlock in some kernel module.

First, my system:

$ uname -a
Linux 3.16.2-1-ARCH #1 SMP PREEMPT Sat Sep 6 13:12:51 CEST 2014 x86_64 GNU/Linux

Here is what is happening. I'm currently in the first stage of recovery, running:

$ ddrescue -dn /dev/sdd ddrescue.img ddrescue.log

The following messages keep recurring in my kernel log:

[ 1160.113936] end_request: critical target error, dev sdd, sector 520968448
[ 1191.145082] usb 3-2: reset SuperSpeed USB device number 3 using xhci_hcd
[ 1191.159792] xhci_hcd 0000:01:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88044919cf00
[ 1191.159797] xhci_hcd 0000:01:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88044919cf48
[ 1222.107631] usb 3-2: reset SuperSpeed USB device number 3 using xhci_hcd
[ 1222.122490] xhci_hcd 0000:01:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88044919cf00
[ 1222.122495] xhci_hcd 0000:01:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88044919cf48
[ 1346.337324] sd 17:0:0:0: [sdd] Unhandled error code
[ 1346.337329] sd 17:0:0:0: [sdd]  
[ 1346.337332] Result: hostbyte=0x05 driverbyte=0x00
[ 1346.337334] sd 17:0:0:0: [sdd] CDB: 
[ 1346.337336] cdb[0]=0x28: 28 00 1f 0d 59 80 00 00 01 00
[ 1346.337345] end_request: I/O error, dev sdd, sector 520968576
[ 1377.408091] usb 3-2: reset SuperSpeed USB device number 3 using xhci_hcd
[ 1377.422946] xhci_hcd 0000:01:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88044919cf00
[ 1377.422951] xhci_hcd 0000:01:00.0: xHCI xhci_drop_endpoint called with disabled ep ffff88044919cf48
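
As a sanity check that the failed command in the log above is the read that hit the bad sector, the CDB bytes can be decoded by hand. This is a minimal sketch, assuming the standard READ(10) layout (opcode 0x28, a 4-byte big-endian LBA in bytes 2-5, a 2-byte transfer length in bytes 7-8). `decode_read10` is only an illustrative helper name, not part of any tool.

```python
def decode_read10(cdb_hex: str) -> dict:
    """Decode a 10-byte SCSI READ(10) CDB given as space-separated hex bytes."""
    cdb = bytes.fromhex(cdb_hex)
    if len(cdb) != 10 or cdb[0] != 0x28:
        raise ValueError("not a READ(10) CDB")
    return {
        # Bytes 2..5: logical block address, big-endian
        "lba": int.from_bytes(cdb[2:6], "big"),
        # Bytes 7..8: number of blocks requested, big-endian
        "blocks": int.from_bytes(cdb[7:9], "big"),
    }


# CDB copied verbatim from the kernel log above
info = decode_read10("28 00 1f 0d 59 80 00 00 01 00")
print(info)  # {'lba': 520968576, 'blocks': 1}
```

The decoded LBA, 520968576, is the same sector reported in the "end_request: I/O error" line, and the request was for a single block.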

My guess is that the I/O errors surface at the kernel level and the module responds by resetting the connection to the device. (Please correct me if I'm wrong.)
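
One rough way to test that guess is to look at how long the kernel waits before each reset. The sketch below uses a few lines copied from the log above and measures the gap between every USB reset and the message just before it. `reset_gaps` is a made-up helper name. The comparison to a 30-second SCSI command timeout is an assumption on my part: I believe that is the usual default, but I have not checked the value on this machine.

```python
import re

# Lines copied from the kernel log above (xhci_drop_endpoint lines omitted)
LOG = """\
[ 1160.113936] end_request: critical target error, dev sdd, sector 520968448
[ 1191.145082] usb 3-2: reset SuperSpeed USB device number 3 using xhci_hcd
[ 1222.107631] usb 3-2: reset SuperSpeed USB device number 3 using xhci_hcd
[ 1346.337324] sd 17:0:0:0: [sdd] Unhandled error code
[ 1377.408091] usb 3-2: reset SuperSpeed USB device number 3 using xhci_hcd
"""


def reset_gaps(log: str) -> list:
    """Seconds between each USB reset line and the log line preceding it."""
    entries = [
        (float(m.group(1)), m.group(2))
        for m in re.finditer(r"^\[\s*([\d.]+)\] (.*)$", log, re.M)
    ]
    return [
        round(stamp - entries[i - 1][0], 2)
        for i, (stamp, msg) in enumerate(entries)
        if i > 0 and "reset SuperSpeed" in msg
    ]


gaps = reset_gaps(LOG)
print(gaps)
```

Every gap comes out at roughly 31 seconds, which would fit a command timing out after about 30 seconds and the reset following as error recovery, though that does not by itself explain the later deadlock.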

This continues for a while, and the recovery keeps working, until eventually I get what looks like a deadlock:

[ 4132.846802] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4132.866845] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4132.886878] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4132.906841] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4132.926928] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4132.946948] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4132.966935] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4132.986990] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4133.007033] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300
[ 4133.027030] usb-storage: Error in queuecommand_lck: us->srb = ffff880446c78300

^ these messages never stop coming

When it deadlocks, all related I/O locks up and killing the process doesn't work. My only solution is restarting the system (sometimes forcefully), which seems to me like a way to corrupt the very data I'm trying to recover. I shouldn't have to restart my system multiple times just to recover this drive.

  1. I understand that this drive is failing but why does this module eventually deadlock?
  2. How should I go about reporting/patching this bug?
  3. Are there kernel modules I can reload to recover from this error without having to reboot? (My best attempt was force-removing uas, which stops ddrescue, but I'm then unable to start it again.)

Thank you in advance.

u8sand
  • The problem could also be in your USB3 hard disk dock, which may not handle drive error situations correctly. – Tero Kilkanen Sep 13 '14 at 20:46
  • I see; thank you for the information--I'm actually using my laptop now for the process with USB2 and it is handling it well; though of course for good sectors it is much slower than a USB3 transfer, at least I don't have to worry about deadlocks. – u8sand Sep 13 '14 at 23:57
