Hello everyone,

Previously, I was running an array of six disks, all connected via USB. I am running raid1c3 for metadata and raid6 for data, with kernel 5.5.4-arch1-1 and btrfs --version v5.4, and I use bees for deduplication.

Four of the six drives are housed in a single four-bay enclosure. Due to my oversight, TLER was not enabled on any of the drives, so when one of them started failing, the enclosure reset and all four drives were disconnected. After rebooting, the file system was still mountable. I saw some transid errors in dmesg, but I didn't pay much attention to them because I was focused on getting rid of the now-failed drive.

I tried to "btrfs replace" the drive with a different one, but the replace stopped making progress because all reads to the dead drive in a certain location were failing (even with the "-r" flag). So I tried mounting degraded without the dead drive and doing "btrfs dev delete missing" instead. The deletion failed with the following kernel messages:

[ +2.697798] BTRFS warning (device sdb): csum failed root -9 ino 257 off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 1
[ +0.003381] BTRFS warning (device sdb): csum failed root -9 ino 257 off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 2
[ +0.002514] BTRFS warning (device sdb): csum failed root -9 ino 257 off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 4
[ +0.000543] BTRFS warning (device sdb): csum failed root -9 ino 257 off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 1
[ +0.001170] BTRFS warning (device sdb): csum failed root -9 ino 257 off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 2
[ +0.001151] BTRFS warning (device sdb): csum failed root -9 ino 257 off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 4

I noticed that almost all of the files give an I/O error when read, and similar kernel messages are generated, but with positive roots.
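To make the pattern in the kernel messages above easier to see, here is a minimal sketch that groups the "csum failed" warnings by root, inode, and offset and lists which mirrors reported a failure. The regular expression and the function name summarize_csum_failures are made up for illustration and assume the message format is exactly as quoted above; the sample only includes the first three quoted lines.

```python
import re
from collections import defaultdict

# Matches the "csum failed" warnings as quoted in this message.
# Assumption: the format is exactly as in those lines; other kernels may differ.
CSUM_RE = re.compile(
    r"csum failed root (?P<root>-?\d+) ino (?P<ino>\d+) off (?P<off>\d+) "
    r"csum (?P<got>0x[0-9a-f]+) expected csum (?P<want>0x[0-9a-f]+) "
    r"mirror (?P<mirror>\d+)"
)


def summarize_csum_failures(lines):
    """Map (root, ino, off) to the sorted list of distinct mirrors that failed."""
    mirrors = defaultdict(set)
    for line in lines:
        m = CSUM_RE.search(line)
        if m:
            key = (int(m["root"]), int(m["ino"]), int(m["off"]))
            mirrors[key].add(int(m["mirror"]))
    return {key: sorted(found) for key, found in mirrors.items()}


# The first three warnings from the dmesg output quoted above.
SAMPLE = [
    "[ +2.697798] BTRFS warning (device sdb): csum failed root -9 ino 257 "
    "off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 1",
    "[ +0.003381] BTRFS warning (device sdb): csum failed root -9 ino 257 "
    "off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 2",
    "[ +0.002514] BTRFS warning (device sdb): csum failed root -9 ino 257 "
    "off 2083160064 csum 0xd0a0b14c expected csum 0x7f3ec5ab mirror 4",
]

if __name__ == "__main__":
    for (root, ino, off), failed in summarize_csum_failures(SAMPLE).items():
        print(f"root {root} ino {ino} off {off}: failed on mirrors {failed}")
```

Run against the quoted lines, this reports a single location (root -9, ino 257, off 2083160064) with the same computed checksum failing on mirrors 1, 2, and 4. The same function could be fed the full dmesg output to list the affected offsets for the positive-root messages as well.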
I also see "read error corrected" messages, but if I read the files again, the exact same messages are printed again, which seems to suggest that the errors haven't really been corrected. (But maybe this is intended behavior.) I also attempted to use "btrfs restore" to recover the files, but almost all of them produce "ERROR: zstd decompress failed Unknown frame descriptor" and the recovery does not succeed.

Since then, I have been scrubbing the file system. The first scrub produced lots of uncorrectable read errors and several hundred csum errors. I'm assuming the read errors are due to the missing drive. The puzzling thing is that the scrub can "complete" (actually, it is aborted after it completes on all drives but the missing one) and I can delete all of the files with unrecoverable csum errors, but all of the issues above persist. I can then turn around and scrub again, and the scrub will find new csum errors, which seems bizarre to me, since I would have expected them all to be fixed. However, all transid-related errors have disappeared after the first scrub.

I have also tried deleting the file referenced in the device deletion error and restarting the deletion. This seems to be working, but progress has been very slow, and I fear I'll have to delete all of the files that produce I/O errors, which I would like to avoid if possible.

What should I do in this situation, and how can I avoid this in the future?

Thanks,
Jonathan
