On 26.03.2019 14:37 Qu Wenruo wrote:
> On 2019/3/26 6:24 PM, berodual_xyz wrote:
>> Mount messages below.
>>
>> Thanks for your input, Qu!
>>
>> ##
>> [42763.884134] BTRFS info (device sdd): disabling free space tree
>> [42763.884138] BTRFS info (device sdd): force clearing of disk cache
>> [42763.884140] BTRFS info (device sdd): has skinny extents
>> [42763.885207] BTRFS error (device sdd): parent transid verify failed on 1048576 wanted 60234 found 60230
> So btrfs is using the latest superblock while the good one should be the
> old superblock.
>
> Btrfs-progs is able to just ignore the transid mismatch, but kernel
> doesn't and shouldn't.
>
> In fact we should allow btrfs rescue super to use super blocks from
> other device to replace the old one.
>
> So my patch won't help at all, the failure happens at the very beginning
> of the devices list initialization.
>
> BTW, if btrfs restore can't recover certain files, I don't believe any
> rescue kernel mount option can do more.
>
> Thanks,
> Qu

I have made btrfs limp along (till a rebuild) in the past by commenting
out/removing the transid checks. Obviously you should still mount it
read-only (and with no log replay), and it might crash, but there is a
small chance this would work.

>
>> [42763.885263] BTRFS error (device sdd): failed to read chunk root
>> [42763.900922] BTRFS error (device sdd): open_ctree failed
>> ##
>>
>> ------- Original Message -------
>> On Tuesday, 26. March 2019 10:21, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>>
>>> On 2019/3/26 4:52 PM, berodual_xyz wrote:
>>>
>>>> Thank you both for your input.
>>>> see below.
>>>>
>>>>>> Your sda and sdb are at gen 60233 while sdd and sde are at gen 60234.
>>>>>> It's possible to allow kernel to manually assemble its device list using
>>>>>> "device=" mount option.
>>>>>> Since you're using RAID6, it's possible to recover using 2 devices only,
>>>>>> but in that case you need "degraded" mount option.
>>>>> He has btrfs raid0 profile on top of hardware RAID6 devices.
>>>> Correct, my FS is a "raid0" across four hardware-raid based raid6 devices.
>>>> The underlying devices of the raid controller are fine, same as the
>>>> volumes themselves.
>>> Then there is not much we can do.
>>>
>>> The super blocks show all your 4 devices are in 2 different states
>>> (older generation with dirty log, newer generation without log).
>>>
>>> This means some writes didn't reach all devices.
>>>
>>>> Only corruption seems to be on the btrfs side.
>>> Please provide the kernel message when trying to mount the fs.
>>>
>>>> Does your tip regarding mounting by explicitly specifying the devices
>>>> still make sense?
>>> Not really. For RAID0 case, it doesn't make much sense.
>>>
>>>> Will this figure out automatically which generation to use?
>>> You could try, as all the mount options make btrfs completely RO (no
>>> log replay), so it should be pretty safe.
>>>
>>>> I am at the moment in the process of using "btrfs restore" to pull more
>>>> data from the filesystem without making any further changes.
>>>> After that I am happy to continue testing, and will happily test your
>>>> mentioned "skip_bg" patch - but if you think that there is some other
>>>> way to mount (just for recovery purpose - read only is fine!) while
>>>> having different gens on the devices, I highly appreciate it.
>>> With mounting failure dmesg, it should be pretty easy to determine
>>> whether my skip_bg will work.
>>>
>>> Thanks,
>>> Qu
>>>
>>>> Thanks Qu and Andrei!
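
For anyone following along, the mismatch described in this thread can be read straight off the numbers already posted. Below is a minimal sketch in Python, not a btrfs tool: it only parses the "parent transid verify failed" line from the mount log and groups the four devices by the superblock generations Qu quoted. The function names `parse_transid_error` and `group_by_generation` are hypothetical helpers made up for illustration, and the generation dict is typed in by hand from the thread rather than read from disk.

```python
import re

# Regex for the kernel message as it appears in the mount log quoted in
# this thread; other kernel versions may word the message differently.
TRANSID_RE = re.compile(
    r"parent transid verify failed on (?P<bytenr>\d+) "
    r"wanted (?P<wanted>\d+) found (?P<found>\d+)"
)


def parse_transid_error(line):
    """Return bytenr/wanted/found as ints, or None if the line does not match."""
    match = TRANSID_RE.search(line)
    if match is None:
        return None
    return {key: int(value) for key, value in match.groupdict().items()}


def group_by_generation(generations):
    """Group device names by superblock generation (hypothetical helper)."""
    groups = {}
    for device, generation in generations.items():
        groups.setdefault(generation, []).append(device)
    return groups


# Log line copied from the mount messages in this thread.
log_line = (
    "[42763.885207] BTRFS error (device sdd): parent transid verify "
    "failed on 1048576 wanted 60234 found 60230"
)
error = parse_transid_error(log_line)

# Generations as stated in the thread, entered by hand (assumption: no
# other devices belong to this filesystem).
generations = {"sda": 60233, "sdb": 60233, "sdd": 60234, "sde": 60234}
groups = group_by_generation(generations)

print(error)
print(groups)
print("generation gap in the failed read:", error["wanted"] - error["found"])
```

The grouping makes the two device states visible at a glance: the superblock picked at mount time expects generation 60234, while the block it tried to read at 1048576 was last written at 60230.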
