On 2019/3/24 6:49 PM, Thorsten Hirsch wrote:
> Hi Qu,
>
> thank you once again for your advice. I could indeed recover all my
> data, even the snapshots docker had created. Everything's working as
> if nothing had ever happened. Here's what I did in the end:
>
> btrfs restore <src> <dest> worked flawlessly, but only recovered some data.
> mount -o ro,notreelog,nologreplay was the only way to mount the broken
> partition and it showed me a lot more data than btrfs restore could
> recover. However, when trying to access these additional files I got
> input/output errors.

This means some part of the csum tree got corrupted.

I have seen several reports about csum and extent tree corruption, so
it's quite possible.

>
> btrfs restore -sxmS <src> <dest> was the magic command that recovered
> all my data (which I could "cp -a" back to my device after creating a
> new btrfs file system). After reading the help of btrfs-restore it's
> obvious that the arguments are required, but in the btrfs wiki it says
> "If you're really lucky, this might be enough"[1] describing the
> command w/o arguments. I think this is misleading. The arguments are
> always necessary if you want to recover all your data. Well, at least
> I think the wiki page would make more sense if the arguments were
> included.

After looking into the man page, I strongly believe that restoring file
owner/mode/symlinks should be the default behavior.

At least we should enhance either the man page or btrfs-restore.

Thanks,
Qu

>
> If there's anything I can provide to help you improve btrfs or its
> recovery tools please don't hesitate to ask. Although I don't have an
> image of the broken partition, at least I still have the core dump of
> "btrfs check --clear-space-cache v1".
>
> [1] https://btrfs.wiki.kernel.org/index.php/Restore
>
> Thorsten Hirsch
>
> P.S.: btrfs check --repair was of no use. It crashed almost
> immediately. I tried it only after recovering all my data, to see if
> it would've helped as well.
>
> On Sat, 23 Mar 2019 at 14:57, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>>
>> On 2019/3/23 6:48 PM, Thorsten Hirsch wrote:
>>> Hi Qu,
>>>
>>> sorry for this direct reply. I've been trying to answer to the mailing
>>> list since yesterday, but my mails seem to get dropped. So please see
>>> my answer to your mail enclosed.
>>>
>>> Thorsten
>>>
>>>
>>> ---------- Forwarded message ---------
>>> From: Thorsten Hirsch <t.hirsch@xxxxxx>
>>> Date: Sat, 23 Mar 2019 at 09:29
>>> Subject: Re: Fw: kernel oops when mounting btrfs
>>> To: <linux-btrfs@xxxxxxxxxxxxxxx>
>>>
>>>
>>> Hi Qu,
>>>
>>> thank you, but unfortunately that didn't work out so well. The tree
>>> dump was no problem [1], but clearing the space cache resulted in a
>>> core dump. Now btrfs check --readonly reports some errors. I attached
>>> the output of these commands.
>>>
>>> Thorsten
>>>
>>> [1] https://gist.github.com/thorstenhirsch/65d4308ce54729c902cb09c0d4ad2baf
>>
>> This explains why a lot of things don't go right.
>>
>> The inode item of your free space cache tree is wrong.
>> According to my experiments with the latest kernel, it looks like some
>> older kernel is the culprit.
>>
>> Your free space cache inode lacks the correct mode.
>> Normally the mode should be 0100600. But your fs only has 0, and the
>> kernel panics for that reason.
>>
>>>
>>> # btrfs check --clear-space-cache v1 /dev/nvme0n1p3
>>> Opening filesystem to check...
>>> Checking filesystem on /dev/nvme0n1p3
>>> UUID: 4284a794-ad75-450d-b023-ebc5e75f31f5
>>> Failed to find [544448348160, 168, 16384]
>>
>> Then this means something bad happened in the extent tree.
>>
>>> btrfs unable to find ref byte nr 544448364544 parent 0 root 2 owner 0 offset 0
>>> transaction.c:195: btrfs_commit_transaction: BUG_ON `ret` triggered, value -5
>>> btrfs(+0x3be68)[0x556936269e68]
>>> btrfs(btrfs_commit_transaction+0x12a)[0x55693626a2ec]
>>> btrfs(btrfs_clear_free_space_cache+0x32a)[0x55693625fecf]
>>> btrfs(+0x4be5b)[0x556936279e5b]
>>> btrfs(cmd_check+0x5c2)[0x556936284d86]
>>> btrfs(main+0x1f6)[0x556936241ef6]
>>> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb)[0x7fb9a7911b6b]
>>> btrfs(_start+0x2a)[0x556936241f3a]
>>> Aborted (core dumped)
>>>
>>>
>>> # btrfs check --readonly /dev/nvme0n1p3
>>> Opening filesystem to check...
>>> parent transid verify failed on 419860414464 wanted 30188 found 30105
>>> parent transid verify failed on 419860414464 wanted 30188 found 30105
>>
>> So the extent tree got corrupted in that repair attempt, which looks
>> pretty strange, as an aborted transaction shouldn't have any impact on
>> the existing fs.
>>
>> I'm afraid you can only try btrfs check --repair.
>>
>> If that gives no good result, then I'm afraid you have to salvage the
>> data. I believe over 99% of your data should be safe.
>>
>> To salvage the data, either use btrfs-restore, or use my experimental
>> 'skip_bg' kernel patches:
>> https://github.com/adam900710/linux/tree/rescue_options
>>
>> The 'skip_bg' kernel patches introduce a new mount option,
>> 'ro,rescue=skip_bg', which can skip the whole (corrupted) extent tree.
>> Since all your trees except the extent tree are consistent, you keep
>> all the read-only btrfs features, like subvolume list and csum check.
>>
>> Thanks,
>> Qu
>>
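
For readers who want the recovery sequence from this thread in one place, here is a minimal shell sketch. It only prints the commands and does not run them. The function name print_recovery_plan and its three arguments are hypothetical names introduced here. The mount options and the -sxmS flags are the ones quoted in the thread. The umount step is an assumption, on the understanding that btrfs restore expects the source device not to be mounted.

```shell
#!/bin/sh
# Print (do not run) the recovery steps reported in the thread.
# print_recovery_plan and its three arguments are hypothetical names
# introduced for this sketch; nothing here touches a device.
print_recovery_plan() {
    src=$1    # damaged btrfs device, e.g. /dev/nvme0n1p3 in the thread
    dest=$2   # directory on a different, healthy filesystem
    mnt=$3    # empty directory used as a temporary mount point

    # 1. Inspect the damaged filesystem read-only, without log replay.
    echo "mount -o ro,notreelog,nologreplay $src $mnt"

    # 2. Unmount again (assumption: btrfs restore wants an unmounted source).
    echo "umount $mnt"

    # 3. Copy files out, including:
    #    -s snapshots, -x extended attributes,
    #    -m owner/mode/times, -S symbolic links.
    echo "btrfs restore -sxmS $src $dest"
}

# Example: review the printed commands before running any of them by hand.
print_recovery_plan /dev/sdX1 /mnt/rescue /mnt/broken
```

Printing instead of executing is deliberate: every step acts on a damaged filesystem, so each command should be checked against the actual device and paths before it is run. Creating a new file system and copying the data back with "cp -a", as described in the thread, is left out of the sketch because it is destructive.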
