Re: Fw: kernel oops when mounting btrfs

On Sat, Mar 23, 2019 at 2:30 AM Thorsten Hirsch <t.hirsch@xxxxxx> wrote:
>
> Hi Qu,
>
> thank you, but unfortunately that didn't work out so well. The tree
> dump was no problem [1], but clearing the space cache resulted in a
> core dump. Now btrfs check --readonly reports some errors. I attached
> the output of these commands.
>
> Thorsten
>
> [1] https://gist.github.com/thorstenhirsch/65d4308ce54729c902cb09c0d4ad2baf
>
> # btrfs check --clear-space-cache v1 /dev/nvme0n1p3
> Opening filesystem to check...
> Checking filesystem on /dev/nvme0n1p3
> UUID: 4284a794-ad75-450d-b023-ebc5e75f31f5
> Failed to find [544448348160, 168, 16384]
> btrfs unable to find ref byte nr 544448364544 parent 0 root 2  owner 0 offset 0
> transaction.c:195: btrfs_commit_transaction: BUG_ON `ret` triggered, value -5
> btrfs(+0x3be68)[0x556936269e68]
> btrfs(btrfs_commit_transaction+0x12a)[0x55693626a2ec]
> btrfs(btrfs_clear_free_space_cache+0x32a)[0x55693625fecf]
> btrfs(+0x4be5b)[0x556936279e5b]
> btrfs(cmd_check+0x5c2)[0x556936284d86]
> btrfs(main+0x1f6)[0x556936241ef6]
> /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xeb)[0x7fb9a7911b6b]
> btrfs(_start+0x2a)[0x556936241f3a]
> Aborted (core dumped)
>
>
> # btrfs check --readonly /dev/nvme0n1p3
> Opening filesystem to check...
> parent transid verify failed on 419860414464 wanted 30188 found 30105
> parent transid verify failed on 419860414464 wanted 30188 found 30105
> Ignoring transid failure
> Checking filesystem on /dev/nvme0n1p3
> UUID: 4284a794-ad75-450d-b023-ebc5e75f31f5
> [1/7] checking root items
> [2/7] checking extents
> ref mismatch on [544448348160 16384] extent item 1, found 0
> backref 544448348160 root 2 not referenced back 0x563ce432f010
> incorrect global backref count on 544448348160 found 1 wanted 0
> backpointer mismatch on [544448348160 16384]
> owner ref check failed [544448348160 16384]
> ref mismatch on [544448364544 16384] extent item 0, found 1
> tree backref 544448364544 parent 2 root 2 not found in extent tree
> backpointer mismatch on [544448364544 16384]
> ERROR: errors found in extent allocation tree or chunk allocation
> [3/7] checking free space cache
> cache and super generation don't match, space cache will be invalidated
> [4/7] checking fs roots
> [5/7] checking only csums items (without verifying data)
> [6/7] checking root refs
> [7/7] checking quota groups skipped (not enabled on this FS)
> ERROR: transid errors in file system
> found 275464060928 bytes used, error(s) found
> total csum bytes: 266882612
> total tree bytes: 1513570304
> total fs tree bytes: 1135788032
> total extent tree bytes: 73220096
> btree space waste bytes: 236694654
> file data blocks allocated: 1962517999616
>  referenced 221128466432

This looks like the same problem I reported earlier this month. I also
filed a bug for it:
https://bugzilla.kernel.org/show_bug.cgi?id=202717

In my case I ran a scrub and a check before clearing space cache v1,
and neither reported any problems. Clearing space cache v1 then
crashed, and a subsequent check reported corruption.

Bug 1, btrfs check crashing while clearing the space cache is a bug,
for sure
Bug 2, btrfs check appears to do a non-COW overwrite of the extent tree,
which might be fine as long as it doesn't crash, but it seems risky
considering how fragile the extent tree is anyway

As near as I can tell, clearing the cache is not fail-safe right now.
It can make an error-free file system corrupt.

If there was already some problem before clearing the space cache, that
means more bugs:

Bug 3, btrfs check doesn't find the problem and reports no errors
Bug 4, btrfs kernel code doesn't find the problem during scrub and
reports no errors
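
One detail in the quoted check output may be worth a look: the two
"ref mismatch" extents appear to sit right next to each other, one with
an extent item but no reference found, the other with a reference found
but no extent item. The sketch below only does that arithmetic on the
two lines copied from the quoted output. The helper names
(parse_ref_mismatches, adjacent_pairs) are made up for illustration and
are not part of btrfs-progs, and the snippet says nothing about the
root cause.

```python
import re

# Two lines copied verbatim from the quoted `btrfs check --readonly` output.
CHECK_OUTPUT = """\
ref mismatch on [544448348160 16384] extent item 1, found 0
ref mismatch on [544448364544 16384] extent item 0, found 1
"""

# Matches: ref mismatch on [<bytenr> <length>] extent item <n>, found <m>
PATTERN = re.compile(
    r"ref mismatch on \[(\d+) (\d+)\] extent item (\d+), found (\d+)"
)


def parse_ref_mismatches(text):
    """Return one (bytenr, length, extent_item_refs, found_refs) tuple per match."""
    return [tuple(int(g) for g in m.groups()) for m in PATTERN.finditer(text)]


def adjacent_pairs(mismatches):
    """Return pairs where the second extent starts exactly where the first ends."""
    ordered = sorted(mismatches)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if a[0] + a[1] == b[0]]


if __name__ == "__main__":
    for first, second in adjacent_pairs(parse_ref_mismatches(CHECK_OUTPUT)):
        print(f"{first[0]} + {first[1]} = {second[0]}")
        print(f"  first:  extent item {first[2]}, found {first[3]}")
        print(f"  second: extent item {second[2]}, found {second[3]}")
```

If that reading is right, the same two byte numbers also show up in the
"Failed to find" and "unable to find ref" lines of the crash, which
would fit the crash and the later check errors being about the same
pair of blocks. That is only an observation from the logs, not a
diagnosis.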


-- 
Chris Murphy


