On 2020/7/6 6:41 PM, Paul-Erik Törrönen wrote:
> OS: CentOS8 with updated kernel and btrfs-tools
> Kernel: 5.4.17-2011.3.2.1.el8uek.x86_64
> btrfs version: 5.6.1
>
> Some background. The file server had been running for quite some time
> before I reinstalled it with CentOS 8, at which point I noticed (after
> having installed the Oracle UEK kernel which still supports btrfs) the
> errors when mounting the partition. I updated the btrfs-tools to latest
> and ran the 'btrfs check --repair /dev/sdc1', but despite this, still
> have errors when accessing certain files on the FS. Dmesg shows them as
> following:
>
> [1681476.647521] BTRFS error (device sdc1): block=3154170920960 read
> time tree block corruption detected
> [1681476.694520] BTRFS critical (device sdc1): corrupt leaf: root=5
> block=3154170920960 slot=9 ino=13286681 file_offset=204800, extent end
> overflow, have file offset 204800 extent num bytes 18446744073709481984

Some older extents are affected by older kernels not handling extent
lengths correctly.

18446744073709481984 = -69632 when read as a signed 64-bit value, which
means there is some underflow.

Recent upstream kernels catch it and reject the whole tree block to
prevent further problems.

>
> There does not appear to be any HW-related errors in the logs.
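The arithmetic behind that "-69632" can be checked with a small sketch. The helper name `as_signed64` is made up for illustration and is not part of btrfs-progs; the input is the `extent num bytes` value from the dmesg line quoted above.

```python
# Reinterpret an unsigned 64-bit value as a signed (two's complement) one.
# `as_signed64` is a hypothetical helper name used only for this example.
def as_signed64(value: int) -> int:
    if not 0 <= value < 2**64:
        raise ValueError("value does not fit in 64 bits")
    return value - 2**64 if value >= 2**63 else value


# Length the kernel reported for ino 13286681, slot 9.
num_bytes = 18446744073709481984

print(as_signed64(num_bytes))          # -69632
print(as_signed64(num_bytes) // 4096)  # -17, i.e. 17 blocks of 4 KiB
```

A length this close to 2^64 is just a small negative number stored in an unsigned field, which is why the kernel's "extent end overflow" check fires.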
>
> With the help of darkling on Freenode IRC #btrfs collected the following
> information:
>
> btrfs inspect dump-tree -b 3154170920960 /dev/sdc1 ->
>
> btrfs-progs v5.6.1
> leaf 3154170920960 items 97 free space 6102 generation 1326633 owner FS_TREE
> leaf 3154170920960 flags 0x1(WRITTEN) backref revision 1
> fs uuid 519a6725-10d4-4d82-bc4a-32de7dfb923f
> chunk uuid 4d1fe695-d2cb-43bc-bac6-d3101dc0725b
>     item 0 key (13286391 INODE_REF 2479498) itemoff 16245 itemsize 38
>         index 8927 namelen 28 name: file_28-2
>     item 1 key (13286391 EXTENT_DATA 0) itemoff 15384 itemsize 861
>         generation 152266 type 0 (inline)
>         inline extent data size 840 ram_bytes 840 compression 0 (none)
>     item 2 key (13286679 INODE_ITEM 0) itemoff 15224 itemsize 160
>         generation 148004 transid 1326604 size 0 nbytes 0
>         block group 0 mode 40700 links 1 uid 941400003 gid 513 rdev 0
>         sequence 5 flags 0x0(none)
>         atime 1592086804.669979287 (2020-06-14 01:20:04)
>         ctime 1488735439.699720907 (2017-03-05 19:37:19)
>         mtime 1488735439.699720907 (2017-03-05 19:37:19)
>         otime 1488735439.699720907 (2017-03-05 19:37:19)
>     item 3 key (13286679 INODE_REF 900409) itemoff 15200 itemsize 24
>         index 156 namelen 14 name: file_14
>     item 4 key (13286681 INODE_ITEM 0) itemoff 15040 itemsize 160
>         generation 148002 transid 1102464 size 203197 nbytes 204800
>         block group 0 mode 100644 links 1 uid 941400003 gid 513 rdev 0
>         sequence 43 flags 0x10(PREALLOC)
>         atime 1550596828.990479215 (2019-02-19 19:20:28)
>         ctime 1488735117.417279715 (2017-03-05 19:31:57)
>         mtime 1488735117.417279715 (2017-03-05 19:31:57)
>         otime 2562566064006200577.1311248627 (-399890746-03-05 07:02:57)
>     item 5 key (13286681 INODE_REF 2479479) itemoff 15005 itemsize 35
>         index 10704 namelen 25 name: file_25-1
>     item 6 key (13286681 EXTENT_DATA 0) itemoff 14952 itemsize 53
>         generation 148002 type 1 (regular)
>         extent data disk byte 2584805376 nr 204800
>         extent data offset 0 nr 90112 ram 204800
>         extent compression 0 (none)
>     item 7 key (13286681 EXTENT_DATA 90112) itemoff 14899 itemsize 53
>         generation 148002 type 2 (prealloc)
>         prealloc data disk byte 2584805376 nr 204800
>         prealloc data offset 90112 nr 45056
>     item 8 key (13286681 EXTENT_DATA 135168) itemoff 14846 itemsize 53
>         generation 148002 type 2 (prealloc)
>         prealloc data disk byte 2584805376 nr 204800
>         prealloc data offset 135168 nr 69632
>     item 9 key (13286681 EXTENT_DATA 204800) itemoff 14793 itemsize 53
>         generation 148002 type 1 (regular)
>         extent data disk byte 0 nr 0
>         extent data offset 0 nr 18446744073709481984 ram 18446744073709481984
>         extent compression 0 (none)

The offending one is here; the extent length obviously underflows for
this hole extent.

...
>
> The dmesg log lists the following unique blocks:
> 2627479830528   slot=10   extent data offset 0 nr 18446744073709457408 ram 18446744073709457408
> 2627928588288   slot=10   No extent data line - (seems to be a file with stat data)

Would you please provide the dump for this bytenr?
I'm a little interested in this.

> 28710276399104  slot=79   extent data offset 0 nr 18446744073709420544 ram 18446744073709420544
> 28710639370240  slot=79   extent data offset 0 nr 18446744073709420544 ram 18446744073709420544
> 30933479342080  slot=27   extent data offset 0 nr 18446744073709395968 ram 18446744073709395968
> 3154170920960   slot=9    extent data offset 0 nr 18446744073709481984 ram 18446744073709481984
> 3154170970112   slot=59   extent data offset 0 nr 18446744073709527040 ram 18446744073709527040
> 3154171035648   slot=27   extent data offset 0 nr 18446744073709514752 ram 18446744073709514752
> 3154217795584   slot=102  102 item does not exist

This

> 3154257952768   slot=59   59 -"-

And this

> 3154259034112   slot=27   No extent data line

And this

> 3154291228672   slot=9    -"-

And this too.

>
> The curious part are the two which list non-existing slots.
> These errors have been present in the dmesg-log ever since booting the
> server and while I mounted, as per instructions from the btrfs-Wiki, as
> readonly, so I don't think it is a case of changed files.
>
> All the data on the partition is copied elsewhere, so there is no issue
> of losing files and recreating the FS on the partition is the most
> probable outcome of this. Thought nonetheless that investigating this
> may be something of interest to the btrfs-developers as I can keep the
> current FS around for couple of days.

Thanks for your detailed report; this will help us enhance btrfs-progs
to fix such extents.

For now, you can mount the fs with an older kernel, find the offending
files using the ino numbers from dmesg, and delete them.

With all offending inodes deleted, the fs should come back to normal
status.

Thanks,
Qu

>
> Poltsi
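As a rough illustration of the workaround described above, the sketch below pulls the `ino=` numbers out of "corrupt leaf" dmesg lines and then walks a mounted tree looking for entries with those inode numbers. The function names and the regular expression are assumptions made for this example, based only on the message format quoted in this thread; they are not part of btrfs-progs. It also assumes the fs is mounted with a kernel that can still read the affected leaves, and that the files live in the subvolume being walked, since btrfs inode numbers are only unique within one subvolume (root=5 is the top-level one).

```python
import os
import re

# Matches messages such as:
#   corrupt leaf: root=5 block=3154170920960 slot=9 ino=13286681 ...
# The pattern is an assumption derived from the dmesg lines in this thread.
CORRUPT_LEAF = re.compile(
    r"corrupt leaf: root=(\d+)\s+block=(\d+)\s+slot=(\d+)\s+ino=(\d+)"
)


def offending_inodes(dmesg_text):
    """Return the set of (root, ino) pairs named in corrupt-leaf messages."""
    return {
        (int(m.group(1)), int(m.group(4)))
        for m in CORRUPT_LEAF.finditer(dmesg_text)
    }


def paths_for_inodes(mount_point, inos):
    """Walk mount_point and yield (ino, path) for entries whose inode is in inos."""
    wanted = set(inos)
    for dirpath, dirnames, filenames in os.walk(mount_point):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                ino = os.lstat(path).st_ino
            except OSError:
                continue  # entry could not be stat'ed, skip it
            if ino in wanted:
                yield ino, path


if __name__ == "__main__":
    sample = (
        "BTRFS critical (device sdc1): corrupt leaf: root=5 "
        "block=3154170920960 slot=9 ino=13286681 file_offset=204800"
    )
    print(offending_inodes(sample))  # {(5, 13286681)}
```

On a real system the text could come from the output of `dmesg`, and the resulting paths would be the candidates to delete once their contents are confirmed to be backed up. For a single inode, `find <mountpoint> -inum <ino>` does the same lookup.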
