Re: corrupt leaf: root=1 block=57567265079296 slot=83, bad key order

> On Thu, Feb 14, 2019 at 08:25:26PM +0800, Qu Wenruo wrote:
>> On 2019/2/14 7:58 PM, Jesper Utoft wrote:
>>> Hello Fellow BTRFS users.
>>>
>>> I have run into the bad key order issue.
>>> corrupt leaf: root=1 block=57567265079296 slot=83, bad key order, prev
>>> (18446744073709551605 0 57707594776576) current (18446726481523507189
>>> 0 57709742260224)
>>> The lines repeats over and over..
>>>
>>> I read a thread between Hugo Mills and Eric Wolf about a similar issue
>>> and i have gathered the same info.
>> Now we have all the needed info.
>>
>>>
>>> I understand that it probably is hardware related, i have been running
>>> memtest for 60h+ to see if i could reproduce it.
>>> I also tried to run btrfs check --recover but it did not help.
>>>
>>> My questions is if it can be fixed?
>>
>> Yes, but only manual patching is possible yet.
>
>    David: What needs to be done to get the bitflip-in-key patches
> added to btrfs check? They've been lurking in some patch stack for
> literally years, and would have dealt with this one easily.

[snip]

>
> [snip]
>> Thankfully, all keys around give us a pretty good idea what the original
>> value should be: (FREE_SPACE UNTYPED 57709742260224).
>>
>> And for the raw value:
>> bad:  0xffffeffffffffff5
>> good: 0xfffffffffffffff5
>>             ^
>> e->f, one bit get flipped.
>> (UNTYPED is the same value for UNKNOWN.0, so don't worry about that).
>>
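
The arithmetic above can be checked with a short Python sketch. It
assumes btrfs orders keys as (objectid, type, offset) tuples, and the
helper name single_bit_fixes is made up for illustration; it is not part
of btrfs-progs or the dirty_fix branch.

```python
# Keys as (objectid, type, offset), copied from the kernel message above.
PREV = (18446744073709551605, 0, 57707594776576)
BAD = (18446726481523507189, 0, 57709742260224)
GOOD_OBJECTID = 0xfffffffffffffff5  # the "good" raw value quoted above


def single_bit_fixes(prev, bad, width=64):
    """Hypothetical helper: list (bit, objectid) pairs where flipping one
    bit of bad's objectid makes the key sort after prev again."""
    fixes = []
    for bit in range(width):
        candidate = (bad[0] ^ (1 << bit),) + bad[1:]
        if candidate > prev:  # Python compares tuples field by field
            fixes.append((bit, candidate[0]))
    return fixes


assert BAD < PREV  # this is the "bad key order" condition

diff = BAD[0] ^ GOOD_OBJECTID
print(hex(BAD[0]))           # 0xffffeffffffffff5
print(hex(diff))             # 0x100000000000, i.e. 1 << 44
print(bin(diff).count("1"))  # 1, so exactly one bit differs
print(single_bit_fixes(PREV, BAD))  # [(44, 18446744073709551605)]
```

Only bit 44 qualifies, which matches the e->f nibble marked above. The
check uses the previous key alone; the real leaf also has a following
key that would have to stay in order.
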
>> I have created a special branch for you:
>> https://github.com/adam900710/btrfs-progs/tree/dirty_fix
>>
>> Just compile that btrfs-progs, no need to install, then execute the
>> following command inside btrfs-progs directory:
>>
>> # ./btrfs-corrupt-block -X <device>
>
>    BUT, don't do it until you've found and replaced the bad RAM that
> broke it in the first place.

I got the code to build and ran it as described above. I do not know
whether it worked and there were many other errors, or whether it failed
and just moved the issue elsewhere.
In any case I had a few subvolumes with missing files, so I have been
using send/receive for the subvolumes where that works and cp for the
ones where it does not.
Now it's running on a new btrfs volume on a new disk, and I will
probably use snapraid or a transfer of subvolumes between btrfs
filesystems for "backup" instead of RAID 1, which I expect would be
a saner approach anyway.

>
>> And your report just remind me to update the write time tree block
>> checker....
>
>    Looking forward to dealing with a whole new type of "btrfs is
> broken!" complaints on IRC (followed by "can't I just let it carry on
> regardless?"). ;)

Thank you all for the very quick assistance, especially for the "dirty
fix", even though the volume was too damaged to fix.

I will save money for a new hardware setup with ECC RAM; for now I
will have to hope for the best and keep taking regular backups.

Thanks
Jesper Utoft
PS: I'm not a member of the mailing list, so if you reply, please
reply to me directly as well.
