Re: 12 TB btrfs file system on virtual machine broke again

On 2020/1/5 10:17 PM, Christian Wimmer wrote:
> Hi Qu,
> 
> 
>> On 5. Jan 2020, at 01:25, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>>
>>
>>
>> On 2020/1/5 1:07 AM, Christian Wimmer wrote:
>>> Hi guys, 
>>>
>>> I have run into a problem with my btrfs file system again.
>>> I am starting to wonder whether this filesystem type is right for my needs.
>>> Could you please help me recover my 12TB partition?
>>>
>>> What happened? 
>>> -> This time I was just rebooting my virtual machine normally. Over the past few days I had noticed that the system hangs for a few seconds, so I thought it would be a good idea to reboot my SUSE Linux after 14 days of uptime. The machine powered off normally, but on startup it ran into messages like the pasted ones.
>>>
>>> I immediately powered off again and started my Arch Linux, where I have btrfs-progs version 5.4 installed.
>>> I tried one of the commands that you gave me in the past (restore) and got the following messages:
>>>
>>>
>>> btrfs-progs-5.4]# ./btrfs restore -l /dev/sdb1
>>> checksum verify failed on 3181912915968 found 000000A9 wanted 00000064
>>> checksum verify failed on 3181912915968 found 00000071 wanted 00000066
>>> checksum verify failed on 3181912915968 found 000000A9 wanted 00000064
>>> bad tree block 3181912915968, bytenr mismatch, want=3181912915968, have=4908658797358025935
>>
>> All these tree blocks are garbage. This doesn't look good at all.
>>
>> The weird pattern in the found csum values makes no sense at all.
>>
>> Are you using fstrim or the discard mount option? If so, an old bug
>> could be causing the problem.
> 
> 
> It seems that I am using fstrim (I did not know this; what is it?):
> 
> BTW, sda2 here is my root partition, which has practically the same configuration as the 12TB hard disk (just smaller).
> 
> 2020-01-03T11:30:47.479028-03:00 linux-ze6w kernel: [1297857.324177] sda2: rw=2051, want=532656128, limit=419430400
> 2020-01-03T11:30:47.479538-03:00 linux-ze6w kernel: [1297857.324658] BTRFS warning (device sda2): failed to trim 1 device(s), last error -5
> 2020-01-03T11:30:48.376543-03:00 linux-ze6w fstrim[27910]: fstrim: /opt: FITRIM ioctl failed: Input/output error

That's the cause. Older kernels had a bug where btrfs could trim
unrelated data, causing data loss.

And I'm afraid that bug trimmed some of your tree blocks, screwing up
the whole fs.
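
For a sense of scale, the want= and limit= values in the quoted log can
be compared directly. Below is a minimal Python sketch, assuming both
values are counted in 512-byte sectors (the thread itself does not say
so). The helper name beyond_end is made up for illustration.

```python
import re

# Assumption: want= and limit= are counted in 512-byte sectors.
SECTOR_BYTES = 512

# Matches the "sda2: rw=2051, want=532656128, limit=419430400" part of a log line.
PATTERN = re.compile(
    r"(?P<dev>\w+): rw=(?P<rw>\d+), want=(?P<want>\d+), limit=(?P<limit>\d+)"
)


def beyond_end(line):
    """Return (device, overshoot_bytes, device_size_bytes), or None if no match."""
    m = PATTERN.search(line)
    if m is None:
        return None
    want = int(m.group("want"))
    limit = int(m.group("limit"))
    return m.group("dev"), (want - limit) * SECTOR_BYTES, limit * SECTOR_BYTES


# First "want/limit" line from the quoted log.
line = (
    "2020-01-03T11:30:47.479028-03:00 linux-ze6w kernel: [1297857.324177] "
    "sda2: rw=2051, want=532656128, limit=419430400"
)
dev, over, size = beyond_end(line)
print(f"{dev}: size {size / 2**30:.1f} GiB, request ends {over / 2**30:.1f} GiB past the end")
```

Under that assumption, limit=419430400 corresponds to a 200 GiB device,
and the request in that line ends roughly 54 GiB past its end.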


> 2020-01-03T11:30:48.378998-03:00 linux-ze6w kernel: [1297858.223675] attempt to access beyond end of device
> 2020-01-03T11:30:48.379012-03:00 linux-ze6w kernel: [1297858.223677] sda2: rw=3, want=421570540, limit=419430400
> 2020-01-03T11:30:48.379013-03:00 linux-ze6w kernel: [1297858.223678] attempt to access beyond end of device
> 2020-01-03T11:30:48.379013-03:00 linux-ze6w kernel: [1297858.223678] sda2: rw=3, want=429959147, limit=419430400
> 2020-01-03T11:30:48.379014-03:00 linux-ze6w kernel: [1297858.223679] attempt to access beyond end of device
> 2020-01-03T11:30:48.379014-03:00 linux-ze6w kernel: [1297858.223679] sda2: rw=3, want=438347754, limit=419430400
> 2020-01-03T11:30:48.379014-03:00 linux-ze6w kernel: [1297858.223680] attempt to access beyond end of device
> 
> Could this be the problem?
> 
> 
> Suse Kernel version is 4.12.14-lp151.28.13-default #1 SMP

I can't find any source tag matching your version, so I can't be 100%
sure about the bug, but that error message still shows the same symptom.

I recommend checking for updates from your distro.
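
To check whether a btrfs filesystem is mounted with a discard option,
the mount table can be scanned. Below is a minimal Python sketch. The
function name and the sample mount lines are hypothetical; on a real
system the text would come from /proc/mounts.

```python
def btrfs_discard_mounts(mounts_text):
    """Return (device, mountpoint, options) for btrfs mounts using a discard option."""
    hits = []
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts layout: device, mountpoint, fstype, options, dump, pass
        if len(fields) < 4 or fields[2] != "btrfs":
            continue
        options = fields[3].split(",")
        # Count plain "discard" as well as variants written as "discard=...".
        if any(opt == "discard" or opt.startswith("discard=") for opt in options):
            hits.append((fields[0], fields[1], fields[3]))
    return hits


# Hypothetical sample in /proc/mounts format, for illustration only.
SAMPLE = """\
/dev/sda2 / btrfs rw,relatime,space_cache,subvolid=257,subvol=/root 0 0
/dev/sdb1 /data btrfs rw,relatime,discard,space_cache 0 0
/dev/sda1 /boot ext4 rw,relatime,discard 0 0
"""

for dev, mountpoint, opts in btrfs_discard_mounts(SAMPLE):
    print(dev, mountpoint, opts)
```

This only covers the mount option. Periodic trimming with fstrim is
usually scheduled separately (on systemd-based distros typically through
fstrim.timer), so that needs to be checked as well.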

Thanks,
Qu
