Re: Growing number of "invalid tree nritems" errors

On 2020/7/5 6:30 PM, Thilo-Alexander Ginkel wrote:
> On Sun, Jul 5, 2020 at 11:53 AM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>> How reproducible is this?
> 
> I did some log analysis: The problem started showing up on two of
> three servers starting July 3rd, 2020. This coincides with an applied
> Ubuntu Linux kernel update to 4.15.0-109-generic whose changelog shows
> plenty of btrfs changes:
> https://launchpad.net/ubuntu/+source/linux/4.15.0-109.110

So it backported all the stricter self-checks from recent kernels.

That's great for exposing any unexpected metadata, although a backport
itself can sometimes introduce new bugs (very rare), especially in
heavily backported kernels.

So if possible, trying an upstream kernel is another way to test
whether something is really wrong.

Another factor is the btrfs-progs version, which normally gets fewer
backports, while upstream normally has stricter checks overall.
So trying upstream btrfs-check would also be a good idea if possible.

> 
> Server #2 (still online) shows 16 error messages in its log since
> 2020-07-03 whereas server #3 shows 310 error messages.

Then it shouldn't be a hardware problem unless all servers have the same
problem.

In such cases, I would recommend trying upstream kernels first,
especially when heavily backported kernels are involved.

If you can reproduce it with an upstream kernel, then I strongly
recommend applying the debug diff mentioned earlier to provide more
info for debugging, since in that case it would be a false alert.

> 
> One thing special about server #3 is that its btrfs file system has a
> huge metadata section (probably due to it hosting many [~ 50 million]
> small files), which doesn't seem too healthy:
> 
> # btrfs filesystem usage /mnt
> Overall:
>     Device size:                 476.30GiB
>     Device allocated:            372.02GiB
>     Device unallocated:          104.28GiB
>     Device missing:                  0.00B
>     Used:                        272.16GiB
>     Free (estimated):            194.49GiB      (min: 194.49GiB)
>     Data ratio:                       1.00
>     Metadata ratio:                   1.00
>     Global reserve:              512.00MiB      (used: 0.00B)
> 
> Data,single: Size:284.01GiB, Used:193.80GiB
>    /dev/mapper/luks      284.01GiB
> 
> Metadata,single: Size:88.01GiB, Used:78.36GiB
>    /dev/mapper/luks       88.01GiB

In fact, your metadata is not that unhealthy.

> 
> System,single: Size:4.00MiB, Used:80.00KiB
>    /dev/mapper/luks        4.00MiB
> 
> Unallocated:
>    /dev/mapper/luks      104.28GiB

And there is plenty of unallocated space, so your fs actually looks
pretty healthy.
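
To put rough numbers on that, here is a minimal Python sketch that only
redoes the arithmetic on the figures quoted above. The variable names
and the percent() helper are made up for illustration and are not part
of any btrfs tool.

```python
def percent(part_gib, whole_gib):
    """Return part/whole as a percentage, rounded to one decimal place."""
    return round(100.0 * part_gib / whole_gib, 1)

# Figures copied from the quoted `btrfs filesystem usage /mnt` output (GiB).
device_size = 476.30
unallocated = 104.28
data_size, data_used = 284.01, 193.80
meta_size, meta_used = 88.01, 78.36

# How full the already-allocated chunks are.
meta_fill = percent(meta_used, meta_size)
data_fill = percent(data_used, data_size)

# Share of the device not yet assigned to any chunk type.
unalloc_share = percent(unallocated, device_size)

print(f"metadata chunks used: {meta_fill}%")
print(f"data chunks used:     {data_fill}%")
print(f"device unallocated:   {unalloc_share}%")
```

The metadata chunks are fairly full, but new metadata chunks can still
be allocated from the unallocated space, which is why that last figure
is the one that matters here.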

Thanks,
Qu
> 
>> If it still shows the same symptom after verifying the RAM, would you
>> please apply this small debug diff on your kernel?
> 
> I'll see what I can do.
> 
> Thanks,
> Thilo
> 
