On Sun, Jul 5, 2020 at 2:10 PM Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
> > I did some log analysis: The problem started showing up on two of
> > three servers starting July 3rd, 2020. This coincides with an applied
> > Ubuntu Linux kernel update to 4.15.0-109-generic whose changelog shows
> > plenty of btrfs changes:
> > https://launchpad.net/ubuntu/+source/linux/4.15.0-109.110
>
> So it backported all these restrict self check of recent kernels.
>
> That's great to expose any unexpected metadata.
> Although sometimes backport itself may introduce new bugs (very rare),
> especially for heavy backported kernels.
>
> So if it's possible, try upstream kernel can also be an alternative to
> test if it's really something wrong.

I took Patrik Lundquist's advice and upgraded to the latest HWE kernel,
which is based on 5.3.0. I'll follow your suggestions if the problem
manifests again.

> > Server #2 (still online) shows 16 error messages in its log since
> > 2020-07-03 whereas server #3 shows 310 error messages.
>
> Then it shouldn't be a hardware problem unless all servers have the same
> problem.

ACK. Memory test also came back negative.

> > On thing special about server #3 is that its btrfs file system has a
> > huge metadata section (probably due to it hosting many [~ 50 Mio]
> > small files), which doesn't seem too healthy:
> >
> > # btrfs filesystem usage /mnt
> > Overall:
> >     Device size:         476.30GiB
> >     Device allocated:    372.02GiB
> >     Device unallocated:  104.28GiB
> >     Device missing:      0.00B
> >     Used:                272.16GiB
> >     Free (estimated):    194.49GiB (min: 194.49GiB)
> >     Data ratio:          1.00
> >     Metadata ratio:      1.00
> >     Global reserve:      512.00MiB (used: 0.00B)
> >
> > Data,single: Size:284.01GiB, Used:193.80GiB
> >    /dev/mapper/luks 284.01GiB
> >
> > Metadata,single: Size:88.01GiB, Used:78.36GiB
> >    /dev/mapper/luks 88.01GiB
>
> In fact, your metadata is not that unhealthy.

All right, thanks for pointing this out. The other servers have ~ 1.5 GB
allocated for metadata, so this seemed way off (but can probably be
explained by the vastly different file system usage on #3).

Thanks,
Thilo
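
For reference, a rough back-of-the-envelope check of the figures quoted above. This is only a sketch: the file count is the approximate "~ 50 Mio" from the thread, not a measured value, and the variable and function names are made up for illustration. Under those assumptions, metadata works out to somewhere around 1.6 KiB per file, which seems plausible for a file system dominated by tiny files (btrfs can store very small file contents inline in metadata).

```python
# Figures copied from the `btrfs filesystem usage /mnt` output quoted above.
GIB = 1024 ** 3

metadata_size_gib = 88.01   # Metadata,single: Size
metadata_used_gib = 78.36   # Metadata,single: Used
device_size_gib = 476.30    # Overall: Device size

# Assumption: "~ 50 Mio" small files, taken from the thread as an estimate.
file_count = 50_000_000


def metadata_bytes_per_file(used_gib: float, files: int) -> float:
    """Average number of metadata bytes consumed per file."""
    return used_gib * GIB / files


def fraction(part: float, whole: float) -> float:
    """Plain ratio, kept as a helper so the intent reads clearly below."""
    return part / whole


per_file = metadata_bytes_per_file(metadata_used_gib, file_count)
utilisation = fraction(metadata_used_gib, metadata_size_gib)
device_share = fraction(metadata_size_gib, device_size_gib)

print(f"metadata per file:           {per_file:.0f} bytes")
print(f"metadata chunk utilisation:  {utilisation:.1%}")
print(f"metadata share of device:    {device_share:.1%}")
```

The same arithmetic with the other servers' ~ 1.5 GB of metadata would need their file counts to be comparable, which the thread does not give.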
