On 9/22/19 10:25 PM, Chris Murphy wrote:
>> OK, I'm building 5.2.17 now. Keen to avoid the corruption errors I was
>> hit by a few weeks back... May take a time as I'm in the middle of a
>> slow backup.
>
> Did you see corruption on this same file system? The 5.2 corruption
> bug has been fixed in 5.2.15.
> https://www.spinics.net/lists/stable-commits/msg129532.html
>
> I'm not aware of a way to fix it though.

I had what I believe is this issue in late August/early September on my
HDD-based filesystem. I was not aware that my NVMe drive (/var/lib/lxc
etc.) or my SSD, /, were affected. There were no obvious clues, unlike
the HDD filesystem, which threw checksum errors and went read-only after
a few minutes.

>> I note that a filtered balance, though not hitting enospc and not
>> reporting any errors did seem to relocate a chunk/extent (sorry, I
>> forget the terminology) but running it a second, third and so on time
>> got the same result. As if the balance reported doing some work, but
>> did not actually do it. I also had to reboot at one point as it seemed
>> to get stuck in a loop but alas I can't repeat this. With the extra
>> logical volume added there is certainly no lack of space relative to the
>> size of the filesystem.
>
> For sure you're running into some kind of bug.

OK, I broadly did as you suggested, with the filesystem in question
mounted with enospc_debug and a 5.2.17 kernel. 'script' seemed to put a
few odd characters in, but you get my gist. Not sure what the first line
is?! Some garbage from somewhere.

Script started on Mon 23 Sep 2019 07:38:55 PM BST
root@phoenix:~# rebootmount | grep lxcint-http-copy/umount_rootfs btrfs bal staRT rt -dusage=1 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=2 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=3 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=4 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=5 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=10 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=20 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=30 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=40 /var/lib/lxc
Done, had to relocate 0 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=50 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=60 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=70 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=70 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=1 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=2 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=3 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=4 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=5 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=10 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=15 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=20 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=25 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=30 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=35 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=40 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=45 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=50 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=55 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=60 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=65 /var/lib/lxc
Done, had to relocate 2 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=70 /var/lib/lxc
Done, had to relocate 2 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=70 /var/lib/lxc
Done, had to relocate 2 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=70 /var/lib/lxc
Done, had to relocate 2 out of 89 chunks
root@phoenix:~# btrfs bal start -musage=70 /var/lib/lxc
Done, had to relocate 2 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=70 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# btrfs bal start -dusage=70 /var/lib/lxc
Done, had to relocate 1 out of 89 chunks
root@phoenix:~# exit
exit
Script done on Mon 23 Sep 2019 07:45:40 PM BST

dmesg showed this sort of thing:

root@phoenix:~# dmesg | tail -n 40
[ 724.387061] BTRFS info (device dm-4): trans_block_rsv: size 0 reserved 0
[ 724.387062] BTRFS info (device dm-4): chunk_block_rsv: size 0 reserved 0
[ 724.387063] BTRFS info (device dm-4): delayed_block_rsv: size 0 reserved 0
[ 724.387064] BTRFS info (device dm-4): delayed_refs_rsv: size 4194304 reserved 1523712
[ 724.387093] BTRFS info (device dm-4): relocating block group 1942181904384 flags system
[ 724.412338] BTRFS info (device dm-4): found 1 extents
[ 724.438117] BTRFS info (device dm-4): balance: ended with status: 0
[ 725.144408] BTRFS info (device dm-4): balance: start -musage=70 -susage=70
[ 725.144453] BTRFS info (device dm-4): unable to make block group 1944362942464 ro
[ 725.144455] BTRFS info (device dm-4): sinfo_used=16384 bg_num_bytes=33538048 min_allocable=1048576
[ 725.144456] BTRFS info (device dm-4): space_info 2 has 33538048 free, is not full
[ 725.144458] BTRFS info (device dm-4): space_info total=33554432, used=16384, pinned=0, reserved=0, may_use=0, readonly=0
[ 725.144459] BTRFS info (device dm-4): global_block_rsv: size 536870912 reserved 536870912
[ 725.144460] BTRFS info (device dm-4): trans_block_rsv: size 0 reserved 0
[ 725.144461] BTRFS info (device dm-4): chunk_block_rsv: size 0 reserved 0
[ 725.144462] BTRFS info (device dm-4): delayed_block_rsv: size 0 reserved 0
[ 725.144463] BTRFS info (device dm-4): delayed_refs_rsv: size 0 reserved 0
[ 725.144547] BTRFS info (device dm-4): relocating block group 1944362942464 flags system
[ 725.172398] BTRFS info (device dm-4): unable to make block group 1943289200640 ro
[ 725.172400] BTRFS info (device dm-4): sinfo_used=11051237376 bg_num_bytes=1073741824 min_allocable=1048576
[ 725.172402] BTRFS info (device dm-4): space_info 4 has 759922688 free, is not full
[ 725.172404] BTRFS info (device dm-4): space_info total=11811160064, used=10512777216, pinned=0, reserved=114688, may_use=538345472, readonly=0
[ 725.172405] BTRFS info (device dm-4): global_block_rsv: size 536870912 reserved 536821760
[ 725.172406] BTRFS info (device dm-4): trans_block_rsv: size 0 reserved 0
[ 725.172407] BTRFS info (device dm-4): chunk_block_rsv: size 0 reserved 0
[ 725.172408] BTRFS info (device dm-4): delayed_block_rsv: size 0 reserved 0
[ 725.172409] BTRFS info (device dm-4): delayed_refs_rsv: size 4194304 reserved 1523712
[ 725.172441] BTRFS info (device dm-4): relocating block group 1943289200640 flags metadata
[ 725.198714] BTRFS info (device dm-4): balance: ended with status: 0
[ 728.945125] BTRFS info (device dm-4): balance: start -dusage=70
[ 728.945261] BTRFS info (device dm-4): relocating block group 1938356699136 flags data
[ 729.822341] BTRFS info (device dm-4): found 9991 extents
[ 730.012601] BTRFS info (device dm-4): found 9990 extents
[ 730.160436] BTRFS info (device dm-4): balance: ended with status: 0
[ 732.697352] BTRFS info (device dm-4): balance: start -dusage=70
[ 732.697513] BTRFS info (device dm-4): relocating block group 1945503793152 flags data
[ 733.573416] BTRFS info (device dm-4): found 9820 extents
[ 733.738592] BTRFS info (device dm-4): found 9817 extents
[ 733.882864] BTRFS info (device dm-4): balance: ended with status: 0
[ 833.204449] radeon_dp_aux_transfer_native: 32 callbacks suppressed
root@phoenix:~#

I'm not sure the balance is resolving anything. The filesystem has not
gone read-only. I'll try an unfiltered balance now to see how that goes.

Pete
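
A note on reading the "unable to make block group ... ro" messages: in
both cases the "has N free" figure seems to equal total minus the other
counters on the following space_info line. That relationship is inferred
only from the two log excerpts above, not taken from the kernel source,
and the function name space_info_free is made up for this sketch.

```shell
# Hypothetical helper: recompute the "has N free" figure from the counters
# printed on the "space_info total=..." dmesg line.
# ASSUMPTION: free = total - used - pinned - reserved - may_use - readonly,
# inferred from the log in this mail, not copied from the kernel.
space_info_free() {
    # args (bytes): total used pinned reserved may_use readonly
    echo $(( $1 - $2 - $3 - $4 - $5 - $6 ))
}

# Metadata (space_info 4) counters from the log; the log reports 759922688 free
space_info_free 11811160064 10512777216 0 114688 538345472 0

# System (space_info 2) counters from the log; the log reports 33538048 free
space_info_free 33554432 16384 0 0 0 0
```

Both results agree with the logged values, which suggests the messages are
internally consistent and that neither space_info was actually full when
the block group could not be set read-only.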

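
For anyone wanting to repeat the stepped runs without retyping each
command, something along these lines might do. It is only a sketch: the
function name stepped_balance and the BTRFS override variable are
inventions for illustration, and the only real command it relies on is
"btrfs balance start" with the -dusage=/-musage= filters used in the
transcript.

```shell
# Hypothetical helper: run a filtered balance at increasing usage thresholds.
# ASSUMPTION: BTRFS may be set to another command (e.g. BTRFS=echo) for a
# dry run; that variable is part of this sketch, not of btrfs-progs.
stepped_balance() {
    sb_filter=$1    # "d" for -dusage, "m" for -musage
    sb_mnt=$2       # mount point of the filesystem
    shift 2
    for sb_pct in "$@"; do
        echo "== -${sb_filter}usage=${sb_pct}"
        # Stop at the first failing balance so the error is not scrolled away
        "${BTRFS:-btrfs}" balance start "-${sb_filter}usage=${sb_pct}" "$sb_mnt" || return 1
    done
}

# Example matching the data runs in the transcript (needs root):
#   stepped_balance d /var/lib/lxc 1 2 3 4 5 10 20 30 40 50 60 70
# Dry run that only prints the arguments that would be passed:
#   BTRFS=echo stepped_balance m /var/lib/lxc 1 5 10
```

Stopping on the first error is a deliberate choice here, since a failed
balance is exactly the case worth looking at in dmesg before carrying on.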