After giving up on my previous array (see "BUG_ON in btrfs check &
fs/btrfs/extent-tree.c:3071"), I copied 11TB of crap (including tons of
small files, and a few files up to e.g. 2TB in size) to a fresh array
built on 5.4.16, with a bit of duperemove in between, partially on
5.4.16 and partially on 5.4.16 + "fs: allow deduplication of eof block
into the end of the destination file" + "Btrfs: make deduplication with
range including the last block work".
I then added a second drive to the array (to finish copying), kicked it
back on again, and ran btrfs balance -mconvert=raid1 /path, waited for
the first one or two block groups, then btrfs balance cancel /path (so
that, I thought, new metadata would be written as raid1, while old
metadata wouldn't be converted until I had more IOPS available). This
seems to have completely borked my array's ability to balance. I tried
running a few more balance -mconvert=raid1,soft passes, but each one hits
ENOSPC after allocating metadata block groups and copying no data into
them. After letting it run for a while, I now have a ton of metadata
block groups that are allocated but unused, and I can't seem to free
them. bigraid2 started with around 500G of unallocated space, and now is
rather limited:
Overall:
    Device size:                  18.19TiB
    Device allocated:             13.25TiB
    Device unallocated:            4.95TiB
    Device missing:                  0.00B
    Used:                         12.23TiB
    Free (estimated):              4.96TiB      (min: 2.49TiB)
    Data ratio:                       1.00
    Metadata ratio:                   1.80
    Global reserve:              512.00MiB      (used: 0.00B)

Data,single: Size:11.93TiB, Used:11.92TiB
   /dev/mapper/bigraid2_crypt      8.29TiB
   /dev/mapper/bigraid42_crypt     3.64TiB

Metadata,single: Size:151.00GiB, Used:149.04GiB
   /dev/mapper/bigraid2_crypt    151.00GiB

Metadata,RAID1: Size:596.00GiB, Used:86.13GiB
   /dev/mapper/bigraid2_crypt    596.00GiB
   /dev/mapper/bigraid42_crypt   596.00GiB

System,RAID1: Size:32.00MiB, Used:1.38MiB
   /dev/mapper/bigraid2_crypt     32.00MiB
   /dev/mapper/bigraid42_crypt    32.00MiB

Unallocated:
   /dev/mapper/bigraid2_crypt     72.97GiB
   /dev/mapper/bigraid42_crypt     4.87TiB
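
As a rough sanity check on the numbers above, here is a small Python
sketch. It is only my own arithmetic, not anything btrfs-progs provides,
and the function and variable names are made up for illustration. It
assumes a single block group counts once in raw space and a RAID1 block
group counts twice, then recomputes the metadata ratio and the amount of
Metadata,RAID1 that is allocated but empty.

```python
# Figures copied from the usage output above, in GiB.
META_SINGLE_SIZE = 151.00
META_RAID1_SIZE = 596.00
META_RAID1_USED = 86.13


def metadata_ratio(single_size, raid1_size):
    """Raw space per logical byte, assuming single = 1 copy, RAID1 = 2 copies."""
    raw = single_size * 1 + raid1_size * 2
    logical = single_size + raid1_size
    return raw / logical


def unused_raid1(size, used):
    """Logical space allocated to RAID1 metadata block groups but not used."""
    return size - used


if __name__ == "__main__":
    ratio = metadata_ratio(META_SINGLE_SIZE, META_RAID1_SIZE)
    empty = unused_raid1(META_RAID1_SIZE, META_RAID1_USED)
    print(f"metadata ratio: {ratio:.2f}")
    print(f"unused Metadata,RAID1: {empty:.2f} GiB")
```

Under those assumptions the ratio comes out at 1.80, matching the
report, and roughly 510 GiB of Metadata,RAID1 is sitting empty, which is
about what bigraid2 had unallocated before all this started.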
Running btrfs balance start -musage=0 /path very quickly eats drive
space by allocating more Metadata,RAID1 block groups and then hitting
ENOSPC:
[264158.311333] BTRFS info (device dm-2): balance: start -musage=0 -susage=0
[264158.443053] BTRFS info (device dm-2): relocating block group 13821436493824 flags metadata|raid1
[264159.103628] BTRFS info (device dm-2): relocating block group 13820362752000 flags metadata|raid1
[264160.513106] BTRFS info (device dm-2): relocating block group 13818215268352 flags metadata|raid1
[264161.475568] BTRFS info (device dm-2): relocating block group 13816067784704 flags metadata|raid1
[264163.174316] BTRFS info (device dm-2): relocating block group 13814994042880 flags metadata|raid1
[264164.326509] BTRFS info (device dm-2): relocating block group 13813920301056 flags metadata|raid1
[264165.234746] BTRFS info (device dm-2): relocating block group 13812846559232 flags metadata|raid1
[264166.237149] BTRFS info (device dm-2): relocating block group 13811772817408 flags metadata|raid1
[264167.282200] BTRFS info (device dm-2): relocating block group 13810699075584 flags metadata|raid1
[264168.207370] BTRFS info (device dm-2): relocating block group 13808551591936 flags metadata|raid1
[264169.167734] BTRFS info (device dm-2): relocating block group 13807477850112 flags metadata|raid1
[264170.166969] BTRFS info (device dm-2): relocating block group 13806404108288 flags metadata|raid1
[264171.114659] BTRFS info (device dm-2): relocating block group 13805330366464 flags metadata|raid1
[264172.985116] BTRFS info (device dm-2): relocating block group 13803182882816 flags metadata|raid1
[264174.784420] BTRFS info (device dm-2): relocating block group 13801035399168 flags metadata|raid1
[264175.967226] BTRFS info (device dm-2): relocating block group 13799961657344 flags metadata|raid1
[264177.252213] BTRFS info (device dm-2): relocating block group 13798887915520 flags metadata|raid1
[264178.499955] BTRFS info (device dm-2): relocating block group 13796740431872 flags metadata|raid1
[264179.780731] BTRFS info (device dm-2): relocating block group 13795666690048 flags metadata|raid1
[264180.885075] BTRFS info (device dm-2): relocating block group 13793519206400 flags metadata|raid1
[264182.089885] BTRFS info (device dm-2): relocating block group 13790297980928 flags metadata|raid1
[264183.358970] BTRFS info (device dm-2): relocating block group 13786003013632 flags metadata|raid1
[264184.462920] BTRFS info (device dm-2): 137 enospc errors during balance
[264184.462922] BTRFS info (device dm-2): balance: canceled
(the above allocated some 100G or so in Metadata,RAID1).
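
For anyone who wants to summarize a log like the one above, here is a
minimal Python sketch. The helper name summarize_balance and the regexes
are my own, written against the message format shown in this excerpt
only. It pulls out the block group start addresses, the flags, and the
ENOSPC count, and tolerates log lines that were wrapped by a mail client.

```python
import re

# Matches one "relocating block group" entry; \s+ also matches the line
# breaks introduced when a long log line gets wrapped.
RELOC_RE = re.compile(r"relocating block group\s+(\d+)\s+flags\s+(\S+)")
ENOSPC_RE = re.compile(r"(\d+) enospc errors during balance")


def summarize_balance(log_text):
    """Return (block group start addresses, distinct flags, enospc count)."""
    relocs = RELOC_RE.findall(log_text)
    starts = [int(addr) for addr, _flags in relocs]
    flags = sorted({f for _addr, f in relocs})
    match = ENOSPC_RE.search(log_text)
    enospc = int(match.group(1)) if match else 0
    return starts, flags, enospc


if __name__ == "__main__":
    # First two relocations and the summary line from the log above.
    sample = """\
[264158.443053] BTRFS info (device dm-2): relocating block group
13821436493824 flags metadata|raid1
[264159.103628] BTRFS info (device dm-2): relocating block group
13820362752000 flags metadata|raid1
[264184.462920] BTRFS info (device dm-2): 137 enospc errors during balance
"""
    starts, flags, enospc = summarize_balance(sample)
    print(len(starts), flags, enospc)
    # Distance between consecutive block group start addresses, in GiB.
    print((starts[0] - starts[1]) / 2**30)
```

The first two block groups in the log start exactly 1 GiB apart, so each
relocated Metadata,RAID1 block group here appears to be 1 GiB of logical
space.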
Additionally, and I don't know whether this came from this fs or from
the old, broken fs (mounted ro,degraded), I saw this while writing:
[245116.074467] ------------[ cut here ]------------
[245116.074497] WARNING: CPU: 20 PID: 474 at fs/btrfs/inode.c:9378 btrfs_destroy_inode+0x1c/0x288 [btrfs]
[245116.074498] Modules linked in: xt_tcpudp(E) binfmt_misc(E) veth(E)
xt_nat(E) wireguard(OE) essiv(E) authenc(E) ip6_udp_tunnel(E)
udp_tunnel(E) nft_counter(E) nft_chain_nat(E) xt_MASQUERADE(E) nf_nat(E)
nf_conntrack(E) nf_defrag_ipv6(E) nf_defrag_ipv4(E) nft_compat(E)
nf_tables(E) nfnetlink(E) btrfs(E) zstd_compress(E) zstd_decompress(E)
amdgpu(E) gpu_sched(E) snd_hda_codec_hdmi(E) ast(E) drm_vram_helper(E)
snd_hda_intel(E) ttm(E) ofpart(E) snd_hda_codec(E) drm_kms_helper(E)
powernv_flash(E) sg(E) mtd(E) snd_hda_core(E) uas(E) drm(E) snd_hwdep(E)
snd_pcm(E) tg3(E) mpt3sas(E) snd_timer(E) ipmi_powernv(E)
ipmi_devintf(E) drm_panel_orientation_quirks(E) libphy(E) snd(E)
syscopyarea(E) ipmi_msghandler(E) sysfillrect(E) sysimgblt(E)
fb_sys_fops(E) i2c_algo_bit(E) ptp(E) opal_prd(E) raid_class(E)
pps_core(E) soundcore(E) scsi_transport_sas(E) at24(E) ip_tables(E)
x_tables(E) autofs4(E) ext4(E) crc16(E) mbcache(E) jbd2(E) sd_mod(E)
raid10(E) raid456(E) crc32c_generic(E) libcrc32c(E)
[245116.074545] async_raid6_recov(E) async_memcpy(E) async_pq(E)
evdev(E) hid_generic(E) usbhid(E) hid(E) raid6_pq(E) async_xor(E) xor(E)
async_tx(E) raid1(E) raid0(E) multipath(E) linear(E) md_mod(E)
usb_storage(E) dm_crypt(E) dm_mod(E) algif_skcipher(E) af_alg(E)
xhci_pci(E) xhci_hcd(E) vmx_crypto(E) nvme(E) usbcore(E) nvme_core(E)
usb_common(E)
[245116.074566] CPU: 20 PID: 474 Comm: kswapd0 Tainted: G W OE 5.4.0-3-powerpc64le #1 Debian 5.4.16-1
[245116.074568] NIP: c0080000078aec94 LR: c000000000447bb8 CTR: c0080000078aec78
[245116.074569] REGS: c000000ff760f580 TRAP: 0700 Tainted: G W OE (5.4.0-3-powerpc64le Debian 5.4.16-1)
[245116.074570] MSR: 9000000000029033 <SF,HV,EE,ME,IR,DR,RI,LE> CR: 84048224 XER: 20040000
[245116.074575] CFAR: c000000000447bb4 IRQMASK: 0
GPR00: c000000000447bb8 c000000ff760f810 c0080000079e6800 c000000e405ce530
GPR04: 0000000000000018 0000000000000003 0000000000000009 c00c000025ae0400
GPR08: 0000000ffed1d000 0000000000000001 c000000c6292e830 c00800000796cb30
GPR12: c0080000078aec78 c000000fffff2800 0000000000000001 0000000000000001
GPR16: c000000001143be0 c000000001143bf0 0000000000000000 0000000000000260
GPR20: 0000000000000000 0000000000000502 0000000000000000 0000000000015bdb
GPR24: 0000000000000000 000000000004c20a 0000000000000000 c000000007fff800
GPR28: c000000007fffc50 c000000e405ce530 c0080000079f2b00 c000000e405ce530
[245116.074600] NIP [c0080000078aec94] btrfs_destroy_inode+0x1c/0x288 [btrfs]
[245116.074605] LR [c000000000447bb8] destroy_inode+0x68/0xc0
[245116.074606] Call Trace:
[245116.074608] [c000000ff760f810] [c000000000447ba0] destroy_inode+0x50/0xc0 (unreliable)
[245116.074612] [c000000ff760f840] [c000000000448290] dispose_list+0x80/0xb0
[245116.074614] [c000000ff760f880] [c00000000044a3a0] prune_icache_sb+0x70/0xb0
[245116.074618] [c000000ff760f8d0] [c00000000041e6e8] super_cache_scan+0x148/0x210
[245116.074621] [c000000ff760f940] [c00000000032f7d8] do_shrink_slab+0x178/0x3b0
[245116.074623] [c000000ff760fa10] [c00000000032fd0c] shrink_slab+0x2fc/0x4a0
[245116.074626] [c000000ff760faf0] [c0000000003370bc] shrink_node+0x12c/0x600
[245116.074629] [c000000ff760fbb0] [c000000000338a84] balance_pgdat+0x344/0x6b0
[245116.074631] [c000000ff760fce0] [c000000000339070] kswapd+0x280/0x5c0
[245116.074633] [c000000ff760fdb0] [c000000000155668] kthread+0x148/0x1a0
[245116.074638] [c000000ff760fe20] [c00000000000bd54] ret_from_kernel_thread+0x5c/0x68
[245116.074639] Instruction dump:
[245116.074641] 7c0803a6 4e800020 60000000 00137b88 00000000 3c4c0013 38427b88 7c0802a6
[245116.074646] 60000000 e9430138 312affff 7d295110 <0b090000> e94301c8 312affff 7d295110
[245116.074651] ---[ end trace 39e70c580b2c71e0 ]---
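
In case it helps whoever reads the trace, here is a small Python sketch
that splits the marked word from the instruction dump (<0b090000>) into
its fields. It assumes the PowerPC D-form layout (6-bit primary opcode,
5-bit TO, 5-bit RA, 16-bit immediate); the helper name decode_tdi is
mine, and I have not cross-checked the result against a disassembler.

```python
def decode_tdi(word):
    """Split a 32-bit PowerPC D-form word into (opcode, TO, RA, SI)."""
    opcode = (word >> 26) & 0x3F  # primary opcode (2 would be tdi)
    to = (word >> 21) & 0x1F      # trap condition bits
    ra = (word >> 16) & 0x1F      # register operand
    si = word & 0xFFFF            # 16-bit immediate
    if si & 0x8000:               # sign-extend the immediate
        si -= 0x10000
    return opcode, to, ra, si


if __name__ == "__main__":
    opcode, to, ra, si = decode_tdi(0x0B090000)
    print(f"opcode={opcode} TO={to} RA=r{ra} SI={si}")
```

If I'm reading the ISA right, opcode 2 with TO=24 and SI=0 is
tdnei r9,0, i.e. trap if r9 is nonzero, which is how powerpc kernels
implement WARN_ON. That would be consistent with the TRAP: 0700 and the
WARNING at fs/btrfs/inode.c:9378 above.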