When rebasing my qgroup + balance optimization patches, I found one very
obvious performance regression for balance.
For a normal 4G subvolume, 16 snapshots, balance workload, a v4.20 kernel
only takes 3s to relocate a metadata block group, while on v5.0-rc1 I
don't know how long it will take, as it hasn't finished yet.
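Roughly, the setup behind those numbers looks like the following; the
device, mount point, and the way the 4G of data gets written are
illustrative guesses, not the exact reproducer script:

mkfs.btrfs -f /dev/vdb
mount /dev/vdb /mnt
btrfs subvolume create /mnt/subv
dd if=/dev/urandom of=/mnt/subv/fill bs=1M count=4096
for i in $(seq 1 16); do btrfs subvolume snapshot /mnt/subv /mnt/snap_$i; done
sync
time btrfs balance start -m /mnt

The 3s figure above is per metadata block group relocated by that last
balance command.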
Are you sure it's a v5.0-rc1 regression, not earlier?
I'm trying to do a metadata-only balance from RAID-5 to RAID-1, with
4.19.8.
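For reference, the invocation is roughly the following (reconstructed
from memory rather than copied out of shell history, so treat the exact
filter as approximate):

# btrfs balance start -mconvert=raid1,soft /data

The "soft" filter skips chunks that are already RAID-1, so only the
remaining RAID-5 metadata should get converted.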
It was going relatively "normally", until it got stuck and stopped
showing any progress.
I've canceled the balance, upgraded to 4.20, started the balance again.
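(The cancel was just the plain command, roughly:

# btrfs balance cancel /data

and after the upgrade I restarted the same convert as above.)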
For 11 straight days it has been rewriting terabytes of data on the
disks, with no progress at all. Also, on 4.19.8 the balance was
interrupted with "out of space", even though we have terabytes free.
Metadata RAID-5 usage has stayed at 4.12GiB per device for the past 11
days (plus a few more days before that with 4.19.8).
# btrfs fi usage /data
WARNING: RAID56 detected, not implemented
WARNING: RAID56 detected, not implemented
Overall:
    Device size:          14.47TiB
    Device allocated:     112.06GiB
    Device unallocated:   14.36TiB
    Device missing:       0.00B
    Used:                 107.93GiB
    Free (estimated):     0.00B      (min: 8.00EiB)
    Data ratio:           0.00
    Metadata ratio:       1.64
    Global reserve:       512.00MiB  (used: 1.86MiB)

Data,RAID5: Size:5.28TiB, Used:3.04TiB
   /dev/sda5     1.76TiB
   /dev/sdb5     1.76TiB
   /dev/sdc5     1.76TiB
   /dev/sdd5     1.76TiB

Metadata,RAID1: Size:56.00GiB, Used:53.97GiB
   /dev/sda5    29.00GiB
   /dev/sdb5    27.00GiB
   /dev/sdc5    27.00GiB
   /dev/sdd5    29.00GiB

Metadata,RAID5: Size:12.38GiB, Used:11.13GiB
   /dev/sda5     4.12GiB
   /dev/sdb5     4.12GiB
   /dev/sdc5     4.12GiB
   /dev/sdd5     4.12GiB

System,RAID1: Size:32.00MiB, Used:416.00KiB
   /dev/sdb5    32.00MiB
   /dev/sdc5    32.00MiB

Unallocated:
   /dev/sda5     1.83TiB
   /dev/sdb5     1.83TiB
   /dev/sdc5     1.83TiB
   /dev/sdd5     1.83TiB
# btrfs balance status /data
Balance on '/data' is running
13 out of about 64 chunks balanced (15 considered), 80% left