Ok, first I'll update the subject line
On Fri, Feb 21, 2020 at 10:45:45AM +0500, Roman Mamedov wrote:
> On Thu, 20 Feb 2020 21:38:04 -0800
> Marc MERLIN <marc@xxxxxxxxxxx> wrote:
>
> > I had a closer look, and even with 5.4.20, my whole lv is full now:
> > LV Name thinpool2
> > Allocated pool data 99.99%
> > Allocated metadata 59.88%
>
> Oversubscribing thin storage should be done carefully and only with a very
> good reason, and when you run out of something you didn't have in the first
> place, seems hard to blame Btrfs or anyone else for it.
Let's rewind.
It's a backup server. I used to have everything in a single 14TB
filesystem, but I had too many snapshots and was told to break it up into
smaller filesystems to work around btrfs' inability to scale properly
past a hundred snapshots or so (that many snapshots also blew up both
kinds of btrfs check --repair; one of them forced me to buy 16GB of RAM
to max out my server, after which it still ran out of RAM, and now I
can't add any more).
I'm obviously not going back to the olden days of making actual partitions
and guessing wrong every time how big each partition should be, so my
only solution left was to use dm-thin and subscribe the entire space to
all LVs.
I then have a cronjob that warns me if I start running low on space in the
global VG pool.
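
For anyone wanting a similar check, here is a minimal sketch of what such a
cronjob could look like. It is not the actual script running on this server:
the 90% threshold, the '|' separator and the function names are assumptions
made for illustration. It relies on lvs reporting data_percent and
metadata_percent for thin pools.

```python
import subprocess

# Hypothetical threshold: warn once data or metadata usage reaches 90%.
WARN_PERCENT = 90.0


def parse_lvs(output):
    """Parse 'lv_name|data_percent|metadata_percent' lines from lvs."""
    pools = []
    for line in output.splitlines():
        fields = [f.strip() for f in line.split("|")]
        # Keep only LVs that report both percentages (thin pools);
        # LVs with an empty data or metadata column are skipped.
        if len(fields) != 3 or not fields[1] or not fields[2]:
            continue
        pools.append((fields[0], float(fields[1]), float(fields[2])))
    return pools


def pool_warnings(pools, limit=WARN_PERCENT):
    """Return one message per pool at or above the limit."""
    return [
        f"{name}: data {data:.2f}%, metadata {meta:.2f}%"
        for name, data, meta in pools
        if data >= limit or meta >= limit
    ]


def check_vg(vg):
    """Run lvs on one VG (needs root) and return the warning messages."""
    out = subprocess.run(
        ["lvs", "--noheadings", "--separator", "|",
         "-o", "lv_name,data_percent,metadata_percent", vg],
        check=True, capture_output=True, text=True,
    ).stdout
    return pool_warnings(parse_lvs(out))
```

Printing the result of check_vg("vgds2") from cron would be enough to get a
mail whenever the list is non-empty, since cron mails any output.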
Now, where it got confusing is that I installed the 5.4 kernel with the df
problem at about the same time that df filled up to 100% and started
mailing me. I ignored it because I knew about the bug.
However, I just found out that my LV actually filled up due to another
bug that was actually my fault.
Now, I triggered some real bugs in btrfs, see:
gargamel:/mnt/btrfs_pool2/backup/ubuntu# btrfs fi show .
Label: 'ubuntu' uuid: 905c90db-8081-4071-9c79-57328b8ac0d5
Total devices 1 FS bytes used 445.73GiB
devid 1 size 14.00TiB used 8.44TiB path /dev/mapper/vgds2-ubuntu
Ok, I'm using 445GB, but losing 8.4TB, sigh.
LV Path /dev/vgds2/ubuntu
LV Name ubuntu
LV Pool name thinpool2
LV Size 14.00 TiB
Mapped size 60.25% <= this is all the space free in my VG, so it's full now
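
As a sanity check, the LVM and btrfs numbers above can be compared directly.
This is only a sketch using the figures pasted in this mail; the helper name
is made up for illustration.

```python
def mapped_tib(lv_size_tib, mapped_percent):
    """Space the thin pool has actually mapped for an LV, in TiB."""
    return lv_size_tib * mapped_percent / 100.0


# Figures copied from the lvdisplay and 'btrfs fi show' output above.
lv_mapped = mapped_tib(14.00, 60.25)   # LV Size * Mapped size
btrfs_allocated = 8.44                 # 'used' on the devid line, in TiB
btrfs_data = 445.73 / 1024             # 'FS bytes used', GiB -> TiB

print(f"mapped in thin pool : {lv_mapped:.3f} TiB")
print(f"allocated by btrfs  : {btrfs_allocated:.3f} TiB")
print(f"allocated, no data  : {btrfs_allocated - btrfs_data:.3f} TiB")
```

The mapped size and the btrfs allocation agree to within rounding, so the
space missing from the pool is the space btrfs has allocated to chunks, not
the 445GB of data.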
We talked about fstrim, let's try that:
gargamel:/mnt/btrfs_pool2/backup/ubuntu# fstrim -v .
.: 5.6 TiB (6116423237632 bytes) trimmed
Oh, great. Except this freed up nothing in LVM.
gargamel:/mnt/btrfs_pool2/backup/ubuntu# btrfs balance start -musage=0 -v .
Dumping filters: flags 0x6, state 0x0, force is off
METADATA (flags 0x2): balancing, usage=0
SYSTEM (flags 0x2): balancing, usage=0
ERROR: error during balancing '.': Read-only file system
Ok, right, I need to unmount/remount to clear read-only:
gargamel:/mnt/btrfs_pool2/backup/ubuntu# btrfs balance start -musage=0 -v .
Dumping filters: flags 0x6, state 0x0, force is off
METADATA (flags 0x2): balancing, usage=0
SYSTEM (flags 0x2): balancing, usage=0
Done, had to relocate 0 out of 8624 chunks
gargamel:/mnt/btrfs_pool2/backup/ubuntu# btrfs balance start -dusage=0 -v .
Dumping filters: flags 0x1, state 0x0, force is off
DATA (flags 0x2): balancing, usage=0
Done, had to relocate 0 out of 8624 chunks
gargamel:/mnt/btrfs_pool2/backup/ubuntu# btrfs fi show .
Label: 'ubuntu' uuid: 905c90db-8081-4071-9c79-57328b8ac0d5
Total devices 1 FS bytes used 8.42TiB
devid 1 size 14.00TiB used 8.44TiB path /dev/mapper/vgds2-ubuntu
Well, crap, see how 'used' went from 445.73GiB to 8.42TiB after balance?
I ran du to make sure my data is indeed only using 445GB.
So now, I'm pretty much hosed, the filesystem seems to have been damaged in interesting ways.
I'll wait until tomorrow in case someone wants something from it, and then I'll delete the entire
LV and start over.
And now for extra points, this also damaged a second filesystem of mine on the same VG :(
[64723.601630] BTRFS error (device dm-17): bad tree block start, want 5782272294912 have 0
[64723.628708] BTRFS error (device dm-17): bad tree block start, want 5782272294912 have 0
[64897.028176] BTRFS error (device dm-13): parent transid verify failed on 22724608 wanted 10005 found 10001
[64897.080355] BTRFS error (device dm-13): parent transid verify failed on 22724608 wanted 10005 found 10001
Marc
--
"A mouse is a device used to point at the xterm you want to type in" - A.S.R.
Home page: http://marc.merlins.org/ | PGP 7F55D5F27AAF9D08