On Thu, Jul 16, 2020 at 12:27 AM Zygo Blaxell
<ce3g8jdj@xxxxxxxxxxxxxxxxxxxxx> wrote:
>
> On Tue, Jul 14, 2020 at 10:49:08PM -0400, John Petrini wrote:
> > I've done this and the filesystem mounted successfully though when
> > attempting to cancel the balance it just tells me it's not running.
>
> That's fine, as long as it stops one way or another.
>
> > > Aside: data-raid6 metadata-raid10 isn't a sane configuration. It
> > > has 2 redundant disks for data and 1 redundant disk for metadata, so
> > > the second parity disk in raid6 is wasted space.
> > >
> > > The sane configurations for parity raid are:
> > >
> > > data-raid6 metadata-raid1c3 (2 parity stripes for data, 3 copies
> > > for metadata, 2 disks can fail, requires 3 or more disks)
> > >
> > > data-raid5 metadata-raid10 (1 parity stripe for data, 2 copies
> > > for metadata, 1 disk can fail, requires 4 or more disks)
> > >
> > > data-raid5 metadata-raid1 (1 parity stripe for data, 2 copies
> > > for metadata, 1 disk can fail, requires 2 or more disks)
> > >
> >
> > This is very interesting. I had no idea that raid1c3 was an option
> > though it sounds like I may need a really recent kernel version?
>
> 5.5 or later.
Okay, I'll look into moving to that version, since raid1c3 sounds like
a killer feature.
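If I'm reading this right, once I'm on 5.5+ the metadata side would
just be a balance convert, something along these lines (the path is my
mount point, profile per your suggestion):

btrfs balance start -mconvert=raid1c3 /mnt/storage-array

with the data side restarted as -dconvert=raid6 the same way. Please
correct me if that's not the right approach.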
>
> > btrfs fi usage /mnt/storage-array/
> > WARNING: RAID56 detected, not implemented
> > Overall:
> > Device size: 67.31TiB
> > Device allocated: 65.45TiB
> > Device unallocated: 1.86TiB
> > Device missing: 0.00B
> > Used: 65.14TiB
> > Free (estimated): 1.12TiB (min: 1.09TiB)
> > Data ratio: 1.94
> > Metadata ratio: 2.00
> > Global reserve: 512.00MiB (used: 0.00B)
> >
> > Data,RAID10: Size:32.68TiB, Used:32.53TiB
> > /dev/sda 4.34TiB
> > /dev/sdb 4.34TiB
> > /dev/sdc 4.34TiB
> > /dev/sdd 2.21TiB
> > /dev/sde 2.21TiB
> > /dev/sdf 4.34TiB
> > /dev/sdi 1.82TiB
> > /dev/sdj 1.82TiB
> > /dev/sdk 1.82TiB
> > /dev/sdl 1.82TiB
> > /dev/sdm 1.82TiB
> > /dev/sdn 1.82TiB
> >
> > Data,RAID6: Size:1.04TiB, Used:1.04TiB
> > /dev/sda 413.92GiB
> > /dev/sdb 413.92GiB
> > /dev/sdc 413.92GiB
> > /dev/sdd 119.07GiB
> > /dev/sde 119.07GiB
> > /dev/sdf 413.92GiB
> >
> > Metadata,RAID10: Size:40.84GiB, Used:39.80GiB
> > /dev/sda 5.66GiB
> > /dev/sdb 5.66GiB
> > /dev/sdc 5.66GiB
> > /dev/sdd 2.41GiB
> > /dev/sde 2.41GiB
> > /dev/sdf 5.66GiB
> > /dev/sdi 2.23GiB
> > /dev/sdj 2.23GiB
> > /dev/sdk 2.23GiB
> > /dev/sdl 2.23GiB
> > /dev/sdm 2.23GiB
> > /dev/sdn 2.23GiB
> >
> > System,RAID10: Size:96.00MiB, Used:3.06MiB
> > /dev/sda 8.00MiB
> > /dev/sdb 8.00MiB
> > /dev/sdc 8.00MiB
> > /dev/sdd 8.00MiB
> > /dev/sde 8.00MiB
> > /dev/sdf 8.00MiB
> > /dev/sdi 8.00MiB
> > /dev/sdj 8.00MiB
> > /dev/sdk 8.00MiB
> > /dev/sdl 8.00MiB
> > /dev/sdm 8.00MiB
> > /dev/sdn 8.00MiB
> >
> > Unallocated:
> > /dev/sda 4.35TiB
> > /dev/sdb 4.35TiB
> > /dev/sdc 4.35TiB
> > /dev/sdd 2.22TiB
> > /dev/sde 2.22TiB
> > /dev/sdf 4.35TiB
> > /dev/sdi 1.82TiB
> > /dev/sdj 1.82TiB
> > /dev/sdk 1.82TiB
> > /dev/sdl 1.82TiB
> > /dev/sdm 1.82TiB
> > /dev/sdn 1.82TiB
>
> Plenty of unallocated space. It should be able to do the conversion.
After upgrading, the unallocated space tells a different story. Maybe
that's due to the newer kernel or btrfs-progs reporting it differently?
Unallocated:
/dev/sdd 1.02MiB
/dev/sde 1.02MiB
/dev/sdl 1.02MiB
/dev/sdn 1.02MiB
/dev/sdm 1.02MiB
/dev/sdk 1.02MiB
/dev/sdj 1.02MiB
/dev/sdi 1.02MiB
/dev/sdb 1.00MiB
/dev/sdc 1.00MiB
/dev/sda 5.90GiB
/dev/sdg 5.90GiB
This is after freeing up additional space on the filesystem. When I
started the conversion there was only ~300G available; there's now
close to 1TB according to df.
/dev/sdd 68T 66T 932G 99% /mnt/storage-array
So I'm not sure what to make of this, or whether it's safe to start
the conversion again. I don't feel like I can trust the unallocated
numbers from before or after the upgrade. (My rough understanding is
that df reports free space inside already-allocated chunks, while
"unallocated" is raw disk space not yet assigned to any chunk, which
is what a conversion balance actually needs, so the two can
legitimately disagree; please correct me if that's wrong.)
Here are the versions I'm on now:
sudo dpkg -l | grep btrfs-progs
ii btrfs-progs 5.4.1-2
amd64 Checksumming Copy on Write Filesystem utilities
uname -r
5.4.0-40-generic
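If/when I do restart the conversion, I was thinking of doing it in
small batches so I can watch the unallocated numbers as it goes,
something like:

btrfs balance start -dconvert=raid6,soft,limit=50 /mnt/storage-array
btrfs fi usage /mnt/storage-array

(as I read the man page, 'soft' skips chunks that are already raid6
and 'limit' caps how many chunks each run touches, but I may be
misreading it). Does that seem reasonable?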
>
> > > You didn't post the dmesg messages from when the filesystem went
> > > read-only, but metadata 'total' is very close to 'used', you were doing
> > > a balance, and the filesystem went read-only, so I'm guessing you hit
> > > ENOSPC for metadata due to lack of unallocated space on at least 4 drives
> > > (minimum for raid10).
> > >
> >
> > Here's a paste of everything in dmesg: http://paste.openstack.org/show/795929/
>
> Unfortunately the original errors are no longer in the buffer. Maybe
> try /var/log/kern.log?
>
Found it. So this was a space issue. I knew the filesystem was very
full but figured ~300G would be enough.
kernel: [3755232.352221] BTRFS: error (device sdd) in __btrfs_free_extent:4860: errno=-28 No space left
kernel: [3755232.352227] BTRFS: Transaction aborted (error -28)
kernel: [3755232.354693] BTRFS info (device sdd): forced readonly
kernel: [3755232.354700] BTRFS: error (device sdd) in btrfs_run_delayed_refs:2795: errno=-28 No space left
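(In case it helps anyone else searching later, I pulled those lines
out with roughly:

grep -iE 'btrfs.*(error|readonly)' /var/log/kern.log

though I'm sure there are better ways to do it.)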
> > > > uname -r
> > > > 5.3.0-40-generic
> > >
> > > Please upgrade to 5.4.13 or later. Kernels 5.1 through 5.4.12 have a
> > > rare but nasty bug that is triggered by writing at exactly the wrong
> > > moment during balance. 5.3 has some internal defenses against that bug
> > > (the "write time tree checker"), but if they fail, the result is metadata
> > > corruption that requires btrfs check to repair.
> > >
> >
> > Thanks for the heads up. I'm getting it updated now and will attempt
> > to remount once I do. Once it's remounted how should I proceed? Can I
> > just assume the filesystem is healthy at that point? Should I perform
> > a scrub?
>
> If scrub reports no errors it's probably OK.
I did run a scrub and it came back clean.
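For the record it was just the plain invocation, roughly:

btrfs scrub start /mnt/storage-array
btrfs scrub status /mnt/storage-array

and status reported no errors.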
>
> A scrub will tell you if any data or metadata is corrupted or any
> parent-child pointers are broken. That will cover most of the common
> problems. If the original issue was a spurious ENOSPC then everything
> should be OK. If the original issue was a write time tree corruption
> then it should be OK. If the original issue was something else, it
> will present itself again during the scrub or balance.
>
> If there are errors, scrub won't attribute them to the right disks for
> raid6. It might be worth reading
>
> https://lore.kernel.org/linux-btrfs/20200627032414.GX10769@xxxxxxxxxxxxxx/
>
> for a list of current raid5/6 issues to be aware of.
Thanks. This is good info.
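I'll also keep an eye on the per-device error counters, since it
sounds like scrub may not attribute raid6 errors to the right disk.
I assume something like:

btrfs device stats /mnt/storage-array

is the place to look for those.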