Re: btrfs balance to add new drive taking ~60 hours, no progress?

[Resending because Gmail on iPhone didn't do plain text]

On Sun, Mar 1, 2020 at 5:10 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> Free is 1.82TiB, exactly half of the 3.64TiB unallocated, and all of
> that unallocated space is on one drive, with none on the others, so
> yeah this file system is 100% full. Adding one drive was not enough;
> it's RAID1. You needed to add two drives.

I'm not following the btrfs logic here - I had three drives, 2 x 2TB
and 1 x 4TB, and added another 4TB.

That was a total of 4TB usable in RAID1.  Wouldn't adding a fourth
drive give me 6TB, with some of the blocks just moving from the three
existing drives onto the fourth during the rebalance?

Is there a second-copy placement policy I'm not aware of?

Or is it that the balance is trying to create new allocations on the
new drive but can't, because they wouldn't be mirrored?  But I still
don't get why it wouldn't move blocks off the full drives...
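One way to see why the new drive alone doesn't help: btrfs allocates
RAID1 chunks pairwise, so every new chunk needs unallocated space on two
*different* devices. A toy bash model of that greedy rule (a sketch, not
the kernel allocator; per-device sizes taken from the fi usage output in
this thread):

```shell
# Toy model (bash sketch, not the kernel allocator) of btrfs RAID1 chunk
# allocation: every new chunk needs unallocated space on TWO distinct
# devices.  Unallocated GiB per device as in the fi usage output in this
# thread: three full drives plus the freshly added, empty 4TB drive.
unalloc=(0 0 0 3724)      # sda1 sdb1 sdc1 sdj1
chunk=1                   # model 1GiB data chunks
allocatable=0
while :; do
    # find the two devices with the most unallocated space
    hi=0; hi2=1
    for i in 1 2 3; do
        if (( unalloc[i] > unalloc[hi] )); then hi2=$hi; hi=$i
        elif (( i != hi && unalloc[i] > unalloc[hi2] )); then hi2=$i; fi
    done
    # RAID1 needs a whole chunk free on BOTH devices, else allocation stops
    (( unalloc[hi] >= chunk && unalloc[hi2] >= chunk )) || break
    unalloc[hi]=$(( unalloc[hi] - chunk ))
    unalloc[hi2]=$(( unalloc[hi2] - chunk ))
    allocatable=$(( allocatable + 1 ))
done
echo "new RAID1 data chunks allocatable: $allocatable"
```

With three full drives and one empty one, the second-largest free space
is always zero, so zero new RAID1 chunks can be allocated; with a second
empty drive the loop would pair them and keep going.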

>
> So now what? The problem is you have a balance in-progress, and a
> cancel in-progress, and I'm not sure which is less risky:
>
> - add another device, even if it's small like a 32G partition or flash drive
> - force reboot

I have 150 GB of files I can remove ... I'll try that first.
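For reference, freeing the space and then checking on the balance could
look like this (a dry-run sketch, echo only; the path to delete is a
placeholder, and "balance status" / "balance cancel" are real
btrfs-progs subcommands):

```shell
# Sketch (dry run, echo only; the path to delete is a placeholder):
# free some space, then check on the balance.  Drop the echo to run.
mnt=/.BACKUPS
echo rm -rf "$mnt/some-expendable-dir"   # placeholder for the ~150GB of files
echo btrfs balance status "$mnt"         # progress of the in-flight balance
echo btrfs balance cancel "$mnt"         # harmless if a cancel is already queued
```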

Thank you for your help.

> What I *would* do before you do anything else is disable the write
> cache on all the drives. At least that way if you have to force a
> reboot, there's less of a chance COW and barrier guarantees can be
> thwarted.
>
> Be careful with hdparm: lowercase -w is dangerous, capital -W is what you want.

Oh good idea!
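Concretely, the hdparm step Chris describes could look like this (a
dry-run sketch with echo; drive names are the ones from the fi show
output in this thread, drop the echo to actually apply it):

```shell
# Sketch (dry run, echo only): turn OFF each drive's write cache before
# risking a forced reboot.  Capital -W is hdparm's write-cache flag;
# lowercase -w is a dangerous drive-reset option.
for dev in /dev/sda /dev/sdb /dev/sdc /dev/sdj; do
    echo hdparm -W 0 "$dev"    # -W 0 = cache off; a bare -W reports state
done
```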

>
>
> --
> Chris Murphy

On Sun, Mar 1, 2020 at 5:10 PM Chris Murphy <lists@xxxxxxxxxxxxxxxxx> wrote:
>
> On Sun, Mar 1, 2020 at 1:32 PM Rich Rauenzahn <rrauenza@xxxxxxxxx> wrote:
> >
> > (Is this just taking really long because I didn't provide filters when
> > balancing across the new drive?)
>
> I don't think so. It might be fairly wedged in because it has no
> unallocated space on 3 of 4 drives, and is writing into already
> allocated block groups.
>
> I think the mistake was adding only one new drive instead of two *and*
> then also doing a balance.
>
> I also think it's possible there's a bug, where Btrfs is trying too
> hard to avoid ENOSPC. Ironic if true. It should just give up, or at
> least it should cancel faster.
>
> >
> > $ sudo btrfs fi show /.BACKUPS/
> > Label: 'BACKUPS'  uuid: cfd65dcd-2a63-4fb1-89a7-0bb9ebe66ddf
> >         Total devices 4 FS bytes used 3.64TiB
> >         devid    2 size 1.82TiB used 1.82TiB path /dev/sda1
> >         devid    3 size 1.82TiB used 1.82TiB path /dev/sdc1
> >         devid    4 size 3.64TiB used 3.64TiB path /dev/sdb1
> >         devid    5 size 3.64TiB used 8.31MiB path /dev/sdj1
>
> This suggests 3 of 4 are full.
>
>
>
> > $ sudo btrfs fi usage /.BACKUPS/
> > Overall:
> >     Device size:                  10.92TiB
> >     Device allocated:              7.28TiB
> >     Device unallocated:            3.64TiB
> >     Device missing:                  0.00B
> >     Used:                          7.27TiB
> >     Free (estimated):              1.82TiB      (min: 1.82TiB)
> >     Data ratio:                       2.00
> >     Metadata ratio:                   2.00
> >     Global reserve:              512.00MiB      (used: 0.00B)
> >
> > Data,RAID1: Size:3.63TiB, Used:3.63TiB
> >    /dev/sda1       1.82TiB
> >    /dev/sdb1       3.63TiB
> >    /dev/sdc1       1.82TiB
> >    /dev/sdj1       8.31MiB
> >
> > Metadata,RAID1: Size:5.00GiB, Used:3.88GiB
> >    /dev/sda1       3.00GiB
> >    /dev/sdb1       5.00GiB
> >    /dev/sdc1       2.00GiB
> >
> > System,RAID1: Size:32.00MiB, Used:736.00KiB
> >    /dev/sda1      32.00MiB
> >    /dev/sdb1      32.00MiB
> >
> > Unallocated:
> >    /dev/sda1       1.00MiB
> >    /dev/sdb1       1.00MiB
> >    /dev/sdc1       1.00MiB
> >    /dev/sdj1       3.64TiB
>
> Free is 1.82TiB, exactly half of the 3.64TiB unallocated, and all of
> that unallocated space is on one drive, with none on the others, so
> yeah this file system is 100% full. Adding one drive was not enough;
> it's RAID1. You needed to add two drives.
>
> So now what? The problem is you have a balance in-progress, and a
> cancel in-progress, and I'm not sure which is less risky:
>
> - add another device, even if it's small like a 32G partition or flash drive
> - force reboot
>
> What I *would* do before you do anything else is disable the write
> cache on all the drives. At least that way if you have to force a
> reboot, there's less of a chance COW and barrier guarantees can be
> thwarted.
>
> Be careful with hdparm: lowercase -w is dangerous, capital -W is what you want.
>
> --
> Chris Murphy


