Re: Errors not found by btrfsck or scrub

On Fri, Jan 11, 2013 at 01:13:24PM -0500, Chris Carlin wrote:
> I have a week-old filesystem that is reported clean by btrfsck and
> scrub, but that fails under operations ranging from du to sync and
> umount (but no failures if mounted readonly).
> 
> My problem sounds similar to a few other reports (e.g. TM's in
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/22014 ) that
> seem to hint at problems with full metadata. My df shows:
> 
> # btrfs fi df /mnt/btrfs
> Data, RAID0: total=776.32GB, used=717.56GB
> Data: total=81.00GB, used=29.44GB
> System, DUP: total=8.00MB, used=72.00KB
> System: total=4.00MB, used=0.00
> Metadata, RAID1: total=512.00MB, used=511.60MB
> Metadata, DUP: total=1.00GB, used=1022.39MB
> 
> That looks suspicious to me, both the 1GB vs 1022MB and that there is

   1GB in this output is 1024 MiB (i.e. it's actually 1 GiB, not 1
GB), so it's not screwed-up accounting, just confusing reporting.
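
   To make the arithmetic concrete, here's a rough sketch using the
figures from your df output. It assumes the "GB"/"MB" labels are
binary units (GiB/MiB) as described above; free_mib is just a helper
name made up for illustration, not anything from btrfs-progs.

```shell
# Sketch only: figures are copied from the "btrfs fi df" output quoted above.
# Assumption: the "GB"/"MB" labels are binary units (1 GB shown = 1024 MiB).

# free_mib TOTAL_MIB USED_MIB -> unused space inside allocated chunks, in MiB.
# (free_mib is a hypothetical helper, not part of btrfs-progs.)
free_mib() {
    awk -v total="$1" -v used="$2" 'BEGIN { printf "%.2f\n", total - used }'
}

free_mib 1024 1022.39   # Metadata, DUP:   total=1.00GB, used=1022.39MB -> 1.61
free_mib 512 511.60     # Metadata, RAID1: total=512.00MB, used=511.60MB -> 0.40
```

   Both come out at under 2 MiB free, so the allocated metadata
chunks are effectively full, which would be consistent with the
full-metadata reports you mention.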

> both DUP and RAID1 metadata. The balance operation I ran after adding
> a second device finished without errors; could it have actually
> failed? At this point balance DOES fail (locks up) every time...

   The balance probably got about halfway through the DUP -> RAID-1
conversion of your metadata and then ran into problems. I wouldn't
worry about this too much.

> This computer is Ubuntu, but I've updated to the latest kernel and
> btrfs-tools I could find, and the problems remain.
> 
> Below is what showed up in dmesg during the run of scrub. Most of the
> time the error is "btrfs: block rsv returned -28", but the aborted
> transaction and auto-ro is always there.

   Just for reference, -28 is -ENOSPC.

> Anything I can do to help identify a bug here? Clearly one problem is
> that the filesystem checking tools can't find anything wrong, much
> less fix the filesystem.

> [12208.367199] Pid: 1955, comm: btrfs-transacti Not tainted
> 3.5.7-03050702-generic #201212170935
  ^^^^^ There have been significant ENOSPC fixes since this point. A
new kernel (3.7) will probably help some of the way -- see below for
some of the details.
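
   If you want to script the version check, here's a minimal sketch.
It assumes a sort that supports -V (version sort), as GNU coreutils
does; kernel_at_least is a made-up helper name, and 3.7 is simply the
version mentioned above.

```shell
# kernel_at_least WANT [HAVE] -> succeeds if HAVE (default: `uname -r`) is >= WANT.
# Hypothetical helper; relies on `sort -V` (version sort) from GNU coreutils.
kernel_at_least() {
    want=$1
    have=${2:-$(uname -r)}
    # The smaller of the two versions sorts first; if that is WANT, we're fine.
    [ "$(printf '%s\n%s\n' "$want" "$have" | sort -V | head -n 1)" = "$want" ]
}

if kernel_at_least 3.7; then
    echo "running kernel is 3.7 or later"
else
    echo "running kernel is older than 3.7"
fi
```

   Passing a second argument checks a specific version string instead
of the running kernel, e.g. kernel_at_least 3.7 3.5.7-03050702-generic
fails.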

   The other thing I'd like to check is what balance command you're
using. With your current problems, I'd suggest the following:

# btrfs balance start -dusage=5 /mountpoint

   This will attempt to move the data out of every data chunk which is
less than 5% full, so that the emptied chunks can be freed. With a 3.7
kernel (but not 3.5 IIRC(*)), that should free up most of the 60 GiB
of allocated but unused data space you've got.
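
   If the 5% pass doesn't reclaim much, one option is to repeat the
balance with progressively higher usage thresholds. A rough sketch
follows; the thresholds (5, 10, 20) and the helper name balance_steps
are illustrative choices of mine, and /mountpoint is a placeholder as
above. By default it only prints the commands.

```shell
# Sketch only: prints the balance commands by default instead of running them.
# Set RUN=1 to actually execute them (needs root and a mounted btrfs filesystem).
# balance_steps is a hypothetical helper; the thresholds are illustrative.
balance_steps() {
    mnt=$1
    shift
    for pct in "$@"; do
        if [ "${RUN:-0}" = "1" ]; then
            # Stop at the first failure rather than carrying on.
            btrfs balance start -dusage="$pct" "$mnt" || return 1
        else
            echo "btrfs balance start -dusage=$pct $mnt"
        fi
    done
}

balance_steps /mountpoint 5 10 20
```

   Starting low keeps each pass cheap, since only nearly-empty chunks
get rewritten; higher thresholds move more data and need more free
space to work with.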

   Hugo.

(*) Kernels before 3.5, possibly 3.6 -- I can't recall exactly when it
got fixed -- had a problem where they'd massively overallocate chunks.
With those earlier kernels, you could end up with a situation like
yours which wouldn't be helped by the balance operation.

-- 
=== Hugo Mills: hugo@... carfax.org.uk | darksatanic.net | lug.org.uk ===
  PGP key: 515C238D from wwwkeys.eu.pgp.net or http://www.carfax.org.uk
   --- Two things came out of Berkeley in the 1960s: LSD and Unix. ---   
                       This is not a coincidence.                        
