Re: Massive metadata size increase after upgrade from 3.2.18 to 3.4.1

On Thu, 14 Jun 2012 13:33:16 +0200
David Sterba <dave@xxxxxxxx> wrote:

> On Sat, Jun 09, 2012 at 01:38:22AM +0600, Roman Mamedov wrote:
> > Before the upgrade (on 3.2.18):
> > 
> > Metadata, DUP: total=9.38GB, used=5.94GB
> > 
> > After the FS has been mounted once with 3.4.1:
> > 
> > Data: total=3.44TB, used=2.67TB
> > System, DUP: total=8.00MB, used=412.00KB
> > System: total=4.00MB, used=0.00
> > Metadata, DUP: total=84.38GB, used=5.94GB
> > 
> > Where did my 75 GB of free space just go?
> 
> This is caused by the patch (credits for bisecting it go to Arne)
> 
> commit cf1d72c9ceec391d34c48724da57282e97f01122
> Author: Chris Mason <chris.mason@xxxxxxxxxx>
> Date:   Fri Jan 6 15:41:34 2012 -0500
> 
>     Btrfs: lower the bar for chunk allocation
> 
>     The chunk allocation code has tried to keep a pretty tight lid on creating new
>     metadata chunks.  This is partially because in the past the reservation
>     code didn't give us an accurate idea of how much space was being used.
> 
>     The new code is much more accurate, so we're able to get rid of some of these
>     checks.
> ---
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -3263,27 +3263,12 @@ static int should_alloc_chunk(struct btrfs_root *root,
>                 if (num_bytes - num_allocated < thresh)
>                         return 1;
>         }
> -
> -       /*
> -        * we have two similar checks here, one based on percentage
> -        * and once based on a hard number of 256MB.  The idea
> -        * is that if we have a good amount of free
> -        * room, don't allocate a chunk.  A good mount is
> -        * less than 80% utilized of the chunks we have allocated,
> -        * or more than 256MB free
> -        */
> -       if (num_allocated + alloc_bytes + 256 * 1024 * 1024 < num_bytes)
> -               return 0;
> -
> -       if (num_allocated + alloc_bytes < div_factor(num_bytes, 8))
> -               return 0;
> -
>         thresh = btrfs_super_total_bytes(root->fs_info->super_copy);
> 
> -       /* 256MB or 5% of the FS */
> -       thresh = max_t(u64, 256 * 1024 * 1024, div_factor_fine(thresh, 5));
> +       /* 256MB or 2% of the FS */
> +       thresh = max_t(u64, 256 * 1024 * 1024, div_factor_fine(thresh, 2));
> 
> -       if (num_bytes > thresh && sinfo->bytes_used < div_factor(num_bytes, 3))
> +       if (num_bytes > thresh && sinfo->bytes_used < div_factor(num_bytes, 8))
>                 return 0;
>         return 1;
>  }
> ---
> 
> Originally there were two kinds of check, one based on +256M and one on
> percentage. The former is removed, which leaves only the percentage
> thresholds. If less than 2% of the fs is actually used by metadata,
> the metadata is reserved up to exactly 2%. When actual usage goes over
> 2%, there's always at least 20% over-reservation,
> 
>    sinfo->bytes_used < div_factor(num_bytes, 8)
> 
> i.e. the threshold is 80%, which may be wasteful for a large fs.
> 
> So, the metadata chunks are immediately pinned to 2% of the filesystem
> after the first few writes, and this is what you observe.
> 
> Running balance will remove the unused metadata chunks, but only to the
> 2% level.
> 
> [end of analysis]
> 
> So what to do now? Simply reverting the +256M checks works and restores
> more or less the original behaviour.


Thanks.
So should I try restoring both of these, and leave the rest as is?

> -       if (num_allocated + alloc_bytes + 256 * 1024 * 1024 < num_bytes)
> -               return 0;
> -
> -       if (num_allocated + alloc_bytes < div_factor(num_bytes, 8))
> -               return 0;

Or would it make more sense to try rolling back that patch completely?
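
For reference, a minimal userspace sketch of the checks discussed above,
written only from the hunk quoted in this mail. It is not the kernel
function: the names fs_threshold(), removed_checks_say_no() and
tail_should_alloc() are made up for illustration, the arguments stand in
for the values the kernel reads from sinfo and the superblock, and the
earlier parts of should_alloc_chunk() that are not shown in the diff are
ignored.

```c
#include <stdint.h>

#define MIB (1024ULL * 1024ULL)

/* Userspace stand-ins for the kernel helpers: tenths and hundredths. */
static uint64_t div_factor(uint64_t num, int factor)
{
	return num * (uint64_t)factor / 10;
}

static uint64_t div_factor_fine(uint64_t num, int factor)
{
	return num * (uint64_t)factor / 100;
}

/* Post-patch threshold: 256MB or 2% of the FS, whichever is larger. */
static uint64_t fs_threshold(uint64_t fs_total)
{
	uint64_t pct = div_factor_fine(fs_total, 2);
	uint64_t min_thresh = 256 * MIB;

	return pct > min_thresh ? pct : min_thresh;
}

/*
 * The two checks the patch removed. Returns 1 when they would have
 * said "enough room, do not allocate a chunk" (the kernel's return 0).
 */
static int removed_checks_say_no(uint64_t num_bytes, uint64_t num_allocated,
				 uint64_t alloc_bytes)
{
	if (num_allocated + alloc_bytes + 256 * MIB < num_bytes)
		return 1;
	if (num_allocated + alloc_bytes < div_factor(num_bytes, 8))
		return 1;
	return 0;
}

/*
 * The tail of the function as it looks after the patch.
 * Returns 1 for "allocate a new chunk", 0 for "do not".
 */
static int tail_should_alloc(uint64_t fs_total, uint64_t num_bytes,
			     uint64_t bytes_used)
{
	uint64_t thresh = fs_threshold(fs_total);

	if (num_bytes > thresh && bytes_used < div_factor(num_bytes, 8))
		return 0;
	return 1;
}
```

With an assumed 4 TiB filesystem (a round number, not the size of mine),
10 GiB of metadata chunks and 6 GiB used, tail_should_alloc() returns 1
because 10 GiB is still below the 2% threshold, while
removed_checks_say_no() returns 1 for the same numbers, i.e. the old
checks would have stopped the allocation before the tail was reached.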

-- 
With respect,
Roman

~~~~~~~~~~~~~~~~~~~~~~~~~~~
"Stallman had a printer,
with code he could not see.
So he began to tinker,
and set the software free."
