Re: [PATCH] Btrfs: reserve space specifically for the extent root

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Oct 30, 2009 at 11:24 AM, Josef Bacik <josef@xxxxxxxxxx> wrote:
> On Fri, Oct 30, 2009 at 09:48:55AM +0800, Yan, Zheng  wrote:
>> On Fri, Oct 30, 2009 at 4:35 AM, Josef Bacik <josef@xxxxxxxxxx> wrote:
>> > If there are alot of snapshots we could conceivably end up using much more
>> > metadata space than what the worst case calculation would come up with, so
>> > instead of trying to figure out how many extents will be modified by changing
>> > items, just calculate it per tree, and then reserve space specifically for the
>> > extent root.  This patch does a max(size of extent root tree, 20% of free
>> > metadata space) reservation for the extent root, and then tracks how much
>> > space is used per transaction for the extent root.  Then if we get close to
>> > this amount in our transaction we will force the commit of the transaction to
>> > free up space.  This will help us better deal with the massive space
>> > requirements that will come when users have alot of snapshots.  This was
>> > tested with my normal enospc tests.  Thanks,
>> >
>> > Signed-off-by: Josef Bacik <josef@xxxxxxxxxx>
>> > ---
>> >  fs/btrfs/ctree.h       |    2 +
>> >  fs/btrfs/extent-tree.c |  115 +++++++++++++++++++++++++++++++++++++++++-------
>> >  fs/btrfs/transaction.c |   19 +++++++-
>> >  fs/btrfs/transaction.h |    1 +
>> >  4 files changed, 119 insertions(+), 18 deletions(-)
>> >
>> > diff --git a/fs/btrfs/ctree.h b/fs/btrfs/ctree.h
>> > index e5dd628..3e00df1 100644
>> > --- a/fs/btrfs/ctree.h
>> > +++ b/fs/btrfs/ctree.h
>> > @@ -2049,6 +2049,8 @@ void btrfs_delalloc_reserve_space(struct btrfs_root *root, struct inode *inode,
>> >                                 u64 bytes);
>> >  void btrfs_delalloc_free_space(struct btrfs_root *root, struct inode *inode,
>> >                              u64 bytes);
>> > +int btrfs_check_reserved_extent_space(struct btrfs_trans_handle *trans,
>> > +                                     struct btrfs_root *root);
>> >  /* ctree.c */
>> >  int btrfs_bin_search(struct extent_buffer *eb, struct btrfs_key *key,
>> >                     int level, int *slot);
>> > diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
>> > index c56f916..7a1f1a4 100644
>> > --- a/fs/btrfs/extent-tree.c
>> > +++ b/fs/btrfs/extent-tree.c
>> > @@ -2768,6 +2768,12 @@ static u64 calculate_bytes_needed(struct btrfs_root *root, int num_items)
>> >        u64 num_bytes;
>> >        int level;
>> >
>> > +       /*
>> > +        * The level is the number of node level's we could possibly cow.  So
>> > +        * since we track root item reservations seperately we don't need that,
>> > +        * and we don't need to count the leaf level, so the max number of
>> > +        * levels that could have nodes is MAX_LEVEL - 2.
>> > +        */
>> >        level = BTRFS_MAX_LEVEL - 2;
>> >        /*
>> >         * NOTE: these calculations are absolutely the worst possible case.
>> > @@ -2776,17 +2782,11 @@ static u64 calculate_bytes_needed(struct btrfs_root *root, int num_items)
>> >         */
>> >
>> >        /*
>> > -        * for every item we insert we could insert both an extent item and a
>> > -        * extent ref item.  Then for ever item we insert, we will need to cow
>> > -        * both the original leaf, plus the leaf to the left and right of it.
>> > -        *
>> > -        * Unless we are talking about the extent root, then we just want the
>> > -        * number of items * 2, since we just need the extent item plus its ref.
>> > +        * For each item we need we'll assume that we're going to add a leaf for
>> > +        * it, and then we may cow the leaves on either side to make room for
>> > +        * the items we're inserting, so multiply by 3.
>> >         */
>> > -       if (root == root->fs_info->extent_root)
>> > -               num_bytes = num_items * 2;
>> > -       else
>> > -               num_bytes = (num_items + (2 * num_items)) * 3;
>> > +       num_bytes = num_items * 3;
>> >
>> >        /*
>> >         * num_bytes is total number of leaves we could need times the leaf
>> > @@ -3031,6 +3031,64 @@ out:
>> >  }
>> >
>> >  /*
>> > + * This checks how many bytes we've allocated for the extent tree vs. how much
>> > + * we reserved, and if we need to go ahead and force the transaction to commit.
>> > + */
>> > +int btrfs_check_reserved_extent_space(struct btrfs_trans_handle *trans,
>> > +                                     struct btrfs_root *root)
>> > +{
>> > +       struct btrfs_fs_info *info = root->fs_info;
>> > +       struct btrfs_space_info *meta_sinfo;
>> > +       u64 used;
>> > +       u64 alloc_target;
>> > +
>> > +       /* don't force a commit if this is an embedded transaction */
>> > +       if (current->journal_info != trans)
>> > +               return 0;
>> > +
>> > +       /* get the space info for where the metadata will live */
>> > +       alloc_target = btrfs_get_alloc_profile(root, 0);
>> > +       meta_sinfo = __find_space_info(info, alloc_target);
>> > +
>> > +       spin_lock(&meta_sinfo->lock);
>> > +
>> > +       if (unlikely(!meta_sinfo->bytes_root)) {
>> > +               u64 bytes;
>> > +
>> > +               /*
>> > +                * somebody should have already reserved space before starting
>> > +                * the transaction, so WARN_ON so we can fix that person.
>> > +                */
>> > +               WARN_ON(1);
>> > +
>> > +               used = meta_sinfo->bytes_used + meta_sinfo->bytes_reserved +
>> > +                       meta_sinfo->bytes_pinned + meta_sinfo->bytes_readonly +
>> > +                       meta_sinfo->bytes_super + meta_sinfo->bytes_root +
>> > +                       meta_sinfo->bytes_may_use + meta_sinfo->bytes_delalloc;
>> > +
>> > +               bytes = meta_sinfo->total_bytes - used;
>> > +               meta_sinfo->bytes_root = calculate_bytes_needed(root, 6);
>> > +               bytes *= 20;
>> > +               bytes = div64_u64(bytes, 100);
>> > +               bytes = max(bytes,
>> > +                           btrfs_root_used(&info->extent_root->root_item));
>>
>> This won't work since the used space accounting in root item isn't accuracy.
>> We can fix the bug and use btrfsck to fix the accounting. But it introduces
>> backwards compatibility issue. I think we can use 2% of total space as
>> a temporary substitution.
>>
> I didn't look too thoroughly at this, but it looked like the accounting was done
> properly, since we update it in btrfs_update_block_group.  Oh though I guess we
> could always be using the extent_root, is that the problem?  If you let me know
> what the issue is I will fix it tomorrow.  Thanks,

All space allocations and freeing are processed by delayed ref, so we always
pass extent_root  to update_block_group. The value of used space in root item
of extent tree should be close to total used space of the FS.

Yan, Zheng
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html

[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux