Re: [PATCH] Btrfs: correctly caculate item size used when item key collision happends

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 08/14/2018 11:05 AM, ethanwu wrote:
> Item key collision is allowed for some item types, like dir item and
> inode refs, but the overall item size is limited by the leafsize.
> 
> item size(ins_len) passed from btrfs_insert_empty_items to
> btrfs_search_slot already contains size of btrfs_item.
> 
> When btrfs_search_slot reaches leaf, we'll see if we need to split leaf,
> since the ins_len includes one struct btrfs_item, the check might
> fail even though new item we try to insert could merge into the existing
> one without adding new btrfs_item.
> And split_leaf return -EOVERFLOW from following code:
> if (extend && data_size + btrfs_item_size_nr(l, slot) +
>     sizeof(struct btrfs_item) > BTRFS_LEAF_DATA_SIZE(fs_info))
>     return -EOVERFLOW;
> 
> In most cases, when callers receive -EOVERFLOW, they either return
> this error or handle in different ways. For example, in normal dir item
> insertion the userspace will get errno EOVERFLOW; in inode ref case
> INODE_EXTREF is used instead if INODE_REF is full.
> 
> However, this is not the case for rename. To avoid the unrecoverable
> situation in rename, btrfs_check_dir_item_collision is called in
> early phase of rename. In this function, when item key collision is
> detected leaf space is checked:
> 
> data_size = sizeof(*di) + name_len;
> if (data_size + btrfs_item_size_nr(leaf, slot) +
>     sizeof(struct btrfs_item) > BTRFS_LEAF_DATA_SIZE(root->fs_info))
> 
> the sizeof(struct btrfs_item) + btrfs_item_size_nr(leaf, slot) here
> refers to existing item size.
> 
> The check condition here is not consistent with the btrfs_search_slot
> when item key collision happens. We might pass check here but fail
> at btrfs_search_slot.
> 
> in the rename call path
> btrfs_add_link
>   btrfs_insert_dir_item
>     insert_with_overflow
>       btrfs_insert_empty_items (btrfs item is counted)
>         btrfs_search_slot
> 
> if (ins_len > 0 &&
>     btrfs_leaf_free_space(fs_info, b) < ins_len) {
> 
> The ins_len here contains btrfs_item and the item data, but this
> btrfs_item is already in leaf used space, two btrfs_item is counted and
> we only need one when this is item key collision cases.
> Therfore, rename fails, and abort transaction is triggered with
> following error messages:
> BTRFS: error (device loop0) in btrfs_rename:9870: errno=-75 unknown
> 
> There are two ways to fix rename issue, one is to revert the patch
> 878f2d2cb355 Btrfs: fix max dir item size calculation
> to make the condition consistent.
> 
> The other way is to handle the leaf space check correctly when
> collision happens. I prefer the second one since it correct leaf
> space check in collision case. This fix needs unify the usage of ins_len
> in btrfs_search_slot to contain btrfs_item anyway and
> adjust all callers of btrfs_search_slot that intentionally pass ins_len
> without btrfs_item size to add size of btrfs_item from now.
> 
> dir item hash collision is not easy to reproduce.

I used this when testing dir_item related things:

http://crypto.junod.info/2012/12/13/hash-dos-and-btrfs/

> The following is a leaf sample filled with inode ref.
> 
> Before applying the patch, when item data reaches 16200
> and we want to add another link with namelen 26(inode ref size 36)
> It will not pass the leafspace check
> 10(inode ref item) + 26(name len) + 25(btrfs item) >
>     leaf free space 58
> and use BTRFS_INODE_EXTREF_KEY instead.
> Nevertheless, The 25 bytes btrfs_item is not needed because the
> newly inserted item could be merged with the existing one.
> 
> before patch:
> leaf 31571968 items 1 free space 58 generation 178 owner 262
> fs uuid 1abc143e-54af-491f-bff8-e58e21ad26e5
> chunk uuid 688bc1b5-5407-4f2d-9986-3dc3bf3019d3
>     item 0 key (261 INODE_REF 257) itemoff 83 itemsize 16200
>         inode ref index 504 namelen 26 name: abcdefghijklmnopqrstuv0001
>         inode ref index 505 namelen 26 name: abcdefghijklmnopqrstuv0002
>         inode ref index 506 namelen 26 name: abcdefghijklmnopqrstuv0003
>         ...
>         inode ref index 953 namelen 26 name: abcdefghijklmnopqrstuv0450
> 
> after patch:
> leaf 31899648 items 1 free space 22 generation 180 owner 262
> fs uuid 1abc143e-54af-491f-bff8-e58e21ad26e5
> chunk uuid 688bc1b5-5407-4f2d-9986-3dc3bf3019d3
>     item 0 key (263 INODE_REF 262) itemoff 47 itemsize 16236
>         inode ref index 504 namelen 26 name: abcdefghijklmnopqrstuv0001
>         inode ref index 505 namelen 26 name: abcdefghijklmnopqrstuv0002
>         inode ref index 506 namelen 26 name: abcdefghijklmnopqrstuv0003
>         ...
>         inode ref index 953 namelen 26 name: abcdefghijklmnopqrstuv0450
>         inode ref index 452 namelen 26 name: abcdefghijklmnopqrstuv0451
> 
> Signed-off-by: ethanwu <ethanwu@xxxxxxxxxxxx>
> ---
>  fs/btrfs/ctree.c       | 15 +++++++++++++--
>  fs/btrfs/extent-tree.c |  5 +++--
>  fs/btrfs/file-item.c   |  2 +-
>  3 files changed, 17 insertions(+), 5 deletions(-)
> 
> diff --git a/fs/btrfs/ctree.c b/fs/btrfs/ctree.c
> index 4bc326d..3614dd9 100644
> --- a/fs/btrfs/ctree.c
> +++ b/fs/btrfs/ctree.c
> @@ -2678,8 +2678,8 @@ static struct extent_buffer *btrfs_search_slot_get_root(struct btrfs_root *root,
>   * @p:		Holds all btree nodes along the search path
>   * @root:	The root node of the tree
>   * @key:	The key we are looking for
> - * @ins_len:	Indicates purpose of search, for inserts it is 1, for
> - *		deletions it's -1. 0 for plain searches
> + * @ins_len:	Indicates purpose of search, for inserts it is item size
> + *		including btrfs_item, for deletions it's -1. 0 for plain searches
>   * @cow:	boolean should CoW operations be performed. Must always be 1
>   *		when modifying the tree.
>   *
> @@ -2893,6 +2893,17 @@ int btrfs_search_slot(struct btrfs_trans_handle *trans, struct btrfs_root *root,
>  			}
>  		} else {
>  			p->slots[level] = slot;
> +			/*
> +			 * item key collision happens. In this case, if we are allow
> +			 * to insert the item(for example, in dir_item case, item key
> +			 * collision is allowed), it will be merged with the original
> +			 * item. Only the item size grows, no new btrfs item will be
> +			 * added. Since the ins_len already accounts the size btrfs_item,
> +			 * this value is counted twice. Duduct this value here so the
> +			 * leaf space check will be correct.
> +			 */
> +			if (ret == 0 && ins_len > 0)
> +				ins_len -= sizeof(struct btrfs_item);
>  			if (ins_len > 0 &&
>  			    btrfs_leaf_free_space(fs_info, b) < ins_len) {
>  				if (write_lock_level < 1) {
> diff --git a/fs/btrfs/extent-tree.c b/fs/btrfs/extent-tree.c
> index 3d9fe58..4e3aa09 100644
> --- a/fs/btrfs/extent-tree.c
> +++ b/fs/btrfs/extent-tree.c
> @@ -1094,7 +1094,7 @@ static int convert_extent_item_v0(struct btrfs_trans_handle *trans,
>  
>  	new_size -= sizeof(*ei0);
>  	ret = btrfs_search_slot(trans, root, &key, path,
> -				new_size + extra_size, 1);
> +				new_size + extra_size + sizeof(struct btrfs_item), 1);
>  	if (ret < 0)
>  		return ret;
>  	BUG_ON(ret); /* Corruption */
> @@ -1644,7 +1644,8 @@ int lookup_inline_extent_backref(struct btrfs_trans_handle *trans,
>  	}
>  
>  again:
> -	ret = btrfs_search_slot(trans, root, &key, path, extra_size, 1);
> +	ret = btrfs_search_slot(trans, root, &key, path,
> +			    extra_size + sizeof(struct btrfs_item), 1);
>  	if (ret < 0) {
>  		err = ret;
>  		goto out;
> diff --git a/fs/btrfs/file-item.c b/fs/btrfs/file-item.c
> index f9dd6d1..0b6c581 100644
> --- a/fs/btrfs/file-item.c
> +++ b/fs/btrfs/file-item.c
> @@ -804,7 +804,7 @@ int btrfs_csum_file_blocks(struct btrfs_trans_handle *trans,
>  	 */
>  	btrfs_release_path(path);
>  	ret = btrfs_search_slot(trans, root, &file_key, path,
> -				csum_size, 1);
> +				csum_size + sizeof(struct btrfs_item), 1);
>  	if (ret < 0)
>  		goto fail_unlock;
>  
> 


-- 
Hans van Kranenburg



[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux