On 27.05.19 г. 8:10 ч., Qu Wenruo wrote:
> In convert we use trans->block_reserved >= 4096 as a threshold to commit
> transaction, where block_reserved is the number of new tree blocks
> allocated inside a transaction.
>
> The problem is, we still have a hidden bug in delayed ref implementation
> in btrfs-progs, when we have a large enough transaction, delayed ref may
> failed to find certain tree blocks in extent tree and cause transaction
> abort.
>
Wouldn't it make more sense to actually debug that hidden bug? Can it be
reproduced by some synthetic test? Arguably, it should be a lot easier
to debug that in user space with gdb and all that ?
> This workaround will workaround it by committing transaction at a much
> lower threshold.
>
> The old 4096 means 4096 new tree blocks, when using default (16K)
> nodesize, it's 64M, which can contain over 12k inlined data extent or
> csum for around 60G, or over 800K file extents.
>
> The new threshold will limit the size of new tree blocks to 2M, aligning
> with the chunk preallocator threshold, and reducing the possibility to
> hit that delayed ref bug.
>
> Signed-off-by: Qu Wenruo <wqu@xxxxxxxx>
> ---
> convert/source-ext2.c | 13 ++++++++++++-
> 1 file changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/convert/source-ext2.c b/convert/source-ext2.c
> index a136e5652898..63cf71fe10d5 100644
> --- a/convert/source-ext2.c
> +++ b/convert/source-ext2.c
> @@ -829,7 +829,18 @@ static int ext2_copy_inodes(struct btrfs_convert_context *cctx,
> pthread_mutex_unlock(&p->mutex);
> if (ret)
> return ret;
> - if (trans->blocks_used >= 4096) {
> + /*
> + * blocks_used is the number of new tree blocks allocated in
> + * current transaction.
> + * Use a small amount of it to workaround a bug where delayed
> + * ref may fail to locate tree blocks in extent tree.
> + *
> + * 2M is the threshold to kick chunk preallocator into work,
> + * For default (16K) nodesize it will be 128 tree blocks,
> + * large enough to contain over 300 inlined files or
> + * around 26k file extents. Which should be good enough.
> + */
> + if (trans->blocks_used >= SZ_2M / root->fs_info->nodesize) {
> ret = btrfs_commit_transaction(trans, root);
> BUG_ON(ret);
> trans = btrfs_start_transaction(root, 1);
>