The subject should reflect the problem being fixed.

On Fri, May 22, 2015 at 01:46:54PM +0800, Zhaolei wrote:
> From: Zhao Lei <zhaolei@xxxxxxxxxxxxxx>
>
> xfstests btrfs/070 sometimes failed.
> In my test machine, its fail rate is about 30%.
> In another vm(vmware), its fail rate is about 50%.
>
> Reason:
>   btrfs/070 do replace and defrag with fsstress simultaneously,
>   after above operation, checksum error is found by scrub.
>   Actually, it have no relationship with defrag operation, only
>   replace with fsstress can trigger this bug.
>   New data writen to target device have possibility rewrited by
>   old data from source device by replace code in debug, and can
>   be fixed by set chunk to ro in replace operation.

Please improve the description; it is very condensed, and it is unclear
how exactly the read-only/read-write changes happen.

> --- a/fs/btrfs/scrub.c
> +++ b/fs/btrfs/scrub.c
> @@ -3446,6 +3446,23 @@ int scrub_enumerate_chunks(struct scrub_ctx *sctx,
>  		if (!cache)
>  			goto skip;
>
> +		/*
> +		 * we need call btrfs_inc_block_group_ro() with scrubs_paused,
> +		 * to avoid deadlock caused by:
> +		 * btrfs_inc_block_group_ro()
> +		 * -> btrfs_wait_for_commit()
> +		 * -> btrfs_commit_transaction()
> +		 * -> btrfs_scrub_pause()
> +		 */
> +		atomic_inc(&fs_info->scrubs_paused);
> +		wake_up(&fs_info->scrub_pause_wait);
> +		btrfs_inc_block_group_ro(root, cache);
> +		mutex_lock(&fs_info->scrub_lock);
> +		__scrub_blocked_if_needed(fs_info);
> +		atomic_dec(&fs_info->scrubs_paused);
> +		mutex_unlock(&fs_info->scrub_lock);
> +		wake_up(&fs_info->scrub_pause_wait);

Please put that into a helper, similar to scrub_blocked_if_needed.

> +
>  		dev_replace->cursor_right = found_key.offset + length;
>  		dev_replace->cursor_left = found_key.offset;
>  		dev_replace->item_needs_writeback = 1;
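
One possible shape for such a helper is sketched below. It is a single-threaded userspace model, not kernel code. The helper name scrub_pause_and_inc_block_group_ro and every model_* stand-in are hypothetical. Only the order of the calls is taken from the quoted hunk. The model records each step so the ordering can be checked: the block group is made read-only while the scrub is counted as paused, and the pause count is dropped under the lock.

```c
#include <assert.h>
#include <stdatomic.h>

/* Event codes recorded by the model so the call order can be checked. */
enum ev { EV_WAKE, EV_INC_RO, EV_LOCK, EV_BLOCKED_CHECK, EV_UNLOCK };

/* Userspace stand-in for the fs_info fields used in the quoted hunk. */
struct fs_model {
	atomic_int scrubs_paused;	/* models fs_info->scrubs_paused */
	int scrub_lock_held;		/* models fs_info->scrub_lock */
	int paused_during_inc_ro;	/* scrubs_paused as seen inside inc_ro */
	enum ev trace[16];
	int ntrace;
};

static void record(struct fs_model *fs, enum ev e)
{
	assert(fs->ntrace < 16);
	fs->trace[fs->ntrace++] = e;
}

/* Stand-in for wake_up(&fs_info->scrub_pause_wait). */
static void model_wake_up(struct fs_model *fs)
{
	record(fs, EV_WAKE);
}

/* Stand-in for btrfs_inc_block_group_ro(); only bumps a counter here. */
static void model_inc_block_group_ro(struct fs_model *fs, int *ro_count)
{
	fs->paused_during_inc_ro = atomic_load(&fs->scrubs_paused);
	(*ro_count)++;
	record(fs, EV_INC_RO);
}

static void model_lock(struct fs_model *fs)
{
	assert(!fs->scrub_lock_held);
	fs->scrub_lock_held = 1;
	record(fs, EV_LOCK);
}

static void model_unlock(struct fs_model *fs)
{
	assert(fs->scrub_lock_held);
	fs->scrub_lock_held = 0;
	record(fs, EV_UNLOCK);
}

/* Stand-in for __scrub_blocked_if_needed(); must run with the lock held. */
static void model_blocked_if_needed(struct fs_model *fs)
{
	assert(fs->scrub_lock_held);
	record(fs, EV_BLOCKED_CHECK);
}

/*
 * Hypothetical helper: count the scrub as paused while the block group
 * is made read-only, then drop the pause count under the lock.
 */
static void scrub_pause_and_inc_block_group_ro(struct fs_model *fs,
						int *ro_count)
{
	atomic_fetch_add(&fs->scrubs_paused, 1);
	model_wake_up(fs);
	model_inc_block_group_ro(fs, ro_count);
	model_lock(fs);
	model_blocked_if_needed(fs);
	atomic_fetch_sub(&fs->scrubs_paused, 1);
	model_unlock(fs);
	model_wake_up(fs);
}
```

With a helper of this shape, the call site in scrub_enumerate_chunks() would shrink to a single call, and the deadlock comment from the hunk could move to the helper's definition.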
