Re: [PATCH] Btrfs: fix a bug of sleeping in atomic context

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, Nov 20, 2015 at 08:29:06PM -0700, Jens Axboe wrote:
> On 11/20/2015 08:14 PM, Liu Bo wrote:
> >On Fri, Nov 20, 2015 at 07:26:45PM -0700, Jens Axboe wrote:
> >>On 11/20/2015 04:08 PM, Liu Bo wrote:
> >>>On Fri, Nov 20, 2015 at 02:30:43PM -0700, Jens Axboe wrote:
> >>>>On 11/20/2015 06:13 AM, Chris Mason wrote:
> >>>>>On Thu, Nov 19, 2015 at 05:49:37PM -0800, Liu Bo wrote:
> >>>>>>while xfstesting, this bug[1] is spotted by both btrfs/061 and btrfs/063,
> >>>>>>so those sub-stripe writes are gatherred into plug callback list and
> >>>>>>hopefully we can have a full stripe writes.
> >>>>>>
> >>>>>>However, while processing these plugged callbacks, it's within an atomic
> >>>>>>context which is provided by blk_sq_make_request() because of a get_cpu()
> >>>>>>in blk_mq_get_ctx().
> >>>>>>
> >>>>>>This changes to always use btrfs_rmw_helper to complete the pending writes.
> >>>>>>
> >>>>>
> >>>>>Thanks Liu, but MD raid has the same troubles, we're not atomic in our unplugs.
> >>>>>
> >>>>>Jens?
> >>>>
> >>>>Yeah, blk-mq does have preemption disabled when it flushes, for the single
> >>>>queue setup. That's a bug. Attached is an untested patch that should fix it,
> >>>>can you try it?
> >>>>
> >>>
> >>>Although it runs into a warning one time of 50 tries, that was not atomic warning but another racy issue.
> >>>
> >>>WARNING: CPU: 2 PID: 8531 at fs/btrfs/ctree.c:1162 __btrfs_cow_block+0x431/0x610 [btrfs]()
> >>>
> >>>So overall the patch is good.
> >>>
> >>>>I'll rework this to be a proper patch, not convinced we want to add the new
> >>>>request before flush, that might destroy merging opportunities. I'll unify
> >>>>the mq/sq parts.
> >>>
> >>>That's true, xfstests didn't notice any performance difference but that cannot prove anything.
> >>>
> >>>I'll test the new patch when you send it out.
> >>
> >>Try this one, that should retain the plug issue characteristics we care
> >>about as well.
> >
> >The test does not complain any more, thank for the quick patch.
> >
> >Tested-by: Liu Bo <bo.li.liu@xxxxxxxxxx>
> 
> Can I talk you into trying this one? It's simpler, does the same thing. We
> don't need to overcomplicate it, it's fine not having preempt disabled for
> adding to the list.

Of course, I've run the case with the patch for 20 times, no more
warnings, thanks for the fix.

Thanks,

-liubo

> 
> -- 
> Jens Axboe
> 

> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 3ae09de62f19..6d6f8feb48c0 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1291,15 +1291,16 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
>  		blk_mq_bio_to_request(rq, bio);
>  
>  		/*
> -		 * we do limited pluging. If bio can be merged, do merge.
> +		 * We do limited pluging. If the bio can be merged, do that.
>  		 * Otherwise the existing request in the plug list will be
>  		 * issued. So the plug list will have one request at most
>  		 */
>  		if (plug) {
>  			/*
>  			 * The plug list might get flushed before this. If that
> -			 * happens, same_queue_rq is invalid and plug list is empty
> -			 **/
> +			 * happens, same_queue_rq is invalid and plug list is
> +			 * empty
> +			 */
>  			if (same_queue_rq && !list_empty(&plug->mq_list)) {
>  				old_rq = same_queue_rq;
>  				list_del_init(&old_rq->queuelist);
> @@ -1380,12 +1381,15 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
>  		blk_mq_bio_to_request(rq, bio);
>  		if (!request_count)
>  			trace_block_plug(q);
> -		else if (request_count >= BLK_MAX_REQUEST_COUNT) {
> +
> +		blk_mq_put_ctx(data.ctx);
> +
> +		if (request_count >= BLK_MAX_REQUEST_COUNT) {
>  			blk_flush_plug_list(plug, false);
>  			trace_block_plug(q);
>  		}
> +
>  		list_add_tail(&rq->queuelist, &plug->mq_list);
> -		blk_mq_put_ctx(data.ctx);
>  		return cookie;
>  	}
>  

--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Filesystem Development]     [Linux NFS]     [Linux NILFS]     [Linux USB Devel]     [Linux Audio Users]     [Yosemite News]     [Linux Kernel]     [Linux SCSI]

  Powered by Linux