On Fri, Nov 20, 2015 at 08:29:06PM -0700, Jens Axboe wrote:
> On 11/20/2015 08:14 PM, Liu Bo wrote:
> >On Fri, Nov 20, 2015 at 07:26:45PM -0700, Jens Axboe wrote:
> >>On 11/20/2015 04:08 PM, Liu Bo wrote:
> >>>On Fri, Nov 20, 2015 at 02:30:43PM -0700, Jens Axboe wrote:
> >>>>On 11/20/2015 06:13 AM, Chris Mason wrote:
> >>>>>On Thu, Nov 19, 2015 at 05:49:37PM -0800, Liu Bo wrote:
> >>>>>>While running xfstests, this bug[1] is spotted by both btrfs/061 and
> >>>>>>btrfs/063: sub-stripe writes are gathered into the plug callback list
> >>>>>>in the hope that they can be combined into full stripe writes.
> >>>>>>
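For context, the raid56 code gathers these writes with a plug callback along
the following lines. This is an abbreviated, from-memory sketch of the
fs/btrfs/raid56.c code of that era, not the exact source; the shape is what
matters here:

    struct btrfs_plug_cb {
            struct blk_plug_cb cb;      /* generic plug callback, registered
                                         * via blk_check_plugged() */
            struct btrfs_fs_info *info;
            struct list_head rbio_list; /* gathered sub-stripe writes */
            struct btrfs_work work;
    };

    /* Runs when the plug is flushed; from_schedule is true when we are
     * being unplugged on the way into the scheduler. */
    static void btrfs_raid_unplug(struct blk_plug_cb *cb, bool from_schedule)
    {
            struct btrfs_plug_cb *plug;

            plug = container_of(cb, struct btrfs_plug_cb, cb);
            if (from_schedule) {
                    /* punt the merge/submit work to the rmw workers */
                    btrfs_init_work(&plug->work, btrfs_rmw_helper,
                                    unplug_work, NULL, NULL);
                    btrfs_queue_work(plug->info->rmw_workers, &plug->work);
                    return;
            }
            run_plug(plug); /* merge rbios and submit inline */
    }

The patch described below makes the btrfs_queue_work() path unconditional, so
the inline run_plug() is never called from the (atomic) flush context.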
> >>>>>>However, these plugged callbacks are processed in an atomic context:
> >>>>>>blk_sq_make_request() has disabled preemption via the get_cpu() in
> >>>>>>blk_mq_get_ctx().
> >>>>>>
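For reference, the ctx helpers in block/blk-mq.h looked roughly like this at
the time (abbreviated sketch, not the exact source):

    /* get_cpu() disables preemption, and it stays disabled until
     * blk_mq_put_ctx() runs put_cpu(). Everything in between,
     * including (before this fix) blk_flush_plug_list() and the
     * filesystem's unplug callbacks, runs in atomic context. */
    static inline struct blk_mq_ctx *blk_mq_get_ctx(struct request_queue *q)
    {
            return __blk_mq_get_ctx(q, get_cpu());
    }

    static inline void blk_mq_put_ctx(struct blk_mq_ctx *ctx)
    {
            put_cpu();
    }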
> >>>>>>This patch changes btrfs to always use btrfs_rmw_helper to complete
> >>>>>>the pending writes.
> >>>>>>
> >>>>>
> >>>>>Thanks Liu, but MD raid has the same trouble; we're not atomic in our unplugs.
> >>>>>
> >>>>>Jens?
> >>>>
> >>>>Yeah, blk-mq does have preemption disabled when it flushes, for the
> >>>>single-queue setup. That's a bug. Attached is an untested patch that
> >>>>should fix it; can you try it?
> >>>>
> >>>
> >>>It did run into a warning once in 50 tries, but that was not the atomic
> >>>warning, just another racy issue:
> >>>
> >>>WARNING: CPU: 2 PID: 8531 at fs/btrfs/ctree.c:1162 __btrfs_cow_block+0x431/0x610 [btrfs]()
> >>>
> >>>So overall the patch is good.
> >>>
> >>>>I'll rework this into a proper patch. I'm not convinced we want to add
> >>>>the new request before the flush; that might destroy merging
> >>>>opportunities. I'll unify the mq/sq parts.
> >>>
> >>>That's true. xfstests didn't show any performance difference, but that
> >>>doesn't prove anything.
> >>>
> >>>I'll test the new patch when you send it out.
> >>
> >>Try this one; it should retain the plug issue characteristics we care
> >>about as well.
> >
> >The test does not complain any more; thanks for the quick patch.
> >
> >Tested-by: Liu Bo <bo.li.liu@xxxxxxxxxx>
>
> Can I talk you into trying this one? It's simpler and does the same thing.
> We don't need to overcomplicate it; it's fine not to have preemption
> disabled while adding to the list.
Of course. I've run the test case with the patch 20 times with no more
warnings. Thanks for the fix.
Thanks,
-liubo
>
> --
> Jens Axboe
>
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 3ae09de62f19..6d6f8feb48c0 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -1291,15 +1291,16 @@ static blk_qc_t blk_mq_make_request(struct request_queue *q, struct bio *bio)
> blk_mq_bio_to_request(rq, bio);
>
> /*
> - * we do limited pluging. If bio can be merged, do merge.
> + * We do limited plugging. If the bio can be merged, do that.
> * Otherwise the existing request in the plug list will be
> * issued. So the plug list will have one request at most
> */
> if (plug) {
> /*
> * The plug list might get flushed before this. If that
> - * happens, same_queue_rq is invalid and plug list is empty
> - **/
> + * happens, same_queue_rq is invalid and plug list is
> + * empty
> + */
> if (same_queue_rq && !list_empty(&plug->mq_list)) {
> old_rq = same_queue_rq;
> list_del_init(&old_rq->queuelist);
> @@ -1380,12 +1381,15 @@ static blk_qc_t blk_sq_make_request(struct request_queue *q, struct bio *bio)
> blk_mq_bio_to_request(rq, bio);
> if (!request_count)
> trace_block_plug(q);
> - else if (request_count >= BLK_MAX_REQUEST_COUNT) {
> +
> + blk_mq_put_ctx(data.ctx);
> +
> + if (request_count >= BLK_MAX_REQUEST_COUNT) {
> blk_flush_plug_list(plug, false);
> trace_block_plug(q);
> }
> +
> list_add_tail(&rq->queuelist, &plug->mq_list);
> - blk_mq_put_ctx(data.ctx);
> return cookie;
> }
>
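With the hunk above applied, the tail of blk_sq_make_request() ends up
roughly as follows (read straight from the diff): the ctx, and with it
preemption, is released before the plug list is flushed.

    blk_mq_bio_to_request(rq, bio);
    if (!request_count)
            trace_block_plug(q);

    blk_mq_put_ctx(data.ctx);       /* put_cpu(): preemption back on */

    if (request_count >= BLK_MAX_REQUEST_COUNT) {
            blk_flush_plug_list(plug, false); /* callbacks run preemptible */
            trace_block_plug(q);
    }

    list_add_tail(&rq->queuelist, &plug->mq_list);
    return cookie;

The plug list is per-task, so, as Jens notes above, appending to it does not
need preemption disabled.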