On 2012-07-03 14:38 NeilBrown <neilb@xxxxxxx> Wrote:
>On Tue, 3 Jul 2012 14:23:41 +0800 majianpeng <majianpeng@xxxxxxxxx> wrote:
>
>> Because bios will merge at block-layer,so bios-error may caused by other
>> bio which be merged into to the same request.
>> Using this flag,it will find exactly error-sector and not do redundant
>> operation like re-write and re-read.
>>
>> Signed-off-by: majianpeng <majianpeng@xxxxxxxxx>
>
>Hi,
> I think this patch needs a more detailed explanation.
>
>What exactly is the situation that causes a problem, and what exactly is the
>problem that it causes?
>
>Pretend that I don't know anything about what happens below the md level..
>
>Thanks,
>NeilBrown
>
>
>> ---
>> block/blk-core.c | 8 ++++++++
>> drivers/md/raid5.c | 16 +++++++++++++---
>> drivers/md/raid5.h | 1 +
>> 3 files changed, 22 insertions(+), 3 deletions(-)
>>
>> diff --git a/block/blk-core.c b/block/blk-core.c
>> index 3c923a7..ee04bfc 100644
>> --- a/block/blk-core.c
>> +++ b/block/blk-core.c
>> @@ -1401,6 +1401,9 @@ void init_request_from_bio(struct request *req, struct bio *bio)
>> if (bio->bi_rw & REQ_RAHEAD)
>> req->cmd_flags |= REQ_FAILFAST_MASK;
>>
>> + if (unlikely(bio->bi_rw & REQ_NOMERGE))
>> + req->cmd_flags |= REQ_NOMERGE;
>> +
>> req->errors = 0;
>> req->__sector = bio->bi_sector;
>> req->ioprio = bio_prio(bio);
>> @@ -1428,6 +1431,11 @@ void blk_queue_bio(struct request_queue *q, struct bio *bio)
>> goto get_rq;
>> }
>>
>> + if (unlikely(bio->bi_rw & REQ_NOMERGE)) {
>> + spin_lock_irq(q->queue_lock);
>> + where = ELEVATOR_INSERT_BACK;
>> + goto get_rq;
>> + }
>> /*
>> * Check if we can merge with the plugged list before grabbing
>> * any locks.
>> diff --git a/drivers/md/raid5.c b/drivers/md/raid5.c
>> index d267672..04f78d2 100644
>> --- a/drivers/md/raid5.c
>> +++ b/drivers/md/raid5.c
>> @@ -632,6 +632,9 @@ static void ops_run_io(struct stripe_head *sh, struct stripe_head_state *s)
>> else
>> bi->bi_sector = (sh->sector
>> + rdev->data_offset);
>> + if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
>> + bi->bi_rw |= REQ_NOMERGE;
>> +
>> bi->bi_flags = 1 << BIO_UPTODATE;
>> bi->bi_idx = 0;
>> bi->bi_io_vec[0].bv_len = STRIPE_SIZE;
>> @@ -1731,7 +1734,9 @@ static void raid5_end_read_request(struct bio * bi, int error)
>> atomic_add(STRIPE_SECTORS, &rdev->corrected_errors);
>> clear_bit(R5_ReadError, &sh->dev[i].flags);
>> clear_bit(R5_ReWrite, &sh->dev[i].flags);
>> - }
>> + } else if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags))
>> + clear_bit(R5_ReadNoMerge, &sh->dev[i].flags);
>> +
>> if (atomic_read(&rdev->read_errors))
>> atomic_set(&rdev->read_errors, 0);
>> } else {
>> @@ -1773,7 +1778,11 @@ static void raid5_end_read_request(struct bio * bi, int error)
>> else
>> retry = 1;
>> if (retry)
>> - set_bit(R5_ReadError, &sh->dev[i].flags);
>> + if (test_bit(R5_ReadNoMerge, &sh->dev[i].flags)) {
>> + set_bit(R5_ReadError, &sh->dev[i].flags);
>> + clear_bit(R5_ReadNoMerge, &sh->dev[i].flags);
>> + } else
>> + set_bit(R5_ReadNoMerge, &sh->dev[i].flags);
>> else {
>> clear_bit(R5_ReadError, &sh->dev[i].flags);
>> clear_bit(R5_ReWrite, &sh->dev[i].flags);
>> @@ -4481,7 +4490,8 @@ static int retry_aligned_read(struct r5conf *conf, struct bio *raid_bio)
>> conf->retry_read_aligned = raid_bio;
>> return handled;
>> }
>> -
>> + if (likely(raid_bio->bi_size >> 9) > STRIPE_SECTORS)
>> + set_bit(R5_ReadNoMerge, &sh->dev[dd_idx].flags);
>> handle_stripe(sh);
>> release_stripe(sh);
>> handled++;
>> diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
>> index 2164021..6767d07 100644
>> --- a/drivers/md/raid5.h
>> +++ b/drivers/md/raid5.h
>> @@ -273,6 +273,7 @@ enum r5dev_flags {
>> R5_Wantwrite,
>> R5_Overlap, /* There is a pending overlapping request
>> * on this block */
>> + R5_ReadNoMerge, /* prevent bio from merging in block-layer */
>> R5_ReadError, /* seen a read error here recently */
>> R5_ReWrite, /* have tried to over-write the readerror */
>>
>
>
How about the below explanation:
Because bio will be merged at block-layer,so bios-error may caused by other
bio which be merged into to the same request.
For example: if chunk_aligned_read failed, it will add bio to some stipe.
But because the bio-merge function,those bios at most be merged.It will like
the chunk_aligned_read and returen error.Then it will re-write and re-read.
Suppose RAID5 created by n disk and chunk-size is 512K.
If read 512k chunk_aligned and met error(sector 0 is media error),
then add 512/4=128 stipe.If those bios merged and must be error.The rewrite operation
will read (n-1) * 128 and computer 128 stripe.But using this flag,we only exec one rewrite.
This occur in resync/repair situation and chunk-aligned-read.
May be using this flag,we can find exact bad-sector.Not recorded the whole request bios
as bad-sector.?韬{.n?????%??檩??w?{.n???{炳盯w???塄}?财??j:+v??????2??璀??摺?囤??z夸z罐?+?????w棹f
[ATA RAID]
[Linux SCSI Target Infrastructure]
[Managing RAID on Linux]
[Linux IDE]
[Linux SCSI]
[Linux Hams]
[Device-Mapper]
[Kernel]
[Linux Books]
[Linux Admin]
[Linux Net]
[GFS]
[RPM]
[git]
[Photos]
[Yosemite Photos]
[Yosemite News]
[AMD 64]
[Linux Networking]