Re: [PATCH v2] btrfs: raid56: data corruption on a device removal

[+Cc Ming and full quote for reference]

On 04/01/2019 17:49, David Sterba wrote:
> On Fri, Dec 14, 2018 at 08:48:50PM +0300, Dmitriy Gorokh wrote:
>> RAID5 or RAID6 filesystem might get corrupted in the following scenario:
>>
>> 1. Create a 4-disk RAID6 filesystem
>> 2. Preallocate 16 10GB files
>> 3. Run fio: 'fio --name=testload --directory=./ --size=10G
>> --numjobs=16 --bs=64k --iodepth=64 --rw=randrw --verify=sha256
>> --time_based --runtime=3600'
>> 4. After a few minutes pull out two drives: 'echo 1 >
>> /sys/block/sdc/device/delete ;  echo 1 > /sys/block/sdd/device/delete'
>>
>> In about 5 out of 10 runs, the test led to silent data corruption
>> of a random stripe, resulting in ‘IO Error’ and ‘csum failed’
>> messages when trying to read the affected file. It usually affects
>> only a small portion of the files.
>>
>> It is possible that a few bios which were being processed during the
>> drive removal contained a non-zero bio->bi_iter.bi_done field despite
>> the EIO bi_status. The bi_sector field was also increased from its
>> original value by that 'bi_done' value. This looks like a quite rare
>> condition. Subsequently, in the raid_rmw_end_io handler, that failed
>> bio can be translated to a wrong stripe number and fail the wrong rbio.
>>
>> Reviewed-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
>> Signed-off-by: Dmitriy Gorokh <dmitriy.gorokh@xxxxxxx>
>> ---
>>  fs/btrfs/raid56.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
>> index 3c8093757497..cd2038315feb 100644
>> --- a/fs/btrfs/raid56.c
>> +++ b/fs/btrfs/raid56.c
>> @@ -1451,6 +1451,12 @@ static int find_bio_stripe(struct btrfs_raid_bio *rbio,
>>   struct btrfs_bio_stripe *stripe;
>>
>>   physical <<= 9;
>> + /*
>> +  * Since the failed bio can return partial data, bi_sector might be
>> +  * incremented by that value. We need to revert it back to the
>> +  * state before the bio was submitted.
>> +  */
>> + physical -= bio->bi_iter.bi_done;
> 
> The bi_done member has been removed in recent block layer changes
> commit 7759eb23fd9808a2e4498cf36a798ed65cde78ae ("block: remove
> bio_rewind_iter()"). I wonder what kind of block-magic do we need to do
> as the iterators seem to be local and there's nothing available in the
> call chain leading to find_bio_stripe. Johannes, any ideas?

Right, what we could do here is go the same way Ming did in
7759eb23fd980 ("block: remove bio_rewind_iter()") and save a bvec_iter
somewhere before submission and then see if we returned partial data,
but this somehow feels wrong to me (at least to do in btrfs code instead
of the block layer).
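
A rough userspace sketch of the "save the iterator before submission" idea
follows. It is not kernel code and not a proposed patch: every `model_*`
name, the 64 KiB stripe length, and the four-device layout are hypothetical
stand-ins for struct bio, struct bvec_iter and the stripe table that
find_bio_stripe walks. It only shows that a lookup using the advanced sector
can go astray (here it matches no stripe at all), while a lookup using the
saved start sector finds the stripe the bio was submitted to.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SECTOR_SHIFT 9
#define MODEL_STRIPE_LEN   (64u * 1024u)  /* assumed stripe length */

/* Hypothetical stand-in for struct bvec_iter: position and bytes left. */
struct model_iter { uint64_t sector; uint32_t size; };

/* Hypothetical stand-in for struct bio, plus a saved copy of the iterator. */
struct model_bio {
	struct model_iter iter;   /* live iterator, may advance */
	struct model_iter saved;  /* snapshot taken before submission */
	int dev;                  /* index of the target device */
	int failed;               /* nonzero models an EIO status */
};

struct model_stripe { uint64_t physical; int dev; };

/* Snapshot the iterator right before the bio is handed down. */
static void model_submit(struct model_bio *bio)
{
	bio->saved = bio->iter;
}

/* Model a bio that fails after done_bytes were already processed. */
static void model_fail_partial(struct model_bio *bio, uint32_t done_bytes)
{
	assert(done_bytes <= bio->iter.size);
	bio->iter.sector += done_bytes >> MODEL_SECTOR_SHIFT;
	bio->iter.size -= done_bytes;
	bio->failed = 1;
}

/* Range-and-device match, loosely modelled on find_bio_stripe. */
static int model_find_stripe(const struct model_stripe *stripes, size_t n,
			     uint64_t sector, int dev)
{
	uint64_t physical = sector << MODEL_SECTOR_SHIFT;

	for (size_t i = 0; i < n; i++) {
		if (dev == stripes[i].dev &&
		    physical >= stripes[i].physical &&
		    physical < stripes[i].physical + MODEL_STRIPE_LEN)
			return (int)i;
	}
	return -1;
}

/* Look up the stripe with either the saved or the advanced sector. */
static int model_demo_lookup(int use_saved)
{
	const struct model_stripe stripes[4] = {
		{ 1048576, 0 }, { 1048576, 1 }, { 1048576, 2 }, { 1048576, 3 },
	};
	struct model_bio bio = {
		.iter = { .sector = 1048576 >> MODEL_SECTOR_SHIFT,
			  .size = MODEL_STRIPE_LEN },
		.dev = 2,
	};

	model_submit(&bio);
	model_fail_partial(&bio, MODEL_STRIPE_LEN);

	return model_find_stripe(stripes, 4,
				 use_saved ? bio.saved.sector : bio.iter.sector,
				 bio.dev);
}
```

In this model, model_demo_lookup(0) uses the advanced sector and returns -1,
while model_demo_lookup(1) uses the saved sector and returns stripe index 2.
Whether such a snapshot belongs in btrfs or in the block layer is exactly the
open question here.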

Ming, can we resurrect ->bi_done, or do you have a different suggestion
for finding out about partially written bios?

-- 
Johannes Thumshirn                            SUSE Labs Filesystems
jthumshirn@xxxxxxx                                +49 911 74053 689
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: Felix Imendörffer, Jane Smithard, Graham Norton
HRB 21284 (AG Nürnberg)
Key fingerprint = EC38 9CAB C2C4 F25D 8600 D0D0 0393 969D 2D76 0850


