[PATCH v2] btrfs: raid56: data corruption on a device removal

A RAID5 or RAID6 filesystem might get corrupted in the following scenario:

1. Create a 4-disk RAID6 filesystem
2. Preallocate 16 files of 10G each
3. Run fio: 'fio --name=testload --directory=./ --size=10G
--numjobs=16 --bs=64k --iodepth=64 --rw=randrw --verify=sha256
--time_based --runtime=3600'
4. After a few minutes pull out two drives: 'echo 1 >
/sys/block/sdc/device/delete ;  echo 1 > /sys/block/sdd/device/delete'

In about 5 of 10 runs, the test led to silent data corruption of a
random stripe, resulting in 'IO Error' and 'csum failed' messages when
trying to read the affected file. It usually affects only a small
portion of the files.

It is possible that a few bios that were being processed during the
drive removal contained a non-zero bio->bi_iter.bi_done field despite
an EIO bi_status. The bi_sector field was also advanced from its
original value by that 'bi_done' amount. This looks like a quite rare
condition. Subsequently, in the raid_rmw_end_io handler, that failed
bio can be translated to a wrong stripe number and fail the wrong rbio.
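
The arithmetic behind this can be illustrated with a small user-space
sketch. It is not kernel code: struct model_iter, model_advance() and
model_orig_physical() are hypothetical names that only mimic the three
iterator fields involved, under the assumption that a partial
completion advances bi_sector by the completed bytes in 512-byte
sectors and records those bytes in bi_done.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified stand-in for the iterator fields discussed above (assumption). */
struct model_iter {
	uint64_t bi_sector;	/* current position, in 512-byte sectors */
	uint32_t bi_size;	/* bytes still to be transferred */
	uint32_t bi_done;	/* bytes already completed */
};

/* Model a partial completion of 'bytes' (a multiple of 512). */
static void model_advance(struct model_iter *it, uint32_t bytes)
{
	assert(bytes <= it->bi_size);
	it->bi_sector += bytes >> 9;
	it->bi_size -= bytes;
	it->bi_done += bytes;
}

/* Recover the byte offset the bio had when it was submitted. */
static uint64_t model_orig_physical(const struct model_iter *it)
{
	uint64_t physical = it->bi_sector;

	physical <<= 9;			/* sectors to bytes */
	physical -= it->bi_done;	/* undo the partial progress */
	return physical;
}
```

For example, a 64k bio submitted at sector 2048 that fails after 16k
has bi_sector == 2080; without the subtraction the lookup would use
byte offset 2080 << 9 instead of the submitted 2048 << 9.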

Reviewed-by: Johannes Thumshirn <jthumshirn@xxxxxxx>
Signed-off-by: Dmitriy Gorokh <dmitriy.gorokh@xxxxxxx>
---
 fs/btrfs/raid56.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/fs/btrfs/raid56.c b/fs/btrfs/raid56.c
index 3c8093757497..cd2038315feb 100644
--- a/fs/btrfs/raid56.c
+++ b/fs/btrfs/raid56.c
@@ -1451,6 +1451,12 @@ static int find_bio_stripe(struct btrfs_raid_bio *rbio,
  struct btrfs_bio_stripe *stripe;

  physical <<= 9;
+ /*
+  * Since the failed bio can return partial data, bi_sector might be
+  * incremented by that value. We need to revert it back to the
+  * state before the bio was submitted.
+  */
+ physical -= bio->bi_iter.bi_done;

  for (i = 0; i < rbio->bbio->num_stripes; i++) {
  stripe = &rbio->bbio->stripes[i];
-- 
2.17.0



