Re: [BUG] Btrfs scrub sometime recalculate wrong parity in raid5

On Sat, Jul 16, 2016 at 06:51:11PM +0300, Jarkko Lavinen wrote:
>  The modified script behaves very much like the original dd version.

Not quite. The bad sector simulation behaves like an old hard drive without error correction or bad block remapping: the bad sector can be neither read nor rewritten. This changes the error behaviour.
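For readers who have not used this kind of simulation, here is a minimal sketch of how a single permanently failing sector can be described with the device-mapper "linear" and "error" targets. It is not taken from h2.sh; the function name bad_sector_table, the backing device /dev/loop1 and the device size are hypothetical, and the sector number is only borrowed from the data1 log further down.

```shell
#!/bin/bash
# Hypothetical helper (not from h2.sh): print a device-mapper table that
# passes a device through unchanged except for one 512-byte sector, which
# fails every read and write.
# Arguments: backing device, device size in sectors, bad sector number.
bad_sector_table() {
    local dev=$1 size=$2 bad=$3
    local rest=$(( size - bad - 1 ))

    # Sectors before the bad one map straight onto the backing device.
    if [ "$bad" -gt 0 ]; then
        echo "0 $bad linear $dev 0"
    fi

    # The bad sector itself is handled by the "error" target.
    echo "$bad 1 error"

    # Sectors after the bad one map onto the backing device again.
    if [ "$rest" -gt 0 ]; then
        echo "$(( bad + 1 )) $rest linear $dev $(( bad + 1 ))"
    fi
}

# Assumed example values: a 204800-sector device with sector 120960 bad.
bad_sector_table /dev/loop1 204800 120960
```

As root, a table like this could be piped into dmsetup create with a mapping name of your choice; the filesystem is then built on the resulting /dev/mapper device instead of the loop device itself.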

My script now prints the kernel messages once check_fs fails. The messages cover the time from adding the bad sector device to the point where check_fs fails.

The parity test, which often passes with Goffredo's script, always fails with my bad sector version, and scrub says the error is uncorrectable. The kernel messages show two buffer I/O read errors but no write error, as if scrub quits before writing?
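To cross-check the kernel messages against the offsets printed by the script, the block and sector numbers can be converted to byte offsets. The sketch below assumes 4096-byte logical blocks in the "Buffer I/O error" lines and 512-byte sectors in the BTRFS warnings; the helper names are made up for illustration. With those assumptions the numbers agree with the disk:off values in the output below.

```shell
#!/bin/bash
# Hypothetical helpers: convert the units used in the kernel messages to
# byte offsets. Block and sector sizes are assumptions, see the text above.
block_to_bytes()  { echo $(( $1 * 4096 )); }
sector_to_bytes() { echo $(( $1 * 512 )); }

# Parity and data1 tests: logical block 15120, sector 120960.
block_to_bytes 15120      # 61931520
sector_to_bytes 120960    # 61931520

# data2 tests: logical block 19984, sector 159872.
block_to_bytes 19984      # 81854464
sector_to_bytes 159872    # 81854464
```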

In the data2 test scrub again says the error is uncorrectable, but according to the kernel messages the bad sector is read 4 times and written twice during the scrub. With my bad sector script data2 is still corrupted and the parity is OK, since the bad sector cannot be written and scrub likely quits earlier than with Goffredo's script. With his script data2 gets fixed but the parity gets corrupted.
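The read and write counts can be pulled out of the "errs:" lines mechanically. This is a small sketch rather than part of h2.sh; the function name last_err_counts is hypothetical, and the two sample lines are copied from the data2 log below.

```shell
#!/bin/bash
# Hypothetical helper: read kernel messages on stdin and print the last
# write/read error counters that btrfs reported for the given device.
last_err_counts() {
    local dev=$1
    grep -F "bdev $dev errs:" | tail -n 1 |
        sed -E 's/.*errs: wr ([0-9]+), rd ([0-9]+),.*/wr=\1 rd=\2/'
}

# Two lines taken from the data2 test output; prints "wr=2 rd=4".
last_err_counts /dev/mapper/loop2 <<'EOF'
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 1, rd 2, flush 0, corrupt 0, gen 0
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 2, rd 4, flush 0, corrupt 0, gen 0
EOF
```

On a live system the same function could read from dmesg instead of a here-document.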

Jarkko Lavinen

$ bash h2.sh
--- test 1: corrupt parity
scrub started on mnt/., fsid 2625e2d0-420c-40b6-befa-97fc18eaed48 (pid=32490)
ERROR: there are uncorrectable errors
******* Wrong data on disk:off /dev/mapper/loop0:61931520 (parity)
Data read ||, expected |0300 0303|

Kernel messages in the test
First Check_fs started
Buffer I/O error on dev dm-0, logical block 15120, async page read
Scrub started
Second Check_fs started
Buffer I/O error on dev dm-0, logical block 15120, async page read

--- test 2: corrupt data2
scrub started on mnt/., fsid 8e506268-16c7-48fa-b176-0a8877f2a7aa (pid=434)
ERROR: there are uncorrectable errors
******* Wrong data on disk:off /dev/mapper/loop2:81854464 (data2)
Data read ||, expected |bdbbb|

Kernel messages in the test
First Check_fs started
Buffer I/O error on dev dm-2, logical block 19984, async page read
Scrub started
BTRFS warning (device dm-0): i/o error at logical 142802944 on dev /dev/mapper/loop2, sector 159872, root 5, inode 257, offset 65536, length 4096, links 1 (path: out.txt)
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 1, rd 1, flush 0, corrupt 0, gen 0
BTRFS warning (device dm-0): i/o error at logical 142802944 on dev /dev/mapper/loop2, sector 159872, root 5, inode 257, offset 65536, length 4096, links 1 (path: out.txt)
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 1, rd 2, flush 0, corrupt 0, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 142802944 on dev /dev/mapper/loop2
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 2, rd 2, flush 0, corrupt 0, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 142802944 on dev /dev/mapper/loop2
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 2, rd 3, flush 0, corrupt 0, gen 0
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 2, rd 4, flush 0, corrupt 0, gen 0
Second Check_fs started
BTRFS info (device dm-0): bdev /dev/mapper/loop2 errs: wr 2, rd 4, flush 0, corrupt 0, gen 0
Buffer I/O error on dev dm-2, logical block 19984, async page read

--- test 3: corrupt data1
scrub started on mnt/., fsid f8a4ecca-2475-4e5e-9651-65d9478b56fe (pid=856)
ERROR: there are uncorrectable errors
******* Wrong data on disk:off /dev/mapper/loop1:61931520 (data1)
Data read ||, expected |adaaa|

Kernel messages in the test
First Check_fs started
Buffer I/O error on dev dm-1, logical block 15120, async page read
Scrub started
BTRFS warning (device dm-0): i/o error at logical 142737408 on dev /dev/mapper/loop1, sector 120960, root 5, inode 257, offset 0, length 4096, links 1 (path: out.txt)
BTRFS error (device dm-0): bdev /dev/mapper/loop1 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
BTRFS warning (device dm-0): i/o error at logical 142737408 on dev /dev/mapper/loop1, sector 120960, root 5, inode 257, offset 0, length 4096, links 1 (path: out.txt)
BTRFS error (device dm-0): bdev /dev/mapper/loop1 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
BTRFS error (device dm-0): bdev /dev/mapper/loop1 errs: wr 1, rd 2, flush 0, corrupt 0, gen 0
BTRFS error (device dm-0): unable to fixup (regular) error at logical 142737408 on dev /dev/mapper/loop1
BTRFS error (device dm-0): unable to fixup (regular) error at logical 142737408 on dev /dev/mapper/loop1
BTRFS error (device dm-0): bdev /dev/mapper/loop1 errs: wr 1, rd 3, flush 0, corrupt 0, gen 0
Second Check_fs started
BTRFS error (device dm-0): bdev /dev/mapper/loop1 errs: wr 1, rd 4, flush 0, corrupt 0, gen 0
BTRFS info (device dm-0): bdev /dev/mapper/loop1 errs: wr 1, rd 4, flush 0, corrupt 0, gen 0
Buffer I/O error on dev dm-1, logical block 15120, async page read

--- test 4: corrupt data2; read without scrub
******* Wrong data on disk:off /dev/mapper/loop2:81854464 (data2)
Data read ||, expected |bdbbb|

Kernel messages in the test
First Check_fs started
Buffer I/O error on dev dm-2, logical block 19984, async page read
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 0, rd 1, flush 0, corrupt 0, gen 0
Second Check_fs started
BTRFS error (device dm-0): bdev /dev/mapper/loop2 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
BTRFS info (device dm-0): bdev /dev/mapper/loop2 errs: wr 0, rd 2, flush 0, corrupt 0, gen 0
Buffer I/O error on dev dm-2, logical block 19984, async page read

Attachment: h2.sh
Description: Bourne shell script

