7) When we do the actual write of this stripe, because it's a partial stripe
write (we aren't writing to all the pages of all the stripes of the full
stripe), we
need to read the remaining pages of stripe 2 (page indexes from 4 to 15) and
all the pages of stripe 1 from disk in order to compute the content for the
parity stripe. So we submit bios to read those pages from the corresponding
devices (we do this at raid56.c:raid56_rmw_stripe()).
The problem is that we assume whatever we read from the devices is valid -
Any idea why we have to assume here? Shouldn't the csum / parent transid
verification fail at this stage?
There is a raid1 test case [1] that reproduces this more consistently.
[1] https://patchwork.kernel.org/patch/11475417/
It looks like it's the result of avoiding updates to the generation for
nocsum file data modifications.
Thanks, Anand
in this case what we read from device 3, to which stripe 2 is mapped, is
invalid since in the degraded mount we haven't written extent buffer 39043072
to it - so we get garbage from that device (either a stale extent, a bunch of
zeroes due to trim/discard or anything completely random).
Then we compute the content for the parity stripe based on that invalid
content we read from device 3 and write the parity stripe (and the other two
stripes) to disk;
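To make the effect of step 7 concrete, here is a rough userspace sketch of the
read-modify-write described above. It is not the kernel code in raid56.c. It
assumes a 3-device RAID5 layout with plain XOR parity, 16 pages of 4 KiB per
stripe (matching the page indexes 4 to 15 mentioned above), and made-up fill
bytes for the page contents. All function and variable names are hypothetical.

```python
PAGE_SIZE = 4096          # assumed page size
PAGES_PER_STRIPE = 16     # assumed: page indexes 0..15 per stripe
WRITTEN_PAGES = 4         # the partial write covers page indexes 0..3


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equally sized buffers (RAID5-style parity)."""
    assert len(a) == len(b)
    n = len(a)
    return (int.from_bytes(a, "big") ^ int.from_bytes(b, "big")).to_bytes(n, "big")


def pages(fill: int, count: int) -> bytes:
    """Build `count` pages filled with one byte value (illustrative content)."""
    return bytes([fill]) * (PAGE_SIZE * count)


def simulate() -> dict:
    # State left behind by the degraded mount: stripe 1 and the parity were
    # written, but device 3 was missing and never got the valid stripe 2.
    stripe1 = pages(0x11, PAGES_PER_STRIPE)
    valid_stripe2 = pages(0x22, PAGES_PER_STRIPE)   # e.g. the extent buffer
    parity = xor_bytes(stripe1, valid_stripe2)
    dev3 = pages(0xEE, PAGES_PER_STRIPE)            # stale garbage on device 3

    # Before the partial write, stripe 2 is still recoverable from parity.
    recoverable_before = xor_bytes(parity, stripe1) == valid_stripe2

    # Partial stripe write: new data for pages 0..3 of stripe 2. The RMW path
    # reads pages 4..15 from device 3 and trusts whatever comes back.
    new_pages = pages(0x33, WRITTEN_PAGES)
    tail_from_dev3 = dev3[WRITTEN_PAGES * PAGE_SIZE:]
    rmw_stripe2 = new_pages + tail_from_dev3
    new_parity = xor_bytes(stripe1, rmw_stripe2)

    # What the stripe should have contained after the write.
    expected = new_pages + valid_stripe2[WRITTEN_PAGES * PAGE_SIZE:]
    recovered_after = xor_bytes(new_parity, stripe1)

    return {
        "recoverable_before": recoverable_before,
        "recoverable_after": recovered_after == expected,
        "tail_is_stale": recovered_after[WRITTEN_PAGES * PAGE_SIZE:] == tail_from_dev3,
    }


if __name__ == "__main__":
    for key, value in simulate().items():
        print(key, value)
```

In this sketch the valid content of pages 4 to 15 can be rebuilt from stripe 1
and the parity before the partial write, but not afterwards, because the new
parity was computed from the stale pages read from device 3.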