On Tue, 22 Oct 2013 18:55:59 +0200, Bob Marley wrote:
> On 22/10/2013 10:37, Stefan Behrens wrote:
>> I don't believe that this issue can ever happen. I don't believe that
>> somewhere on the path to the flash memory, to the magnetic disc or to
>> the drive's cache memory, someone interrupts a 4KB write in the middle
>> of operation to read from this 4KB area. This is not an issue IMHO.
>
> I think I have read that unfortunately it can happen.
> SAS and SATA specs for disks do not mandate that if a write is in-flight
> but still not completed, reads from the same sector should return the
> value it is being written; they can return the old value.
> I also think that Linux does not check either.

If the _old_ 4KB block is returned, that's fine and won't cause a
checksum error. The patch in question addresses the case that Btrfs
submits a write request for a 4KB block, and a concurrent read request
for that 4KB block reads partially the old block and partially the new
block, resulting in a checksum error reported in the scrub statistic
counters.

> Much worse, I think I have even read that two simultaneous in-flight
> writes to the same sector can be completed in any order by the disk, and
> since the write which wins is the latter being completed, this results
> in an indeterminate value persisting on that sector at the end. One
> needs to synchronize cache between the two writes to guarantee the
> outcome. Way worse is when the drives also cheat on synchronize cache,
> and that one is impossible to fix I believe.

Two simultaneous in-flight writes to the same superblock cannot happen
in Btrfs.
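
To make the torn-read case above concrete, here is a minimal sketch in
Python. It is not Btrfs code and not taken from the patch. It assumes a
4KB block that carries its own checksum in its first 4 bytes, and it uses
zlib.crc32 as a stand-in for the real checksum function. The names
make_block, verify and torn_read are hypothetical helpers for this
illustration only.

```python
import zlib

BLOCK_SIZE = 4096   # the 4KB block size discussed above
CSUM_SIZE = 4       # assumption: a 4-byte CRC stored at the start of the block


def make_block(fill: int) -> bytes:
    """Build a block whose header holds the checksum of its payload."""
    payload = bytes([fill]) * (BLOCK_SIZE - CSUM_SIZE)
    csum = zlib.crc32(payload).to_bytes(CSUM_SIZE, "little")
    return csum + payload


def verify(block: bytes) -> bool:
    """Return True if the stored checksum matches the payload."""
    stored = int.from_bytes(block[:CSUM_SIZE], "little")
    return stored == zlib.crc32(block[CSUM_SIZE:])


def torn_read(old: bytes, new: bytes, split: int) -> bytes:
    """Model a read that sees new data up to `split` and old data after it."""
    return new[:split] + old[split:]


old_block = make_block(0xAA)   # contents before the write
new_block = make_block(0x55)   # contents being written
torn_block = torn_read(old_block, new_block, BLOCK_SIZE // 2)

print("old block verifies: ", verify(old_block))
print("new block verifies: ", verify(new_block))
print("torn block verifies:", verify(torn_block))
```

The old block and the new block are each self-consistent, so a read that
returns either one in full passes verification. Only the mixed block
fails, which is the case that shows up in the scrub statistic counters.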
