Re: Some very basic questions

Ric Wheeler wrote:
> I think that we do handle a failure in the case that you outline above
> since the FS will be able to notice the error before it sends a commit
> down (and that commit is wrapped in the barrier flush calls). This is
> the easy case since we still have the context for the IO.

I'm no FS guy, but for that to be true the FS would have to wait for
all outstanding IOs to finish before issuing a barrier, in which case
it doesn't actually need barriers at all - it can do the same with
flush_cache.
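
To make the ordering concrete, here is a rough userspace sketch. It
assumes os.fsync() as a stand-in for flush_cache; the helper names and
the single-file "journal" layout are made up for illustration and are
not how any real FS or the block layer does it.

```python
import os
import tempfile


def write_all(fd, buf):
    """Write the whole buffer, retrying on short writes."""
    view = memoryview(buf)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def commit_with_flush(path, data_blocks, commit_record):
    """Write data, wait and flush, and only then write the commit record.

    Hypothetical helper: os.fsync() stands in for flush_cache here.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    try:
        # Issue the data writes. write_all() returns only after each
        # write has completed, so nothing is outstanding after the loop.
        for block in data_blocks:
            write_all(fd, block)
        # Flush. An error here raises OSError while we still have the
        # context for the IO, so the commit below is never issued.
        os.fsync(fd)
        # Data is stable: write the commit record and flush again.
        write_all(fd, commit_record)
        os.fsync(fd)
    finally:
        os.close(fd)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmpdir:
        journal = os.path.join(tmpdir, "journal.bin")
        commit_with_flush(journal, [b"data-1", b"data-2"], b"COMMIT")
        with open(journal, "rb") as f:
            print(f.read())
```

The point is only the sequencing: wait for completion, flush, check
the result, then commit. No barrier primitive is involved.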

> It is more challenging (and kind of related) if the IO done in (4) has
> been ack'ed by drive, the drive later destages (not as part of the
> flush) its write cache and then an error happens. In this case, there is
> nothing waiting on the initiator side to receive the IO error. We have
> effectively lost the context for that IO.

IIUC, the error should be detectable from FLUSH whether or not the
destaging occurred as part of the flush, no?

> The only way to detect this is on replay (if the journal has checksums
> enabled or the error will be flagged as a media error).

If it's not reported on FLUSH, it basically amounts to silent data
corruption and only checksums can help.
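
As a minimal sketch of what checksums buy you on replay: the record
layout below (length + CRC32 + payload) and the function names are
assumptions for illustration, not the on-disk format of any real
journal.

```python
import struct
import zlib

HEADER = "<II"  # payload length, CRC32 of payload (little-endian u32s)
HEADER_SIZE = struct.calcsize(HEADER)


def encode_record(payload: bytes) -> bytes:
    """Prefix the payload with its length and CRC32."""
    return struct.pack(HEADER, len(payload), zlib.crc32(payload)) + payload


def replay(journal: bytes):
    """Yield payloads in order, stopping at the first bad record."""
    offset = 0
    while offset + HEADER_SIZE <= len(journal):
        length, crc = struct.unpack_from(HEADER, journal, offset)
        start = offset + HEADER_SIZE
        payload = journal[start:start + length]
        # A truncated or silently corrupted record is only caught here,
        # at replay time, because nothing reported it at write time.
        if len(payload) != length or zlib.crc32(payload) != crc:
            break
        yield payload
        offset = start + length


if __name__ == "__main__":
    journal = encode_record(b"txn-1") + encode_record(b"txn-2")
    damaged = bytearray(journal)
    damaged[-1] ^= 0xFF  # simulate a bad destage of the last block
    print(list(replay(journal)))
    print(list(replay(bytes(damaged))))
```

Replay stops at the first record whose checksum doesn't match, so the
damaged transaction is dropped instead of being applied.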

Thanks.

-- 
tejun
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html
