Re: dm-integrity + mdadm + btrfs = no journal?

On 1/30/19 4:26 PM, Christoph Anton Mitterer wrote:
> On Wed, 2019-01-30 at 07:58 -0500, Austin S. Hemmelgarn wrote:
>> Running dm-integrity without a journal is roughly equivalent to using
>> the nobarrier mount option (the journal is used to provide the same
>> guarantees that barriers do).  IOW, don't do this unless you are
>> willing to lose the whole volume.
> 
> That sounds a bit strange to me.
> 
> My understanding was that the idea of being able to disable the journal
> of dm-integrity was just to avoid any double work, if equivalent
> guarantees are already given by higher levels.
> 
> If btrfs is by itself already safe (by using barriers), then I'd have
> expected that no transaction is committed unless it got through all
> lower layers... so either everything works well on the dm-integrity
> base (and thus no journal is needed)... or it fails there... but then
> btrfs would already be safe by its own means (barriers + CoW)?

This. Exactly this.

The reason the dm-integrity journal has to be used is that data and the
checksum of that data get written in two different places. With the
journal, you always read back data with a matching checksum: either the
previous data or the new data.

https://arxiv.org/pdf/1807.00309.pdf
See Section 4.4 "Recovery on Write Failure".

"A device must provide atomic updating of both data and metadata.  A
situation in which one part is written to media while another part
failed must not occur."
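
To make the failure mode concrete, here is a minimal toy model in Python.
It is only a sketch under stated assumptions, not how dm-integrity is
implemented: the class ToyIntegrityDevice, its methods, the use of CRC32
as the tag, and the single-entry journal are all invented for
illustration. It shows that a crash between the data write and the tag
write leaves an unreadable sector unless a journal entry can be replayed.

```python
import zlib


class ToyIntegrityDevice:
    """Toy model: data sectors and their tags live in separate areas."""

    def __init__(self, sectors):
        self.data = [b"\0" * 8 for _ in range(sectors)]
        self.tags = [zlib.crc32(d) for d in self.data]
        self.journal = None  # at most one committed (sector, data, tag) entry

    def write_unjournaled(self, sector, payload, crash_after_data=False):
        self.data[sector] = payload
        if crash_after_data:
            return  # power lost: the tag still describes the old data
        self.tags[sector] = zlib.crc32(payload)

    def write_journaled(self, sector, payload, crash_after_data=False):
        # Assumption: the journal entry is durable before anything else changes.
        self.journal = (sector, payload, zlib.crc32(payload))
        self.data[sector] = payload
        if crash_after_data:
            return  # power lost: the journal entry survives for replay
        self.tags[sector] = zlib.crc32(payload)
        self.journal = None

    def recover(self):
        """Replay a committed journal entry, if any, after a crash."""
        if self.journal is not None:
            sector, payload, tag = self.journal
            self.data[sector] = payload
            self.tags[sector] = tag
            self.journal = None

    def read(self, sector):
        if zlib.crc32(self.data[sector]) != self.tags[sector]:
            raise OSError("integrity mismatch on sector %d" % sector)
        return self.data[sector]


# Without a journal, a crash between the two writes leaves a bad sector.
plain = ToyIntegrityDevice(4)
plain.write_unjournaled(1, b"AAAAAAAA", crash_after_data=True)
try:
    plain.read(1)
except OSError as err:
    print("unjournaled:", err)

# With a journal, recovery replays the entry and the read verifies.
journaled = ToyIntegrityDevice(4)
journaled.write_journaled(2, b"BBBBBBBB", crash_after_data=True)
journaled.recover()
print("journaled:", journaled.read(2))
```

In this model a crash before the journal entry is committed would simply
leave the old data and old tag in place, which is the "previous data"
case.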

Now, the great thing here is that btrfs does not overwrite disk data in
place. It writes out new data and metadata, and then the superblock. So,
e.g. on power loss, I don't care about whatever happened to writes that
are not visible because the superblock was never written? Btrfs will not
read these disk sectors back, because they are in unused space.
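
The same argument as a small Python sketch. This is a deliberately
simplified model, not btrfs code: ToyCowVolume, its single pointer
"superblock", and the bad_checksum set are hypothetical names, and it
assumes that flush ordering holds, i.e. the superblock is only updated
after the new blocks are stable.

```python
class ToyCowVolume:
    """Toy copy-on-write volume: the superblock points at the live block."""

    def __init__(self):
        self.blocks = {0: b"old data"}  # block number -> contents
        self.bad_checksum = set()       # blocks whose integrity tag mismatches
        self.superblock = 0             # block number of the live data
        self.reads = []                 # every block number that was read

    def cow_update(self, payload, crash_before_superblock=False):
        new_block = max(self.blocks) + 1  # never overwrite in place
        self.blocks[new_block] = payload
        if crash_before_superblock:
            # Power lost mid-write: assume the new block's tag is now wrong.
            self.bad_checksum.add(new_block)
            return new_block
        self.superblock = new_block
        return new_block

    def read_live(self):
        block = self.superblock
        self.reads.append(block)
        if block in self.bad_checksum:
            raise OSError("integrity mismatch on block %d" % block)
        return self.blocks[block]


vol = ToyCowVolume()
torn = vol.cow_update(b"new data", crash_before_superblock=True)
print(vol.read_live())    # still the old data
print(torn in vol.reads)  # the torn block was never read
```

The block with the mismatching tag exists on disk, but nothing reachable
from the superblock refers to it, so it is never read back.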

Also, it's not a write hole like in RAID56, because when "pulling the
plug" between writing out data and metadata, the checksums of older
existing data sectors are not corrupted, only new writes that were in
flight... I think... But the PDF still mentions (also in 4.4):
"Furthermore, metadata sectors are packed with tags for multiple
sectors; thus, a write failure must not cause an integrity validation
failure for other sectors". From the design, however, I cannot see how
this could happen.
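
For what it's worth, here is one way to picture what that sentence guards
against, as a Python sketch. It makes no claim that dm-integrity behaves
this way. The layout (TAGS_PER_SECTOR = 4), the PackedTagArea class and
the idea that a failed write turns a whole tag sector into garbage are
assumptions made purely for illustration.

```python
TAGS_PER_SECTOR = 4  # hypothetical packing factor


class PackedTagArea:
    """Toy metadata area: each tag sector holds tags for several data sectors."""

    def __init__(self, data_sectors):
        count = -(-data_sectors // TAGS_PER_SECTOR)  # ceiling division
        self.tag_sectors = [[0] * TAGS_PER_SECTOR for _ in range(count)]

    def locate(self, data_sector):
        # Returns (tag sector index, slot within that tag sector).
        return divmod(data_sector, TAGS_PER_SECTOR)

    def update(self, data_sector, tag, write_fails=False):
        index, slot = self.locate(data_sector)
        new_contents = list(self.tag_sectors[index])  # read-modify-write
        new_contents[slot] = tag
        if write_fails:
            # Assumed failure mode: the whole tag sector is left unreadable.
            new_contents = [None] * TAGS_PER_SECTOR
        self.tag_sectors[index] = new_contents

    def tag_for(self, data_sector):
        index, slot = self.locate(data_sector)
        return self.tag_sectors[index][slot]


area = PackedTagArea(8)
for sector in range(8):
    area.update(sector, 100 + sector)

area.update(1, 999, write_fails=True)  # failed update for data sector 1
print(area.tag_for(0))  # neighbour in the same tag sector: tag lost
print(area.tag_for(4))  # data sector in another tag sector: unaffected
```

Whether a write failure can really damage a whole tag sector like this is
exactly the open question. The sketch only shows why packing makes old,
untouched data sectors depend on the same metadata write.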

I asked about this on the dm-devel list a while ago, but the post never
got a reply.

Hans



