On 24/05/2020 21:56, David Sterba wrote:
> On Thu, May 14, 2020 at 11:24:15AM +0200, Johannes Thumshirn wrote:
>> +User-data
>> +~~~~~~~~~
>> +
>> +The checksums for the user or file-data are stored in a separate b-tree, the
>> +checksum tree. As this tree in itself is authenticated, only the data stored
>> +in it needs to be authenticated. This is done by replacing the checksums
>> +stored on disk by the cryptographically secure keyed hash algorithm used for
>> +the super-block and other meta-data. So each written file block will get
>> +checksummed with the authentication key and without supplying the correct key
>> +it is impossible to write data on disk, which can be read back without
>> +failing the authentication test. If this test is failed, an I/O error is
>> +reported back to the user.
>
> With same key K and same contents of data block B, the keyed hash on two
> different filesystems is the same. Ie. there's no per-filesystem salt
> (like a UUID) or per-transaction salt (generation, block address).
Correct.
>
> For metadata the per-transaction salt is inherently there as the hash is
> calculated with the header included (containing the increasing
> generation) and the filesystem UUID (available via blkid) or chunk tree
> UUID (not so easy to user to read).
>
> So there's an obvious discrepancy in the additional data besides the
> variable contents of the data and metadata blocks.
>
> The weakness of the data blocks may aid some attacks (I don't have a
> concrete suggestion where and how exatly).
Yes but wouldn't this also need a hash that is prone to a known plaintext
attack or that has known collisions? But it would probably help in
brute-forcing the key K of the filesystem. OTOH fsid, generation and the
chunk-tree UUID can be read in plaintext from the FS as well so this would
only mitigate a rainbow table like attack, wouldn't it?
>
> Suggested fix is to have a data block "header", with similar contents as
> the metadata blocks, eg.
>
> struct btrfs_hash_header {
> u8 fsid[BTRFS_FSID_SIZE];
> u8 chunk_tree_uuid[BTRFS_UUID_SIZE];
> __le64 generation;
> };
>
> Perhaps also with some extra item for future extensions, set to zeros
> for now.
>
This addition would be possible, yes. But if we'd add this header to every
checksum in the checksum tree it would be an incompatible on-disk format
change.
We could add this only for authenticated filesystems though, but would this
deviation make sense? I need to think more about it (and actually look at the
code to see how this could be done).