On Thu, 2015-03-12 at 12:07 +0800, Liu Bo wrote:
> checksum is updated along with the corresponding datablock's change, but
> updating the super block depends on when btrfs starts committing
> transaction.

Well, I guess it was clear that you don't update the whole chain on every single write =) (otherwise btrfs would write to it all the time)... but AFAIU it would be enough if everything is flushed out to disk at the end (i.e. when the user unmounts and the checksum is printed).

Of course, one general problem remains (which I didn't think of before): what happens in case the system crashes... then the user would have no way to know the last valid super-checksum. I'm not sure whether this can be easily solved... if at all.

One idea would be to make a snapshot of the whole fs in the beginning, so in case of a crash the user could at least go back to that validated state. But since snapshots are on subvolumes and not the whole fs, this wouldn't really work. And even if there were some way implemented to keep the pre-mount state of the fs (until unmount), this could cost a lot of space.

Hmm... I think this is actually a bigger problem for the whole idea; maybe it makes it unreasonable to use such functionality on "unattended" filesystems. And even when I sit next to my hard disk while the system crashes (where I can be sure that no one forged any data after the crash), one would still need to trust the storage device.

> > On mount one could then specify the expected checksum for the
> > superblock. If it differs already, then the mount should obviously fail
> > right away (or try backup superblocks and the like, but again only use
> > them if they match the sum).
>
> That's exactly what we have now in btrfs, but the superblock checksum is only to
> verify superblock itself, nothing more.

Sure, but that's obviously different and not enough for cryptographic integrity validation of *all* the data in the fs.
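Just to illustrate what I mean by the mount-time idea (toy Python, all names invented; and note the real btrfs superblock checksum is a crc32c over the superblock itself, not a SHA-256 as used here for the cryptographic variant I'm proposing): the user supplies the expected checksum at mount, and backup superblocks are only ever used when they, too, match it.

```python
import hashlib

def pick_superblock(copies, expected_hex):
    """Return (index, data) of the first superblock copy whose SHA-256
    matches the checksum supplied at mount time. Never fall back to an
    unverified copy - refuse the mount instead."""
    for i, sb in enumerate(copies):
        if hashlib.sha256(sb).hexdigest() == expected_hex:
            return i, sb
    raise IOError("no superblock copy matches the expected checksum "
                  "- refusing to mount")

# Toy data: the primary copy was tampered with, but a backup is intact.
good = b"superblock, generation 42"
expected = hashlib.sha256(good).hexdigest()
idx, sb = pick_superblock([b"forged superblock", good], expected)
print(idx)  # -> 1: the backup copy is the first one that verifies
```

The point being: a mismatching primary alone must not fail the mount, but a set of copies where *none* verifies must.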
> > Obviously one would also need an operation mode in which btrfs dies with
> > bells and red signs as soon as something (data or metadata) is read
> > afterwards, which doesn't match the expected checksums.
> > Again, RAID copies, etc. could of course be tried - but if no valid copy
> > is found, then it should assume compromise and the read operation should
> > error out and not deliver any data at all (it may be compromised and
> > nothing should use it).
>
> Okay, if you kept using btrfs for a while, you can figure out the above
> is basically true

Hehe... well, I've written about that some time ago; IMHO (and no one here should take offence at this) the documentation is in a suboptimal state, especially since there are still many things going on. In particular, what one could really expect from the system, things like:

- Over which data are checksums calculated, and what exactly happens on errors (both data and metadata)? End users do not necessarily know that btrfs will always verify checksums and, with RAID, look for a valid block and give back only such (unlike what MD or hardware RAID usually do).

- What happens if I abort defrag (e.g. Ctrl-C)? Will it abort cleanly? Will all data be lost? What if the system crashes during defrag or balance? Are these procedures so safe that they cannot lose data in these cases? Will the procedure continue automatically after reboot?

- Why does btrfs still need a log, and what is it used for?

Anyway... different topic ;)

> but there are some differences,
> a) if we find a metadata checksum mismatch, we do go to
> get another copy (we usually have two copies for metadata) to see if it
> is good, and if not, we don't make btrfs die but throw a warning (actually
> it can refuse to mount if you got such errors during mount and couldn't
> find good copies).

And what if bad metadata is found during normal operations (e.g. read/write) with no good copy being found?
Does it just give a warning and try to do its best to give back the read file / its attrs / permissions / etc. (which could already be a problem)? Or will it also fail to read any files/dirs/etc. for which that metadata would have been needed (e.g. their checksums)?

> > Of course people might still want to read such "compromised" blocks
> > (e.g. when they are sure that they've only suffered from accidental data
> > corruption and try to rescue as much as possible), but that should then
> > require a special mount option.
>
> So you're asking for a strict mode, but for datablock corruption, is it
> OK for you to just flip btrfs into readonly mode instead of making it
> die?

Not sure what you mean. If btrfs were operating in "normal" mode (i.e. not a kind of data-recovery mode explicitly chosen by the user at mount time), then it should fail to give back any data/metadata which it cannot verify.

I'm not sure whether it would need to go into read-only mode... perhaps it would be a good idea, because one could theoretically think about attacks against such an integrity-protected filesystem, e.g. with a layout like:

|
+-- trusted-keys.d
|   \-- ...
+-- revoked-keys.d
    \-- ...

If an attacker managed to corrupt just the metadata for some of the revoked keys, and the fs would simply not give them back, he could use that for tricky attacks. So indeed, it might be better to go into read-only mode when an unrecoverable validation error occurs... or perhaps even allow the user to choose a combination of remount-ro + kernel panic (some software might still continue to work even when the fs is read-only, and for some users it might be better to die completely than to use bad/insecure data).

Cheers,
Chris.
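P.S.: to make the strict-mode question above concrete, here's a toy userspace sketch (all names and policies invented, not actual btrfs code or mount options; plain crc32 stands in for btrfs's crc32c to keep it stdlib-only). The read path first tries every mirror, and only when no copy verifies does it apply the configured policy: best-effort recovery, remount read-only, or dying outright.

```python
import zlib

class FsDied(Exception):
    """Stand-in for a kernel panic in this userspace sketch."""

class Fs:
    def __init__(self, policy):
        # policy: one of "recover", "remount-ro", "panic" (invented names)
        self.policy = policy
        self.readonly = False

    def read_block(self, copies, stored_crc):
        """Return the first mirror whose crc32 matches the stored sum."""
        for i, data in enumerate(copies):
            if zlib.crc32(data) & 0xFFFFFFFF == stored_crc:
                if i > 0:
                    print("warning: mirror 0 bad, using mirror %d" % i)
                return data
        # No copy verified: apply the mount-time policy.
        if self.policy == "recover":
            print("warning: returning unverified data (recovery mount)")
            return copies[0]
        self.readonly = True  # stop accepting writes either way
        if self.policy == "panic":
            raise FsDied("unrecoverable checksum mismatch")
        raise IOError("checksum mismatch on all mirrors; fs is now read-only")

good = b"metadata node"
crc = zlib.crc32(good) & 0xFFFFFFFF
fs = Fs("remount-ro")
assert fs.read_block([b"bitrot", good], crc) == good  # second mirror saves it
```

In strict ("remount-ro"/"panic") modes no unverified byte ever escapes the read path, which is exactly the property I'd want; "recover" would be the special opt-in mount for salvage work.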