On Wed, Jan 27, 2010 at 12:30 PM, Andi Kleen <andi@xxxxxxxxxxxxxx> wrote:
> Daniel J Blueman <daniel.blueman@xxxxxxxxx> writes:
>
>> For purposes of data deduplication and data synchronisation, it would
>> be a powerful tool to expose file data checksums.
>>
>> Since eg BTRFS uses the crc32c algorithm [1], it's possible to compute
>> the file's overall CRC from the accumulation of the CRCs from all its
>> extents' CRCs.
>>
>> For now, exposing this via an IOCTL may be sufficient, though any
>> ideas for introducing it in a more standard way? (it's a pity that
>> when stat64 was introduced, reserved fields weren't added)
>
> The problem of doing it in any "standard way" is that it would
> hard-code the way the file system does checksums in the applications.
> So the file system could never change it without breaking
> user space.

I guess the filesystem would need to express this in the resulting
data-structure, eg:
 - type 1 corresponds to using the crc32c algorithm with starting seed
   N and accumulating in ascending order over data extents, padding any
   modulus remainder and any sparse holes with 0
 - type 2 etc

The next question is: does filesystem (eg BTRFS) compression come
before or after checksumming?
--
Daniel J Blueman
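
To make the "type 1" idea above concrete, here is a rough sketch of what a
self-describing result could look like. Everything named here (FileChecksum,
CSUM_TYPE_CRC32C_ASCENDING, file_checksum, the seed default) is hypothetical
and is not an existing BTRFS interface or on-disk format. It also recomputes
the CRC over the data itself rather than combining stored per-extent CRCs,
treats the range up to the file size as a trailing hole, and leaves out
block-size remainder padding.

```python
from dataclasses import dataclass

CRC32C_POLY = 0x82F63B78  # Castagnoli polynomial, bit-reflected form


def _make_table():
    """Build the 256-entry lookup table for byte-at-a-time CRC32C."""
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ CRC32C_POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c_update(crc, data):
    """Feed bytes into a running CRC32C state (no initial/final inversion)."""
    for b in data:
        crc = _TABLE[(crc ^ b) & 0xFF] ^ (crc >> 8)
    return crc


# Hypothetical type tag: crc32c, ascending extents, holes padded with 0.
CSUM_TYPE_CRC32C_ASCENDING = 1


@dataclass
class FileChecksum:
    """Hypothetical self-describing result handed back to user space."""
    csum_type: int
    seed: int
    value: int


def file_checksum(extents, file_size, seed=0xFFFFFFFF):
    """Accumulate CRC32C over (offset, data) extents in ascending order.

    Assumes the extents do not overlap. Gaps between extents and the
    range from the last extent up to file_size count as holes and are
    fed in as zero bytes.
    """
    crc = seed
    pos = 0
    for offset, data in sorted(extents, key=lambda e: e[0]):
        if offset > pos:
            crc = crc32c_update(crc, bytes(offset - pos))  # sparse hole
        crc = crc32c_update(crc, data)
        pos = offset + len(data)
    if file_size > pos:
        crc = crc32c_update(crc, bytes(file_size - pos))  # trailing hole
    return FileChecksum(CSUM_TYPE_CRC32C_ASCENDING, seed, crc ^ 0xFFFFFFFF)


if __name__ == "__main__":
    result = file_checksum([(0, b"12345"), (5, b"6789")], 9)
    print(result.csum_type, hex(result.value))  # 1 0xe3069283
```

An application would look at csum_type first and only compare values whose
type it understands, so the filesystem stays free to add a "type 2" later
without breaking existing user space.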
