On Fri, Oct 02, 2015 at 09:06:17AM +0200, - wrote:
> hey everybody.
>
> 1. Deduplication
> "When given a list of files it will hash their contents on a block
> by block basis" - are those static blocks, or is the length of a
> block defined by its content? (That would be more resilient
> regarding inserts of data and the shift of the following data caused
> by them.)

"Block" generally refers to a 4 KiB piece of data. In btrfs,
contiguous blocks form extents, and a file may be made up of zero or
more extents stored in different places on the disk.

> Do I understand correctly that I'd better use bedup to deduplicate
> on a file level before using duperemove to deduplicate on a block
> level?

If you really do block-level dedup, then it would make little sense
to do file-level dedup afterwards. As I understand it, duperemove
actually works on an extent level, rather than a block level. (Note
that block-level deduplication is likely to lead to truly appalling
performance, due to the large amount of seeking it causes.)

> 2. Redundancy
> "Background scrub process for finding and fixing errors on files
> with redundant copies"
> So fixing is only available with full redundancy?! No parity-based
> methods or other ECC-based approaches?!

Correct. The checksums used for this are (currently) 32 bits for each
4 KiB block. There's simply not enough information there to do
anything other than detect (most) errors. If you want parity-based
repair of errors, use RAID-5 storage.

> thanks for answers
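To make the "hash their contents on a block by block basis" idea
concrete, here's a minimal Python sketch of fixed-size block hashing,
the kind of first pass a block-level dedup tool would do. This is
illustrative only -- it is not duperemove's actual code, and the
function names are made up:

    import hashlib
    from collections import defaultdict

    BLOCK_SIZE = 4096  # btrfs block size: 4 KiB

    def block_hashes(path):
        """Yield (offset, digest) for each fixed-size block of a file."""
        with open(path, "rb") as f:
            offset = 0
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                yield offset, hashlib.sha256(block).hexdigest()
                offset += len(block)

    def find_duplicate_blocks(paths):
        """Map each block digest to every (file, offset) that holds it."""
        seen = defaultdict(list)
        for path in paths:
            for offset, digest in block_hashes(path):
                seen[digest].append((path, offset))
        return {d: locs for d, locs in seen.items() if len(locs) > 1}

This also shows what the original question is getting at: with fixed
4 KiB boundaries, inserting a single byte early in a file shifts every
later block, so none of the downstream blocks hash the same afterwards.
Content-defined chunking avoids that problem, but btrfs blocks are
fixed-size.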
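The file-level pass is, conceptually, just whole-file hashing. A
sketch of the idea (again, not bedup's actual implementation; a real
tool would short-circuit on file size before hashing anything, and the
actual space sharing is done afterwards via the kernel's clone /
extent-same ioctls, which this doesn't attempt):

    import hashlib
    from collections import defaultdict

    def file_digest(path, chunk=1 << 20):
        """Hash a whole file in 1 MiB chunks (streaming, constant memory)."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            while True:
                data = f.read(chunk)
                if not data:
                    break
                h.update(data)
        return h.hexdigest()

    def group_identical_files(paths):
        """Group paths whose full contents hash identically."""
        groups = defaultdict(list)
        for p in paths:
            groups[file_digest(p)].append(p)
        return [g for g in groups.values() if len(g) > 1]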
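And to illustrate why a 32-bit checksum per 4 KiB block can only
detect errors, not fix them: the checksum tells you a copy is bad, but
it carries far too little information to say which of the 32768 bits
flipped. "Repair" therefore means reading a redundant copy and keeping
the one whose checksum verifies. A sketch, using CRC32 as a stand-in
for the filesystem's per-block checksum:

    import zlib

    BLOCK_SIZE = 4096

    def check(block, stored_csum):
        """Detect corruption: recompute the 32-bit CRC and compare."""
        return zlib.crc32(block) & 0xFFFFFFFF == stored_csum

    # A good block, and a stored checksum for it.
    good = bytes(BLOCK_SIZE)
    csum = zlib.crc32(good) & 0xFFFFFFFF

    # Flip one bit: detection works...
    bad = bytearray(good)
    bad[100] ^= 0x01
    assert check(good, csum)
    assert not check(bytes(bad), csum)

    # ...but the CRC can't reconstruct the data, so repair needs a
    # second, redundant copy whose checksum does match:
    def repair(copies, stored_csum):
        """Return the first copy whose checksum verifies, if any."""
        for c in copies:
            if check(c, stored_csum):
                return c
        return None

    assert repair([bytes(bad), good], csum) == good

Without that second copy, `repair` has nothing to return -- which is
exactly why scrub can only fix errors on files with redundant copies.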
--
Hugo Mills              | IMPROVE YOUR ORGANISMS!!
hugo@... carfax.org.uk  | http://carfax.org.uk/
PGP: E2AB1DE4           | Subject line of spam email
