Re: Deduplication and redundancy questions

On Fri, Oct 02, 2015 at 09:06:17AM +0200, - wrote:
> hey everybody.
> 
> 1. Deduplication
> "When given a list of files it will hash their contents on a block
> by block basis" - are those static blocks or is the length of a
> block defined by its content? (that would be more resilient
> regarding inserts of data and the shift of the following data caused
> by it)

   "Block" generally refers to a 4 KiB piece of data. In btrfs,
contiguous blocks form extents, and a file may be made up of zero or
more extents stored in different places on the disk.
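
   To make the concern about inserts concrete, here is a minimal
sketch, assuming fixed 4 KiB blocks hashed with SHA-256. The hash
choice and the helper names (block_hashes, shared_blocks) are
illustrative assumptions, not what duperemove or bedup actually do. It
shows that appending data leaves the existing block hashes intact,
while inserting a single byte at the front shifts every block boundary
so that no block hash matches any more.

```python
import hashlib

BLOCK_SIZE = 4096  # 4 KiB, the block size mentioned above


def block_hashes(data, block_size=BLOCK_SIZE):
    """Hash data in fixed-size blocks (hypothetical helper, illustration only)."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def shared_blocks(a, b):
    """Count the distinct block hashes that occur in both a and b."""
    return len(set(block_hashes(a)) & set(block_hashes(b)))


# Four distinct 4 KiB blocks, each filled with a different byte value.
original = b"".join(bytes([n]) * BLOCK_SIZE for n in range(4))

appended = original + b"tail"  # data added at the end
shifted = b"X" + original      # one byte inserted at the front

print(shared_blocks(original, appended))  # all original blocks still match
print(shared_blocks(original, shifted))   # every block boundary has moved
```

   With fixed blocks, only data that stays aligned to the same 4 KiB
boundaries can be matched, which is why an insert in the middle of a
file defeats matching for everything after it.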

> Do I understand correctly, that I better use bedup to deduplicate on
> a file-level before using duperemove to deduplicate on a block-level?

   If you really do block-level dedup, then it would make little
sense to do file-level dedup afterwards. As I understand it,
duperemove actually works on an extent level, rather than a block
level. (Note that block-level deduplication is likely to lead to truly
appalling performance, due to large amounts of seeking.)

> 2. Redundancy
> "Background scrub process for finding and fixing errors on files
> with redundant copies"
> So fixing is only available with full redundancy?! No parity-based
> methods or other ECC-based approaches?!

   Correct. The checksums used for this are (currently) 32 bits for
each 4 KiB block. There's simply not enough information there to do
anything other than detect (most) errors. If you want parity-based
repair of errors, use RAID-5 storage.
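
   A rough sketch of why a checksum alone can detect but not repair:
a 32-bit value summarises 4096 * 8 = 32768 bits of data, so it can
only tell you whether a block is damaged, not what the block should
have contained. The code below assumes zlib's CRC-32 as a stand-in
32-bit checksum, and scrub_block is a hypothetical helper, not the
btrfs scrub code. Repair only works when a second copy with a matching
checksum exists.

```python
import zlib

BLOCK_SIZE = 4096  # 4 KiB block, as above


def checksum(block):
    """Stand-in 32-bit checksum (zlib CRC-32); an assumption for illustration."""
    return zlib.crc32(block)


def scrub_block(copies, expected):
    """Check every copy of a block against the stored checksum.

    Returns (good_data, repaired_copies). good_data is None when no copy
    matches, in which case the error is detected but cannot be fixed.
    """
    good = next((c for c in copies if checksum(c) == expected), None)
    if good is None:
        return None, list(copies)
    # Overwrite every copy with the one known-good version.
    return good, [good for _ in copies]


block = b"A" * BLOCK_SIZE
stored = checksum(block)
damaged = b"B" + block[1:]  # a single corrupted byte

# Two copies, one damaged: the good copy is used to repair the bad one.
data, copies = scrub_block([damaged, block], stored)
print(data == block, copies == [block, block])

# Only one copy, and it is damaged: detection without repair.
data, copies = scrub_block([damaged], stored)
print(data is None)
```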

> thanks for answers

-- 
Hugo Mills             | IMPROVE YOUR ORGANISMS!!
hugo@... carfax.org.uk |
http://carfax.org.uk/  |
PGP: E2AB1DE4          |                            Subject line of spam email


