Re: [PATCH 1/2] Btrfs: online data deduplication

On Mon, 8 Apr 2013 15:47:27 +0200, David Sterba wrote:
> On Sun, Apr 07, 2013 at 09:12:48PM +0800, Liu Bo wrote:
>> (2) WHAT is deduplication?
>>     There are two key design choices in practical deduplication implementations:
>>     *  When the data is deduplicated
>>        (inband vs background)
>>     *  The granularity of the deduplication.
>>        (block level vs file level)
>>
>>     For btrfs, we choose
>>     *  inband(synchronous)
>>     *  block level
> 
> Block level may be too fine-grained, leading to excessive fragmentation
> and increased metadata usage, given that there's a much higher chance to
> find duplicate (4k) blocks here and there.
> 
> There's always a tradeoff; the practical values that are considered for
> granularity range from 8k to 64k, see e.g. this paper for graphs and analyses:
> 
> http://static.usenix.org/event/fast11/tech/full_papers/Meyer.pdf
> 
> This also depends on the file data type and access patterns; fixing the
> dedup basic chunk size to one block does not IMHO fit most use cases.

Maybe we can make btrfs (including dedup) support bigalloc-style cluster allocation, just like ext4.

Thanks
Miao

> 
>> (3) HOW does deduplication work?
> ...
>>     Here we have
>>     a)  a new dedicated tree (the DEDUP tree) and
>>     b)  a new key (BTRFS_DEDUP_ITEM_KEY), which consists of
>>         (top 64 bits of hash, type, disk offset),
>>         *  top 64 bits of hash
>>            It comes from sha256, which is very helpful in avoiding collisions.
>>            And we take the top 64 bits as the index.
> 
> Is it safe to use just 64 bits? I'd like to see better reasoning why
> this is ok. The limitation of btrfs_key to store only 1-2 64bit items is
> clear and must be handled, but it's IMO a critical design point.
> 
>>         *  disk offset
>>            It helps to find where the data is stored.
> 
> Does the disk offset also help to resolve block hash collisions?
> 
>>     So the whole deduplication process works as follows:
>>     1) write something,
>>     2) calculate the hash of this "something",
>>     3) try to find a match for the hash value by searching DEDUP keys in
>>        a dedicated tree, the DEDUP tree,
>>     4) if found, skip real IO and link to the existing copy;
>>        if not, do real IO and insert a DEDUP key into the DEDUP tree.
> 
> ... how are the hash collisions handled? Using part of a secure hash
> cannot be considered equally strong (given that there are no other
> safety checks, like comparing the whole blocks).
> 
> Last but not least, there was another dedup proposal (author CCed)
> 
> http://thread.gmane.org/gmane.comp.file-systems.btrfs/21722
> 
> 
> david
> --
> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> 
