Re: send/receive and bedup

On Mon, May 19, 2014 at 01:59:01PM -0400, Austin S Hemmelgarn wrote:
> On 2014-05-19 13:12, Konstantinos Skarlatos wrote:
> > I have been testing duperemove and it seems to work just fine, in
> > contrast with bedup, which I have been unable to install or compile
> > because of the mess with Python versions. I have 2 questions about
> > duperemove:
> > 1) can it use existing filesystem csums instead of calculating its own?
> While this might seem like a great idea at first, it really isn't.
> BTRFS uses CRC32c at the moment as its checksum algorithm, and while
> that is relatively good at detecting small differences (i.e. a single
> bit flipped out of every 64 or so bytes), it is known to have issues
> with hash collisions.  Normally, the data on disk won't change enough
> even from a media error to cause a hash collision, but when you start
> using it to compare extents that aren't known to be the same to begin
> with, and then try to merge those extents, you run the risk of serious
> file corruption.  Also, AFAIK, BTRFS doesn't expose the block checksum
> to userspace directly (although I may be wrong about this, in which case
> I retract the following statement), so this would require some
> kernelspace support.

I'm pretty sure you could get the checksums via ioctl. The thing about dedupe
though is that the kernel always does a byte-by-byte comparison of the file
data before merging it, so we should never corrupt data just because userspace
gave us a bad range to dedupe. That said, I don't necessarily disagree: it
might not be as good an idea as it sounds.
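
To illustrate the point, here is a minimal Python sketch of the "checksum as a
hint, byte comparison as the decision" pattern. It is not duperemove's or the
kernel's code. The function name find_duplicate_blocks and its checksum
parameter are made up for this example, and zlib.crc32 (plain CRC-32) is only
a stand-in for the CRC32c that btrfs uses.

```python
import zlib
from collections import defaultdict


def find_duplicate_blocks(blocks, checksum=zlib.crc32):
    """Return lists of indices whose blocks are byte-for-byte identical.

    `blocks` is a list of bytes objects. `checksum` is any function mapping
    bytes to an int; zlib.crc32 is an assumed stand-in for CRC32c.
    """
    # Step 1: bucket blocks by checksum. This is cheap, but two different
    # blocks can land in the same bucket (a checksum collision).
    candidates = defaultdict(list)
    for index, block in enumerate(blocks):
        candidates[checksum(block)].append(index)

    # Step 2: inside each bucket, group by the actual contents. Using the
    # bytes object as a dict key means equality is decided on the full data,
    # so a checksum collision can never produce a false match.
    groups = []
    for indices in candidates.values():
        verified = defaultdict(list)
        for index in indices:
            verified[blocks[index]].append(index)
        groups.extend(g for g in verified.values() if len(g) > 1)
    return groups


if __name__ == "__main__":
    # A deliberately weak checksum (byte sum mod 256) to force a collision
    # between b"ab" and b"ba".
    def weak(data):
        return sum(data) % 256

    blocks = [b"ab", b"ba", b"ab", b"cd"]
    print(find_duplicate_blocks(blocks, checksum=weak))
```

With the weak checksum, b"ab" and b"ba" fall into the same bucket, but only the
two identical b"ab" blocks survive the second step. That is the same reason a
bad range from userspace costs a wasted comparison rather than corrupted data.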
	--Mark

--
Mark Fasheh