On Mon, Sep 11, 2017 at 2:55 PM, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>
> On 2017年09月11日 17:14, Qu Wenruo wrote:
>>
>> On 2017年09月11日 16:57, shally verma wrote:
>>>
>>> On Mon, Sep 11, 2017 at 1:42 PM, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>>>>
>>>> On 2017年09月11日 15:54, shally verma wrote:
>>>>>
>>>>> On Mon, Sep 11, 2017 at 12:16 PM, Qu Wenruo <quwenruo.btrfs@xxxxxxx> wrote:
>>>>>>
>>>>>> On 2017年09月11日 14:05, shally verma wrote:
>>>>>>>
>>>>>>> I was going through BTRFS Deduplication page
>>>>>>> (https://btrfs.wiki.kernel.org/index.php/Deduplication) and I read
>>>>>>>
>>>>>>> "As such, xfs_io, is able to perform deduplication on a BTRFS file
>>>>>>> system," ..
>>>>>>>
>>>>>>> following this, I followed on to xfs_io link
>>>>>>> https://linux.die.net/man/8/xfs_io
>>>>>>>
>>>>>>> As I understand, these are set of commands allow us to do different
>>>>>>> operations on "xfs" filesystem.
>>>>>>
>>>>>> Nope, it's just a tool triggering different read/write or ioctls.
>>>>>> In fact most of its command is fs independent.
>>>>>> Only a limited number of operations are only supported by XFS.
>>>>>>
>>>>>> It's just due to historical reasons it's still named as xfs_io.
>>>>>>
>>>>>> I won't be surprised if one day it's split as an independent tool.
>>>>>>
>>>>>>> and command set mentioned here, couldn't see which is command to
>>>>>>> invoke dedupe task.
>>>>>>
>>>>>> "dedupe" and "reflink" command.
>>>>>
>>>>> Oh. That means page link referred on BTRFS Wiki page is not updated
>>>>> with this. I googled another page that has reference of these two
>>>>> command in xfs_io here
>>>>> https://www.systutorials.com/docs/linux/man/8-xfs_io/
>>>>> May be Wiki need an update here.
>>>>
>>>> If XFS has a regularly updated online man page, we can just use that.
>>>> (But unfortunately, not every fs user tools use asciidoc like btrfs,
>>>> which can generate both man page and html).
>>>>
>>>>>
>>>>>>
>>>>>>> and how this works with BTRFS.
>>>>>>
>>>>>> Fs support FIDEDUPERANGE or BTRFS_IOC_FILE_EXTENT_SAME ioctl can use
>>>>>> it to determine if two ranges are containing identical data.
>>>>>>
>>>>>> And if they are identical, we use FICLONERANGE or
>>>>>> BTRFS_IOC_CLONE_RANGE ioctl to reflink one to another, freeing one
>>>>>> of them.
>>>>>>
>>>>>> BTW nowadays, such dedupe and reflink ioctl is genericized in VFS.
>>>>>> file_operations structure now includes both clone_file_range() and
>>>>>> dedupe_file_range() callbacks now.
>>>>>
>>>>> Yea. Understand that part. So going by description of "dedupe" and
>>>>> "reflink", seems through these commands, one can do deduplication part
>>>>> and NOT duplicate find part.
>>>>
>>>> Yes, one don't need to call "dedupe" ioctl if they already knows some
>>>> data is identical and can go reflink straightforward.
>>>>
>>>>> That's still out of xfs_io command scope.
>>>>
>>>> Not sure what the scope here you mean, sorry for that.
>>>>
>>> By "scope", I meant duplicate find part but that contradicts statement
>>> you just written below:
>>>>
>>>> Since xfs_io can be used to find duplication,
>>>
>>> Since "dedupe" command input only a "source file" and src and
>>> dst_offset within that, so it can deduplicate the content within a
>>> file where actual FS dedupe IOCTL can first ensure if two extents are
>>> identical and if yes, then deduplicate them.
>>
>> By "deduplicate", if you mean "removing duplication" then xfs_io "dedupe"
>> command itself doesn't do that.
>>
>> The old btrfs ioctl describe this better, FILE_EXTENT_SAME.
>> "dedupe" command itself is only verifying if they have the same content.
>>
>> So to make it clear, "dedupe" command and ioctl only do the *verification*
>> work.
>
> Sorry, I just checked the code and tried the ioctl.
>
> If they are the same, "dedupe" will do "reflink" part also.
>
> Code also shows that:
> ---
>         /* pass original length for comparison so we stay within i_size */
>         ret = btrfs_cmp_data(olen, &cmp);
>         if (ret == 0)
>                 ret = btrfs_clone(src, dst, loff, olen, len, dst_loff, 1);
> ---
>
> So "dedupe" ioctl itself can do de-duplication.
> And my previous answer is just totally wrong.
>

Yea. That corroborate my findings too. Thanks for confirming that :).

Thanks
Shally

> Sorry for that,
> Qu
>
>>
>> "Reflink" will really remove the duplication (or even non-duplicated data
>> if you really want).
>>
>> But please be careful, "reflink" is much like copy, so it can be executed
>> on file ranges with different contents.
>> In that case, reflink can free some space, but it also modifies the
>> content.
>>
>> So for full de-duplication, one must go through the full *verify* then
>> *reflink* circle.
>> Although "dedupe"(FILE_EXTENT_SAME) ioctl provides one verification
>> method, it's not the only solution.
>>
>> But anyway, "dedupe" and "reflink" command provided by xfs_io does provide
>> every pieces to do de-duplication, so the wiki is still correct IMHO.
>>
>> Thanks,
>> Qu
>>
>>>
>>> Is that correct?
>>>
>>> Thanks
>>> Shally
>>>
>>> and can remove duplication, I
>>>>
>>>> don't find anything strange in that wiki page.
>>>> (Especially considering how popular the tool is, you can't find any more
>>>> handy tool than xfs_io)
>>>>
>>>> Thanks,
>>>> Qu
>>>>
>>>>> Is that understanding correct?
>>>>> Thanks
>>>>> Shally
>>>>>>
>>>>>> Thanks,
>>>>>> Qu
>>>>>>>
>>>>>>> So, can anyone help here and point me what am I missing here.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Shally
>>>>>>> --
>>>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
>>>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx
>>>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
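
For reference, the behaviour Qu confirms in this thread (compare the two ranges, and share the extents only if they are identical) is what the generic FIDEDUPERANGE ioctl exposes to userspace. Below is a minimal C sketch of calling it for a single destination range. It assumes Linux with kernel headers new enough to define FIDEDUPERANGE in <linux/fs.h>. The helper name dedupe_range() and its return convention are made up for illustration; they come neither from the thread nor from xfs_io.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>   /* FIDEDUPERANGE, struct file_dedupe_range */

/*
 * Hypothetical helper (illustration only): ask the kernel to dedupe
 * `len` bytes at src_off in src_fd against dst_off in dst_fd.
 *
 * Returns 0 if the ranges were identical and now share storage,
 * 1 if the contents differ (nothing is modified),
 * or a negative errno value on failure.
 */
int dedupe_range(int src_fd, uint64_t src_off,
                 int dst_fd, uint64_t dst_off,
                 uint64_t len, uint64_t *deduped)
{
    struct file_dedupe_range *arg;
    int ret;

    assert(deduped != NULL);
    *deduped = 0;

    /* One request header followed by one destination descriptor. */
    arg = calloc(1, sizeof(*arg) + sizeof(struct file_dedupe_range_info));
    if (!arg)
        return -ENOMEM;

    arg->src_offset = src_off;
    arg->src_length = len;
    arg->dest_count = 1;
    arg->info[0].dest_fd = dst_fd;
    arg->info[0].dest_offset = dst_off;

    /* The ioctl is issued on the source file descriptor. */
    if (ioctl(src_fd, FIDEDUPERANGE, arg) < 0) {
        ret = -errno;
    } else if (arg->info[0].status < 0) {
        ret = arg->info[0].status;          /* per-destination error */
    } else if (arg->info[0].status == FILE_DEDUPE_RANGE_DIFFERS) {
        ret = 1;                            /* verification failed */
    } else {
        *deduped = arg->info[0].bytes_deduped;
        ret = 0;                            /* verified and shared */
    }

    free(arg);
    return ret;
}
```

The single call covers both halves discussed above: the kernel does the comparison itself and only shares the extents when the data matches, so a caller does not need a separate reflink step after a successful dedupe. A plain reflink (FICLONERANGE), by contrast, performs no comparison and overwrites the destination range regardless of its content.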
