On May 23, 2014, at 9:48 AM, Konstantinos Skarlatos <k.skarlatos@xxxxxxxxx> wrote:

> On 21/5/2014 3:58 AM, Chris Murphy wrote:
>> On May 20, 2014, at 4:56 PM, Konstantinos Skarlatos <k.skarlatos@xxxxxxxxx> wrote:
>>
>>> On 21/5/2014 1:37 AM, Mark Fasheh wrote:
>>>> On Tue, May 20, 2014 at 01:07:50AM +0300, Konstantinos Skarlatos wrote:
>>>>>> Duperemove will be shipping as supported software in a major SUSE
>>>>>> release, so it will be bug-fixed etc. as you would expect. At the
>>>>>> moment I'm very busy trying to fix qgroup bugs, so I haven't had much
>>>>>> time to add features, handle external bug reports, etc. Also, I'm not
>>>>>> very good at advertising my software, which would be why it hasn't
>>>>>> really been mentioned on the list lately :)
>>>>>>
>>>>>> I would say that the state it's in is that I've gotten the feature set
>>>>>> to a point which feels reasonable, and I've fixed enough bugs that I'd
>>>>>> appreciate folks giving it a spin and providing reasonable feedback.
>>>>> Well, after having good results with duperemove on a few gigs of data,
>>>>> I tried it on a 500 GB subvolume. After it scanned all files, it was
>>>>> stuck at 100% of one CPU core for about 5 hours and still hadn't done
>>>>> any deduping. My CPU is an Intel(R) Xeon(R) CPU E3-1230 V2 @ 3.30GHz,
>>>>> so I guess that's not the problem. So I guess the speed of duperemove
>>>>> drops dramatically as data volume increases.
>>>> Yeah, I doubt it's your CPU. Duperemove is right now targeted at
>>>> smaller data sets (a few VMs, ISO images, etc.) than what you threw at
>>>> it, as you undoubtedly have figured out. It will need a bit of work
>>>> before it can handle entire file systems. My guess is that it was
>>>> spending an enormous amount of time finding duplicates (it has a very
>>>> thorough check that could probably be optimized).
>>> It finished after 9 or so hours, so I agree it was checking for
>>> duplicates. It does a few GB in just seconds, so time probably scales
>>> much worse than linearly with data size.
>> I'm going to guess it ran out of memory. I wonder what happens if you
>> take an SSD and specify a humongous swap partition on it. Like 4x, or
>> more, the amount of installed memory.
> Just tried it again, with 32 GiB of swap added on an SSD. My test files
> are 633 GiB.
>
> duperemove -rv /storage/test  19537.67s user 183.86s system 89% cpu 6:06:56.96 total
>
> Duperemove was using about 1 GiB of RAM and had one core at 100%, and I
> think swap was not touched at all. I guess currently it's not as memory
> intensive as it is CPU intensive, while also not threading.

Chris Murphy
--
To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
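[Editor's note: Mark's guess above is that the duplicate-finding pass dominates the runtime. As a hedged illustration of why that pass is usually the cheap part when done with hashing, here is a minimal hash-bucket scan in Python. This is NOT duperemove's actual algorithm or code; the block size and function names are hypothetical. Hashing every block once and grouping by digest finds candidate duplicates in time roughly linear in the data size, whereas comparing blocks pairwise grows quadratically, which is one way a "very thorough check" can blow up on a 633 GiB data set.]

```python
# Hypothetical sketch of hash-bucket duplicate detection, not duperemove's code.
# Hash each fixed-size block once, bucket blocks by digest; any bucket with
# more than one entry is a candidate for deduplication.
import hashlib
from collections import defaultdict

BLOCK_SIZE = 128 * 1024  # 128 KiB; an assumed, illustrative granularity

def find_duplicate_blocks(paths):
    """Map each block digest to the (path, offset) pairs sharing that content."""
    buckets = defaultdict(list)
    for path in paths:
        with open(path, "rb") as f:
            offset = 0
            while True:
                block = f.read(BLOCK_SIZE)
                if not block:
                    break
                digest = hashlib.sha256(block).hexdigest()
                buckets[digest].append((path, offset))
                offset += len(block)
    # Keep only digests seen more than once: these are dedupe candidates.
    return {d: locs for d, locs in buckets.items() if len(locs) > 1}
```

A real tool would still have to byte-compare the candidates before submitting them to the kernel (hashes can collide), and on btrfs the actual dedupe is done by the kernel's extent-same ioctl, so this sketch covers only the scanning phase being discussed in the thread.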
