On 1/4/2012 9:11 PM, Norbert Scheibner wrote:
> On Sun, 01 Apr 2012 20:19:24 +0300, Konstantinos Skarlatos wrote:
>>> I use btrfs for my backups. Once a day I rsync --delete --inplace
>>> the complete system to a subvolume, snapshot it, and delete some
>>> temp files in the snapshot.
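
For illustration, a rough sketch of that daily routine in shell; the
paths, excludes and temp-file locations below are invented, not taken
from the original mail:

  #!/bin/sh
  # Sketch: rsync the system into a working subvolume, snapshot it,
  # then prune temp files inside the (writable) snapshot.
  SRC=/
  DST=/mnt/backup/current              # an existing btrfs subvolume
  SNAP=/mnt/backup/daily-$(date +%F)

  rsync -aHAX --delete --inplace \
        --exclude=/proc --exclude=/sys --exclude=/dev --exclude=/mnt \
        "$SRC" "$DST"/

  btrfs subvolume snapshot "$DST" "$SNAP"

  # Delete some temp files in the snapshot only.
  rm -rf "$SNAP"/tmp/* "$SNAP"/var/tmp/*
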
>> In my setup I rsync --inplace many servers and workstations, 4-6
>> times a day, into a 12TB btrfs volume, each one in its own
>> subvolume. After every backup a new read-only snapshot is created.
>> I have many cross-subvolume duplicate files (OS files, programs,
>> many huge media files that are copied locally from the servers to
>> the workstations, etc.), so a good "dedupe" script could save lots
>> of space and allow me to keep snapshots for much longer.
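
Roughly, that per-host variant could look like the following; the host
names and paths are made up for illustration, and the read-only
snapshot comes from the -r flag of btrfs subvolume snapshot:

  #!/bin/sh
  # Sketch only: rsync each machine into its own subvolume, then take
  # a read-only snapshot of it after the run.
  POOL=/mnt/backup                     # the 12TB btrfs filesystem
  STAMP=$(date +%F-%H%M)

  for HOST in server1 server2 workstation1; do
      rsync -aHAX --inplace "$HOST":/ "$POOL/$HOST"/
      btrfs subvolume snapshot -r "$POOL/$HOST" "$POOL/$HOST@$STAMP"
  done
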
> So the script should be optimized not to try to deduplicate the
> whole fs every time, but only the newly written files. You could
> take such a file list from the rsync output or from the btrfs
> subvolume find-new command.
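
A minimal sketch of the find-new approach, assuming one state file per
subvolume to remember the last seen generation (all paths here are
placeholders):

  #!/bin/sh
  # List files written to a subvolume since the last run, so a dedupe
  # pass only has to look at new data. Sketch only.
  SUBVOL=/mnt/backup/server1
  STATE=/var/lib/dedupe/server1.lastgen

  LASTGEN=$(cat "$STATE" 2>/dev/null || echo 0)

  # "btrfs subvolume find-new <subvol> <gen>" prints one "inode ..."
  # record per new extent, with the file name as the last field
  # (this simple extraction assumes no spaces in file names).
  btrfs subvolume find-new "$SUBVOL" "$LASTGEN" \
      | awk '$1 == "inode" {print $NF}' \
      | sort -u > /tmp/new-files.txt

  # Asking for an impossibly high generation makes find-new print only
  # its "transid marker" line; remember that value for the next run.
  btrfs subvolume find-new "$SUBVOL" 9999999 \
      | awk '/transid marker/ {print $4}' > "$STATE"
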
A cron task with btrfs subvolume find-new would be ideal, I think.
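
For example, an /etc/cron.d entry (script name and schedule invented
here) could run such an incremental pass after each backup window:

  # /etc/cron.d/dedupe-new -- hypothetical schedule: one pass an hour
  # after each of four daily backup runs.
  0 1,7,13,19 * * *   root   /usr/local/sbin/dedupe-new-files.sh /mnt/backup
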
> Reflink patch aside, you could use such a bash script inside one
> subvolume, after the rsync and before the snapshot. I don't know how
> much space it saves for you in this situation, but it's worth a try
> and a good way to develop such a script, because before you write
> anything to disk you can see how many duplicates are there and how
> much space could be freed.
> Regards, Norbert
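
As a sketch of that dry-run idea: the following only counts duplicates
and the space a real pass (for example one replacing extra copies with
reflinked copies via cp --reflink=always) could free; nothing is
written, and the subvolume path is a placeholder:

  #!/bin/sh
  # Hash every file in a subvolume and report how many duplicates
  # exist and roughly how many bytes deduplication could reclaim.
  # Read-only; SUBVOL is hypothetical.
  SUBVOL=/mnt/backup/server1

  find "$SUBVOL" -xdev -type f -size +0c -printf '%s %p\n' |
  while read -r size path; do
      printf '%s %s\n' "$(sha256sum < "$path" | cut -d' ' -f1)" "$size"
  done |
  sort |
  awk '{ if ($1 == prev) { dups++; saved += $2 }; prev = $1 }
       END { printf "%d duplicate files, ~%d bytes reclaimable\n", dups, saved }'
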