Btrfs send bloat

I have 3-4 years' worth of snapshots that I use for backup purposes. I keep read-only (R-O) snapshots of the live subvolume, two local backups, and copies in AWS Glacier Deep Archive, using both send | receive and send > file. This works well, except that I get massive deltas when files are moved around in a GUI via Samba: reorganize a bunch of files and the next snapshot delta is 50 or 100 GB. Using mv, or cp with --reflink=always, on the server side would probably avoid this, but that workflow just isn't usable enough for my family.
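(For what it's worth, a reflink copy is just the FICLONE ioctl underneath; here is a minimal Python sketch of what cp --reflink=always does, with placeholder paths. The helper name is mine, not anything standard.)

```python
import fcntl
import os

# FICLONE = _IOW(0x94, 9, int): share all extents of src with dst.
FICLONE = 0x40049409

def reflink_copy(src_path, dst_path):
    """CoW-clone src into dst, as `cp --reflink=always` does.

    No data blocks are duplicated, so a later `btrfs send -p` delta
    stays small. Fails with EOPNOTSUPP on filesystems without
    reflink support (i.e. anything other than Btrfs/XFS/bcachefs).
    """
    src = os.open(src_path, os.O_RDONLY)
    try:
        dst = os.open(dst_path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
        try:
            fcntl.ioctl(dst, FICLONE, src)
        finally:
            os.close(dst)
    finally:
        os.close(src)
```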

I'd like a solution to this massive-delta problem. If someone already has one, that would be great; if not, I'd appreciate advice on a few ideas.

A realistic solution seems to be deduplicating the subvolume before each snapshot is taken, and in theory I could write a small program to do that. However, I don't know whether it would work. Will Btrfs let me deduplicate between a file on the live subvolume and a file on the R-O snapshot (really the same file under a different path)? If so, will btrfs send -p then produce a small delta?
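The small program I have in mind would call the kernel's FIDEDUPERANGE ioctl (the same interface duperemove uses). Here is a sketch of the raw call, with the struct layout taken from linux/fs.h; whether the kernel accepts a file in an R-O snapshot as the destination is exactly the part I'm unsure about:

```python
import fcntl
import os
import struct

# FIDEDUPERANGE = _IOWR(0x94, 54, struct file_dedupe_range);
# the 24-byte request header fixes the encoded ioctl number.
FIDEDUPERANGE = 0xC0189436

def pack_dedupe_request(src_offset, length, dest_fd, dest_offset):
    """struct file_dedupe_range followed by one file_dedupe_range_info."""
    head = struct.pack("=QQHHI", src_offset, length, 1, 0, 0)  # dest_count=1
    info = struct.pack("=qQQiI", dest_fd, dest_offset, 0, 0, 0)
    return head + info

def dedupe_range(src_fd, src_offset, length, dest_fd, dest_offset):
    """Ask the kernel to share extents between src and dest.

    Returns the number of bytes actually deduped, or 0 if the kernel
    compared the ranges and found the contents differ.
    """
    req = fcntl.ioctl(
        src_fd,
        FIDEDUPERANGE,
        pack_dedupe_request(src_offset, length, dest_fd, dest_offset),
    )
    # info.bytes_deduped sits at offset 40, info.status at 48.
    bytes_deduped, status = struct.unpack_from("=Qi", req, 40)
    if status < 0:          # negative errno from the kernel
        raise OSError(-status, os.strerror(-status))
    if status == 1:         # FILE_DEDUPE_RANGE_DIFFERS
        return 0
    return bytes_deduped
```

Because the kernel verifies the two ranges byte-for-byte before sharing extents, the call is safe to run against live data.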

Failing that, I could probably rewrite the send data stream, but that's suboptimal for the live volume and for any backup volumes where the data has already been received.

Also, is it possible to read the checksums Btrfs already keeps for file data, so I don't have to recalculate hashes for the whole volume myself?
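If not, the fallback I have in mind looks roughly like this (a sketch; hashing whole files rather than extents, and the BLAKE2 choice, are just my assumptions):

```python
import hashlib
import os
from collections import defaultdict

def duplicate_groups(root):
    """Group regular files under root by (size, content hash).

    Each group with more than one path is a dedupe candidate; files
    with a unique size are skipped without ever being read.
    """
    by_size = defaultdict(list)
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            by_size[os.path.getsize(path)].append(path)

    groups = defaultdict(list)
    for size, paths in by_size.items():
        if len(paths) < 2:
            continue  # a unique size can't have a duplicate
        for path in paths:
            h = hashlib.blake2b()
            with open(path, "rb") as f:
                for chunk in iter(lambda: f.read(1 << 20), b""):
                    h.update(chunk)
            groups[(size, h.hexdigest())].append(path)
    return [g for g in groups.values() if len(g) > 1]
```

The size prefilter keeps the I/O down, but it still rereads every non-unique-sized file on each run, which is why I'd much rather reuse checksums the filesystem already has.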

Thanks in advance for any advice.



