Re: Data deduplication in LVM?
Les Mikesell wrote:
> Roy Sigurd Karlsbakk wrote:
>> On 11 June 2009, at 00.30, Stuart D. Gathman wrote:
>>> One OSS backup product that does deduplication is BackupPC (written in Perl). In the backup server, every file gets hard linked to a name in a special directory that is its md5 checksum (plus some fiddly logic to handle metadata).
>> This sounds like file-level deduplication. Most storage systems using dedup use block-level dedup. NetApp is one example; they dedup everything with 4k blocks, doing the actual deduplication at night.
> Yes, it is a different concept. However it does work very well when you are storing your backups on a filesystem without block-level dedup. And that is probably the place where you have the most redundancy - or if you don't already, you'll be able to store a much longer history.
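The pooling scheme described in the quoted text (each file hard linked to a pool name derived from its md5 checksum) can be sketched roughly as follows. This is only an illustration of the idea, not BackupPC's actual code: the names `pool_file` and `demo_pool` are hypothetical, the metadata handling mentioned above is omitted, and checksum collisions are ignored. It also assumes the pool and the files live on the same filesystem, since hard links cannot cross filesystems.

```python
import hashlib
import os
import tempfile


def pool_file(path, pool_dir):
    """Hard link `path` into `pool_dir` under its md5 digest (hypothetical helper).

    If the pool already holds a file with the same digest, `path` is replaced
    by a hard link to that pooled copy, so the data is stored only once.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    pooled = os.path.join(pool_dir, digest.hexdigest())

    if not os.path.exists(pooled):
        # First file with this content: the pool entry shares its inode.
        os.link(path, pooled)
    elif not os.path.samefile(path, pooled):
        # Duplicate content: swap the file for a link to the pooled copy.
        tmp = path + ".tmp-link"
        os.link(pooled, tmp)
        os.replace(tmp, path)
    return pooled


def demo_pool():
    """Pool two identical files and report (same inode?, link count)."""
    with tempfile.TemporaryDirectory() as tmp:
        pool = os.path.join(tmp, "pool")
        os.mkdir(pool)
        a = os.path.join(tmp, "a.txt")
        b = os.path.join(tmp, "b.txt")
        for p in (a, b):
            with open(p, "wb") as f:
                f.write(b"same contents\n")
        pool_file(a, pool)
        pool_file(b, pool)
        same_inode = os.stat(a).st_ino == os.stat(b).st_ino
        return same_inode, os.stat(a).st_nlink


if __name__ == "__main__":
    print(demo_pool())
```

Every backed-up copy of a file adds one more directory entry pointing at the same inode, which is why an archive built this way accumulates a very large number of hard links.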
Apologies for following up my own post, but this does remind me of a slightly related problem that someone here might have solved. The BackupPC archive ends up containing such a large number of directory entries and hard links that it is typically impractical to copy by any file-oriented means, even rsync. A recurring topic on the BackupPC mailing list is how to make a copy for offsite storage.
Personally I use a RAID1 created with 3 mirror members and periodically swap one out and resync, but that's not very elegant. Is there a better way, or one that could be incrementally updated across a WAN? Does LVM have a mechanism like ZFS's incremental snapshot send/receive? (Not sure if that would work either, but it sounds promising.) Is there any other way to do a block-oriented remote copy? Would LVM mirroring work as well as or better than md-device RAID? The partition can stay mounted while the RAID rebuilds, but realistically not much else can be happening because of the performance impact, and I unmount momentarily while removing the member to get a clean filesystem.
Are there tricks with DRBD or perhaps RAID over iSCSI that would let a periodic sync work incrementally - well enough to use over a WAN?
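To make concrete what an incremental, block-oriented copy would need to do, here is a minimal sketch that compares two images block by block and rewrites only the blocks that differ. It is not a feature of LVM, md, or DRBD, just an illustration of the general idea. The names `block_digests`, `sync_changed_blocks`, and `demo_sync` are hypothetical, the 4096-byte block size is an assumption, and both images are assumed to be regular files of equal size on the local machine. Over a WAN, the digest list would be exchanged instead of the data.

```python
import hashlib
import os
import tempfile

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def block_digests(path, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests


def sync_changed_blocks(src, dst, block_size=BLOCK_SIZE):
    """Copy only the blocks of `src` that differ from `dst`.

    Returns the number of blocks written. In a remote setup, the digest
    list for `dst` would be the only thing sent before the changed blocks.
    """
    dst_digests = block_digests(dst, block_size)
    copied = 0
    with open(src, "rb") as s, open(dst, "r+b") as d:
        index = 0
        while True:
            block = s.read(block_size)
            if not block:
                break
            changed = (index >= len(dst_digests)
                       or hashlib.sha256(block).digest() != dst_digests[index])
            if changed:
                d.seek(index * block_size)
                d.write(block)
                copied += 1
            index += 1
    return copied


def demo_sync():
    """Change one byte in a 4-block image and sync it to a stale copy."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src.img")
        dst = os.path.join(tmp, "dst.img")
        original = bytes(4 * BLOCK_SIZE)
        modified = bytearray(original)
        modified[BLOCK_SIZE + 10] = 0xFF  # falls inside the second block
        with open(dst, "wb") as f:
            f.write(original)
        with open(src, "wb") as f:
            f.write(modified)
        copied = sync_changed_blocks(src, dst)
        with open(src, "rb") as a, open(dst, "rb") as b:
            identical = a.read() == b.read()
        return copied, identical


if __name__ == "__main__":
    print(demo_sync())
```

Note that this still reads every block on both sides to find the changes. Avoiding that full scan requires something that tracks writes as they happen, which is the appeal of snapshot-based send/receive.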
--
Les Mikesell
lesmikesell@gmail.com
_______________________________________________
linux-lvm mailing list
linux-lvm@redhat.com
https://www.redhat.com/mailman/listinfo/linux-lvm
read the LVM HOW-TO at http://tldp.org/HOWTO/LVM-HOWTO/