Re: 5 _thousand_ snapshots? even 160? (was: device balance times)

On Wed, Oct 22, 2014 at 07:41:32AM +0000, Duncan wrote:
> Tomasz Chmielewski posted on Wed, 22 Oct 2014 09:14:14 +0200 as excerpted:
> >> Tho that is of course per subvolume.  If you have multiple subvolumes
> >> on the same filesystem, that can still end up being a thousand or two
> >> snapshots per filesystem.  But those are all groups of something under
> >> 300 (under 100 with hourly) highly connected to each other, with the
> >> interweaving inside each of those groups being the real complexity in
> >> terms of btrfs management.
> 
> IOW, if you thin down the snapshots per subvolume to something reasonable 
> (under 300 for sure, preferably under 100), then depending on the number 
> of subvolumes you're snapshotting, you might have a thousand or two.  
> However, of those couple thousand, btrfs will only have to deal with the 
> under 300 and preferably well under a hundred in the same group, that are 
> snapshots of the same thing and thus related to each other, at any given 
> time.  The other snapshots will be there but won't be adding to the 
> complexity near as much since they're of different subvolumes and aren't 
> logically interwoven together with the ones being considered at that 
> moment.
> 
> But even then, at say 250 snapshots per subvolume, 2000 snapshots is 8 
> independent subvolumes.  That could happen.  But 5000 snapshots?  That'd 
> be 20 independent subvolumes, which is heading toward the extreme again.  
> Yes it could happen, but better if it does to cut down on the per-
> subvolume snapshots further, to say the 25 per subvolume I mentioned, or 
> perhaps even further.  25 snapshots per subvolume with those same 20 
> subvolumes... 500 snapshots total instead of 5000. =:^)

If you have one subvolume per user and 1000 user directories on a server,
that is 5000 snapshots at only 5 snapshots per user (last hour, last day,
last week, last month, and last year).  I hear this is a normal use case
in the ZFS world.  It would certainly be attractive if there were working
quota support.
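
For what it's worth, the rotation itself is easy to script.  Something
along these lines would do it (run as root; the /home layout, the
.snapshots directory and the period names are made-up examples, not
anything btrfs dictates):

#!/usr/bin/env python3
# Hypothetical per-user snapshot rotation: keep one read-only snapshot per
# user per period, i.e. 5 snapshots per user subvolume.  The /home layout,
# snapshot directory and period names are assumptions for illustration.
import subprocess
import sys
from pathlib import Path

HOME = Path("/home")                # one btrfs subvolume per user (assumed)
SNAPDIR = Path("/home/.snapshots")  # where the read-only snapshots go (assumed)
PERIODS = ("hour", "day", "week", "month", "year")

def rotate(user, period):
    src = HOME / user
    dst = SNAPDIR / user / period
    dst.parent.mkdir(parents=True, exist_ok=True)
    if dst.exists():
        # drop last period's snapshot before taking the replacement
        subprocess.run(["btrfs", "subvolume", "delete", str(dst)], check=True)
    subprocess.run(["btrfs", "subvolume", "snapshot", "-r", str(src), str(dst)],
                   check=True)

if __name__ == "__main__":
    period = sys.argv[1]            # e.g. "hour", run from an hourly cron job
    if period not in PERIODS:
        sys.exit("unknown period %r" % period)
    for entry in sorted(HOME.iterdir()):
        # skip the snapshot directory itself and anything hidden
        if entry.is_dir() and not entry.name.startswith("."):
            rotate(entry.name, period)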

I have datasets where I record 14000+ snapshots of filesystem directory
trees scraped from test machines and aggregated onto a single server
for deduplication...but I store each snapshot as a git commit, not as
a btrfs snapshot or even a subvolume.
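
The ingest side of such a scheme needs nothing fancy.  A minimal sketch,
with the repository layout and host name invented for the example:

#!/usr/bin/env python3
# Minimal sketch of the "one git commit per scraped tree" idea: mirror each
# incoming tree into a repository's work tree and commit it, letting git's
# object store do the deduplication.  Paths and host name are invented.
import subprocess
from datetime import datetime, timezone

REPO_BASE = "/srv/snapshots"   # one pre-existing git repo per test machine (assumed)

def ingest(host, incoming_tree):
    repo = "%s/%s" % (REPO_BASE, host)
    # mirror the scraped tree, removing files that disappeared since last
    # time, but leave the repository metadata alone
    subprocess.run(["rsync", "-a", "--delete", "--exclude=/.git",
                    incoming_tree.rstrip("/") + "/", repo + "/"], check=True)
    subprocess.run(["git", "-C", repo, "add", "-A"], check=True)
    stamp = datetime.now(timezone.utc).isoformat()
    subprocess.run(["git", "-C", repo, "commit", "--allow-empty",
                    "-m", "snapshot of %s at %s" % (host, stamp)], check=True)

if __name__ == "__main__":
    ingest("testbox1", "/incoming/testbox1")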

We do sometimes run queries like "in the last two years, how many times
did $CONDITION occur?" which will scan a handful of files in all of the
snapshots.  The use case itself isn't unreasonable, although using the
filesystem instead of a more domain-specific tool to achieve it may be.
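
With git underneath, a query like that is just a walk over commits.  A
rough sketch, where the repository path, file list and pattern stand in
for whatever $CONDITION really involves:

#!/usr/bin/env python3
# Rough sketch of that kind of query against git-stored snapshots: list the
# commits from the last two years, grep a couple of files in each commit and
# add up the hits.  Repository path, file list and pattern are placeholders.
import subprocess

REPO = "/srv/snapshots/testbox1"           # assumed repo, one commit per snapshot
PATHS = ["var/log/kern.log", "etc/fstab"]  # the "handful of files" (examples)
PATTERN = "BTRFS error"                    # stand-in for $CONDITION

def commits_since(since):
    out = subprocess.run(["git", "-C", REPO, "rev-list", "--since=" + since, "HEAD"],
                         check=True, capture_output=True, text=True)
    return out.stdout.split()

def hits_in(commit):
    # "git grep -c" prints <commit>:<path>:<count> per matching file; it
    # exits non-zero when nothing matches, so don't treat that as an error
    out = subprocess.run(["git", "-C", REPO, "grep", "-c", PATTERN, commit, "--"] + PATHS,
                         capture_output=True, text=True)
    return sum(int(line.rsplit(":", 1)[1]) for line in out.stdout.splitlines())

if __name__ == "__main__":
    total = sum(hits_in(c) for c in commits_since("2 years ago"))
    print("%r occurred %d times across the snapshots" % (PATTERN, total))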
