Re: Considerations in snapshotting and send/receive of nocow files?

Shriramana Sharma posted on Sun, 30 Nov 2014 19:17:42 +0530 as excerpted:

> Given that snapshotting effectively reduces the usefulness of nocow, I
> suppose the preferable model to snapshotting and send/receiving such
> files would be different than other files.
> 
> Should nocow files (for me only VBox images) preferably be:
> 
> 1) under a separate subvolume?
> 
> 2) said subvol snapshotted less often?
> 
> 3) sent/received any differently?

If you look back in the list history at the nocow threads, you'll see a 
lot of my answers to exactly this sort of question.

In general I'd say "yes" to 1 and 2: a separate subvolume, in part to 
allow snapshotting it less often.  For 3, I don't use send/receive in my 
own setup, and it's complex enough that I've not become as familiar with 
it as with the general fragmentation issue.  But because send does 
require creating a read-only snapshot, #3 effectively depends on #2, so 
I'd suggest treating it differently only to the extent that you keep 
sends, and therefore snapshots, toward the low end of your reasonable 
range.

Here's the reasoning in a more detailed step-by-step fashion.  (I'll use 
lettered points here to avoid confusing them with your numbered points 
above, which I may wish to reference below as well.)

A) The basic issue in principle: As you've apparently found from your 
research, snapshotting and nocow can be used together, but snapshots 
disrupt absolute nocow, because a snapshot locks the existing version of 
the file in place, forcing a COW of the first change written to each 
(4 KiB) file block after a snapshot covering that file.  The file does 
remain nocow, however, and further changes written to the same file 
block will be nocow -- until the next snapshot forces another 
lock-in-place, of course.

B) The biggest immediate practical problem following from A is high-
frequency automated snapshotting -- some people go wild and snapshot as 
often as once a minute... at least until they see some of the issues it 
can cause (like snapshots happening nearly instantly while snapshot 
deletion often takes longer than a minute, plus the current scaling 
issues once there are several hundred or several thousand snapshots to 
deal with).  On a busy VM writing changes at a similar once-a-minute or 
faster rate, such snapshotting very quickly eliminates much of the anti-
fragmentation benefit of nocow in the first place.

C) On a more general level once again, it should be easily apparent that 
the more change-writes you can squeeze between snapshots, the more 
effective the nocow is going to be, because a higher percentage of them 
will still be nocow.

D) That leads pretty directly to your points 1 and 2, put the nocow files 
on their own subvolume so snapshotting the parent doesn't affect them, 
and then snapshot the nocow subvolume at a lower frequency, as low a 
frequency as can reasonably fit within your use-case target range.  For 
example, for a normally daily snapshot scenario you might snapshot the 
parent daily and the nocow subvolume every other day or twice a week.  
For a normal 4X-daily snapshot scenario (every six hours on a 24-hour 
schedule or every two hours on an 8-hour-shift schedule), you might 
snapshot the nocow subvolume only once or twice a day.  Tho of course if 
the primary goal is the snapshotting of the nocow files (the VMs in your 
case), then you may still be snapshotting it at a higher frequency than 
the parent, which you may not in fact be snapshotting at all.  The point 
remains, snapshot the nocow subvolume at as low a frequency as can 
reasonably fit your use-case/goals.
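
To make that concrete, here's a minimal sketch of the two-frequency 
idea.  The paths, names and the twice-a-week choice are all hypothetical 
examples; in practice you'd drive something like this from cron or a 
systemd timer.

#!/usr/bin/env python3
# Sketch: snapshot the parent subvolume daily but the nocow subvolume
# only twice a week.  All paths and frequencies are hypothetical.
import datetime
import subprocess

PARENT  = "/mnt/btrfs/home"       # regular data, snapshotted daily
NOCOW   = "/mnt/btrfs/vms"        # separate nocow subvolume
SNAPDIR = "/mnt/btrfs/snapshots"

def snapshot(subvol, tag):
    stamp = datetime.datetime.now().strftime("%Y%m%d-%H%M")
    dest = f"{SNAPDIR}/{tag}-{stamp}"
    # -r makes the snapshot read-only, which send requires anyway.
    subprocess.run(["btrfs", "subvolume", "snapshot", "-r", subvol, dest],
                   check=True)
    return dest

if __name__ == "__main__":
    snapshot(PARENT, "home")                        # runs every day
    if datetime.date.today().weekday() in (0, 3):   # e.g. Monday/Thursday
        snapshot(NOCOW, "vms")                      # nocow subvol: twice a week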

E) Regarding your point #3, since send must be done from a read-only 
snapshot, obviously you'll need to snapshot at a frequency that at 
minimum equals that of your sends.  However, if your VMs are low activity 
enough that there's a reasonable chance they won't have written any 
changes during the send, and the send is the primary reason for the 
snapshot in the first place, you may avoid /some/ of the issue by 
deleting most snapshots as soon after the send as possible.

It would work like this.  You'd do your initial full send, creating an 
initial reference on both sides, with that snapshot retained on both 
sides /as/ the initial reference.  At your primary sending frequency, 
say once a day, you'd send against that original parent and delete the 
sending snapshot as soon as the send completed, making each daily send 
incremental against the original reference.  At a lower frequency, 
perhaps once a week or once a month, you'd retain the sending snapshot 
(applying the mitigation measures discussed in F below), and could then 
delete older initially-retained weeklies and the original full 
reference, perhaps keeping, say, two quarterly snapshots on the send 
side.

Then if you needed to reverse the send/receive, you'd still have the 
last weekly as a reference on both sides, and could replay the last 
daily parented against it -- the one deleted on the send side but still 
present on the backup -- to get back to the current day.  If you'd lost 
the last weekly as well, you could similarly replay the last weekly from 
the last quarterly.  Of course if you'd lost everything and were doing a 
full restore, you'd simply do a full send from the backup, without a 
parent reference on the (now) receive side.
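
Sketched in code, the daily leg might look something like this.  Paths 
are hypothetical, the receive is local for simplicity (a real setup 
would likely pipe the send through ssh), and the weekly/quarterly 
reference rotation is left out:

#!/usr/bin/env python3
# Sketch: daily incremental send against a retained reference snapshot,
# deleting the sending snapshot as soon as the send completes.
# Paths are hypothetical; the weekly/quarterly retention is not shown.
import datetime
import subprocess

NOCOW     = "/mnt/btrfs/vms"
SNAPDIR   = "/mnt/btrfs/snapshots"
REFERENCE = f"{SNAPDIR}/vms-reference"   # retained on both sides
RECV_DIR  = "/mnt/backup/snapshots"      # receiving filesystem

def daily_send():
    stamp = datetime.date.today().isoformat()
    snap = f"{SNAPDIR}/vms-daily-{stamp}"
    subprocess.run(["btrfs", "subvolume", "snapshot", "-r", NOCOW, snap],
                   check=True)
    # Incremental send, parented against the retained reference.
    send = subprocess.Popen(["btrfs", "send", "-p", REFERENCE, snap],
                            stdout=subprocess.PIPE)
    subprocess.run(["btrfs", "receive", RECV_DIR],
                   stdin=send.stdout, check=True)
    send.stdout.close()
    if send.wait() != 0:
        raise RuntimeError("btrfs send failed")
    # Delete the daily right away so it doesn't pin old extents.
    subprocess.run(["btrfs", "subvolume", "delete", snap], check=True)

if __name__ == "__main__":
    daily_send()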

F) Snapshotting-effect mitigation:  A number of people using a 
snapshotting scheme such as the above have reported that, while their 
nocow files were beyond the size btrfs' autodefrag mount option could 
reasonably handle, choosing a snapshotting frequency at the low end of 
their target range reduced but did not eliminate fragmentation, and a 
periodic defrag was effective mitigation for what remained.

Using the same daily-snapshot/send/delete, weekly-snapshot/send/retain 
example scenario as in #E, a weekly or biweekly defrag of the files in 
the nocow subvolume should help keep operational snapshot-effect 
fragmentation from getting /too/ extreme, and as I said, a similar defrag 
schedule has been reported by a number of people to work reasonably well, 
keeping fragmentation-related performance loss within reasonable bounds.
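
The defrag itself is essentially a one-liner pointed at the nocow 
subvolume.  A sketch, with a hypothetical path and -t target extent 
size, run just before the weekly retained snapshot/send:

#!/usr/bin/env python3
# Sketch: weekly defrag of the nocow subvolume, scheduled just before
# the weekly retained snapshot/send.  The path and the -t target extent
# size are hypothetical; tune to taste.
import subprocess

NOCOW = "/mnt/btrfs/vms"

subprocess.run(["btrfs", "filesystem", "defragment", "-r", "-t", "32M", NOCOW],
               check=True)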

G) Defrag caveat:  Note that due to scaling issues, btrfs' defrag is not 
snapshot-aware.  Thus a scheduled defrag such as that suggested in #F 
would defrag only the files in the mounted subvolume it was pointed at, 
leaving the same files in retained snapshots fragmented.  To the extent 
that defrag actually does anything, moving blocks around to defrag files 
and breaking the reflinks to the snapshotted version, it will thus 
duplicate the actual space required by those file blocks.  If a file 
isn't fragmented, of course, it won't need to be moved, and thus won't 
break the snapshot reflinks or require additional data space.
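
If you want to see whether the files are fragmented enough to be worth 
defragging (and worth the reflink-breaking that goes with it), something 
like this sketch works; the path and extension are hypothetical, and 
filefrag comes from e2fsprogs:

#!/usr/bin/env python3
# Sketch: report the extent count of each VM image so you can judge
# whether the weekly defrag is actually needed.  Path/extension are
# hypothetical; filefrag is part of e2fsprogs.
import glob
import subprocess

for image in sorted(glob.glob("/mnt/btrfs/vms/*.vdi")):
    out = subprocess.run(["filefrag", image], capture_output=True,
                         text=True, check=True).stdout
    print(out.strip())   # e.g. "/mnt/btrfs/vms/foo.vdi: 1234 extents found"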

Assuming your filesystem data usage is comfortably under 50% and that 
you're deleting snapshots in a timely manner, however, the multiplying-
data effect should remain reasonable and manageable.  It won't multiply 
without bound, because deleting old snapshots also frees the separate 
(defrag-broken-reflink) copies of the snapshotted data they held.

In the given scenario, you'd be defragging say once a week, with a 
daily snapshot/send/delete parented against a weekly snapshot/send/
retain, which in turn would be parented against a quarterly snapshot/
send/retain, with the intervening weeklies deleted.  There would thus be 
a maximum of two quarterly retained snapshots plus a weekly and a daily.  
But the defrag would only be weekly and could be scheduled just before 
the weekly retained snapshot, so data usage would be capped at 4X:

- 2X for the two retained quarterly snapshots, since they're beyond the 
defrag frequency;

- 1X for the live/defragged copy, shared by the weekly and daily 
snapshot/sends because the defrag is done /before/ the weekly-retained-
snapshot/send;

- 1X for a transient copy from during and immediately after the weekly 
defrag, which becomes the weekly copy as soon as the new weekly 
snapshot/send/retain is done and the previous one deleted.

Assuming the data in your nocow subvolume is only a small fraction of the 
operating (non-snapshot) data in the entire filesystem and that the 
filesystem is at least twice the size of the operating data, space 
shouldn't be an issue.

Worst-case, the nocow subvolume is near 100% of the filesystem's 
operating data, in which case you'd need a filesystem about five times 
that size in order to allow for the 4X-capped copies of the snapshotted 
and defragged data as described above, plus the metadata overhead, etc.
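
To put hypothetical numbers on that worst case: with, say, 100 GiB of 
VM images making up essentially all of the operating data, the 4X cap 
means up to roughly 400 GiB of data at the worst point in the rotation, 
so something in the neighborhood of a 500 GiB filesystem would leave 
room for that plus metadata and a bit of headroom.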

Clear as mud? =:^)

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
