First of all, thanks for all the replies; they've been quite insightful.

On Nov 8, 2010, at 10:45 PM, Sean Bartell wrote:

> From your questions, you don't seem to understand CoW. CoW is basically
> an alternative to the logging/journalling used by most filesystems.
>

Actually, I do understand how CoW works. Although, maybe due to naiveté,
I do lack some understanding of how it is applied in a full-fledged
filesystem.

>> 2) Each time a COW happens, is there any kind implicit 'snapshotting'
>> that may keep track of changes around the filesystem for each COW?
>> By Rodeh's paper and some info on the wiki, I gather that a new root
>> is created for each COW, due to shadowing, but will the previous tree
>> be kept? The wiki, at "BTRFS Design", states that "after the commit
>> finishes, the older subvolume root items may be removed". This would
>> make impossible to track changes to files, but 'btrfs subvolume
>> find-new' still manages to output file generations, so there must be
>> some info left behind.
>
> The old tree is discarded unless the user requested a snapshot of it.
>
> Every time btrfs updates the roots is a new generation. Some data
> structures have "generation" fields, indicating the generation in which
> they were most recently changed. This is mostly used to verify the
> filesystem is correct, but it's also possible to scan the generation
> fields and find out which files have changed.

As Goffredo Baroncelli explained in a previous reply to my questions,
the "find-new" command searches through keys of type
BTRFS_EXTENT_DATA_KEY. The command does print several changes to the
same files throughout history, since a given generation.

My new question on this is rather simple: does BTRFS actually keep the
data from the generations that "find-new" reports, or can it only access
the information that records these changes?

>
>> 3) Following (2), is there any way to access this informations and,
>> let's say, recover an older version of a given file?
>> Or an entire previous tree?
>
> No, unless the user request a snapshot. I'm assuming you're not talking
> about tools like PhotoRec, that try to reassemble files from whatever
> disk data looks valid.

You're right. What I mean is being able to actually recover an old,
recently modified version of a file, somewhat like a versioning system.
I believe I have read that both WAFL and ZFS have similar support.

I do understand now that this is only possible if one explicitly creates
a snapshot. However, I thought that, by shadowing the changed blocks, it
could be quite inexpensive (storage aside) to implicitly keep multiple
"versions" of the tree: unchanged blocks would stay shared among "tree
versions" until CoWed, if they were ever changed. With these versions
one would be able to recover, or restore, a file or the entire
filesystem, regardless of having created an explicit checkpoint.

Then again, I understand this is not how it works in BTRFS, nor do I
have a clue whether such support is feasible.

>
>> 4) From Rodeh's paper I got the idea that BTRFS uses periodic
>> checkpointing, in order to assign generations to operations. Using
>> 'btrfs subvolume find-new' I confirmed my suspicions. After copying
>> two different directories into the same subvolume at the same time,
>> all files got assigned the same generation and it took a while until
>> they all showed up. This raises the question: what triggers a new
>> checkpoint? Is it based on elapsed time since last checkpoint? Is it
>> triggered by a COW and then, all COWs happening at the same time will
>> be put together and create a big new generation?
>
> Again, periodic checkpointing is probably the wrong way to think about
> it. It would be wasteful to overwrite the superblocks every time a
> change is made; instead, btrfs may combine multiple changes into one
> generation and only update the superblocks once. I'm not sure exactly
> how btrfs decides when to write a new generation.

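For what it's worth, here is the toy model I have in mind when asking
these questions. It is only a minimal Python sketch of shadowing with
generation numbers, written to follow Sean's description above. It is
not BTRFS code, and every name in it (Node, shadow, ToyFS, find_new) is
made up for illustration. In particular, the explicit commit() call is
an assumption standing in for whatever actually triggers a new
generation.

```python
from typing import Dict, List, NamedTuple, Optional, Tuple


class Node(NamedTuple):
    """Immutable tree node (hypothetical, not a btrfs structure)."""
    generation: int                                # generation of last write
    children: Tuple[Tuple[str, "Node"], ...] = ()  # directory entries
    data: Optional[str] = None                     # file contents (leaves)


def shadow(node: Optional[Node], path: List[str], data: str, gen: int) -> Node:
    """Return a copy of `node` with `path` set to `data`.

    Only nodes on the path are copied; untouched subtrees are shared.
    """
    if not path:
        return Node(gen, data=data)
    kids: Dict[str, Node] = dict(node.children) if node is not None else {}
    kids[path[0]] = shadow(kids.get(path[0]), path[1:], data, gen)
    return Node(gen, children=tuple(sorted(kids.items())))


class ToyFS:
    def __init__(self) -> None:
        self.generation = 0
        self.committed = Node(0)        # root the "superblock" points at
        self.working = self.committed   # root of the pending changes
        self.snapshots: Dict[str, Node] = {}

    def write(self, path: str, data: str) -> None:
        # Every write before the next commit gets the same generation.
        self.working = shadow(self.working, path.split("/"), data,
                              self.generation + 1)

    def commit(self) -> None:
        # One root switch for the whole batch. The previous root is simply
        # dropped unless a snapshot still references it.
        self.generation += 1
        self.committed = self.working

    def snapshot(self, name: str) -> None:
        self.snapshots[name] = self.committed


def find_new(node: Node, since: int, prefix: str = "") -> List[Tuple[str, int]]:
    """List (path, generation) for files written after generation `since`."""
    if node.generation <= since:
        return []                       # nothing below here changed
    if node.data is not None:
        return [(prefix, node.generation)]
    found: List[Tuple[str, int]] = []
    for name, child in node.children:
        found += find_new(child, since, f"{prefix}/{name}" if prefix else name)
    return found


fs = ToyFS()
fs.write("a/x", "1")
fs.write("b/y", "2")
fs.commit()                             # both files land in generation 1
fs.snapshot("s1")
fs.write("a/x", "3")
fs.commit()                             # only a/x is rewritten in generation 2
print(find_new(fs.committed, 1))
```

In this model, find_new only ever sees the newest version of each file
in the current tree. The old contents of a/x remain reachable only
because snapshot "s1" still holds the old root, while the untouched
subtree b is shared between both roots.
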
As Chris Samuel stated in another reply, at some point I did make the
link between BTRFS' checkpointing and NILFS'. I had assumed, however,
that BTRFS' checkpointing was hardcoded somewhere in the code. If that
is not the case, I still wonder how the decision is made, since I have
not yet found where the checkpointing happens in the code.

Regards.

---
João Eduardo Luís
