Hi Chris
On Monday 15 March 2010, Chris Mason wrote:
> On Fri, Mar 12, 2010 at 07:30:01PM +0100, Goffredo Baroncelli wrote:
> > On Friday 12 March 2010, Pat Patterson wrote:
> > > Are there any plans to implement something akin to ZFS send/recv, to
> > > be able to create a stream representation of a snapshot and restore it
> > > later/somewhere else? I've spent some time trawling the mailing list
> > > and wiki, but I don't see anything there.
> >
> > I spent a bit of time on this topic, trying to work out how to implement
> > an efficient method to back up data incrementally.
> >
> > AFAICT "zfs send" and "zfs recv" do the same thing that tar does. They
> > transform a tree (or the difference between a tree and its snapshot) to a
> > stream, and vice-versa.
> >
> > Transforming a tree into a stream is not very interesting.
> > The interesting part is how to compare a tree and its snapshot. In fact a
> > snapshot of a tree should be a pointer to the original tree, and when a
> > file is modified, the modified part (the extents of the file, the
> > directories of the path) is branched off (yes, I know that this is a big
> > simplification of the process).
> > The key is that the file system knows which parts of a snapshot are still
> > equal to the source and which are not.
> >
> > If this kind of data were available to user space, comparing a tree and
> > its snapshot should be very fast.
> >
> > Reading the btrfs documentation, it seems that a "version number" is
> > associated with each transaction. With this "version number" of a
> > directory, we would be able to verify the equality of two trees by
> > comparing only the roots of the trees. This would increase the speed of
> > comparing two trees.
>
> Every btree block and file extent include the transaction id of when
> they were created. When COW is on, this means they include the
> transaction id of when they were last modified.
>
> Finding updated file extents means searching through the tree based on
> transaction id (ignoring any branch in the tree older than transid X),
> which is exactly what the treelog code does to efficiently log fsyncs.
> This is especially easy because the tree node pointers include the
> expected transaction id of what they are pointing to, so you can skip
> reading any tree block with an old pointer.
If I understand correctly, you are saying that it is possible to find the
files updated between two transaction ids. That would be wonderful. One
question comes to mind, though: what if the transaction doesn't contain the
snapshot alone? Could the "delta" contain writes that happened after the
second transaction or before the first transaction?
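
A toy model of the pruning described above may help frame the question. It
is only a sketch under stated assumptions: the Node and Extent classes and
the changed_since() helper are hypothetical names, not btrfs code, and the
tree is a plain in-memory structure. Each pointer carries the generation of
the block it points to, so a branch older than the requested transid is
skipped without being read.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Extent:
    name: str        # hypothetical label for a file extent
    generation: int  # transaction id in which the extent was written


@dataclass
class Node:
    # Each pointer is (generation of the child block, child block).
    pointers: List[Tuple[int, "Node"]] = field(default_factory=list)
    extents: List[Extent] = field(default_factory=list)


def changed_since(node: Node, min_transid: int,
                  visited: Optional[list] = None) -> List[Extent]:
    """Collect extents with generation >= min_transid.

    `visited` (optional) records every block that had to be read, to show
    how much of the tree the pruning avoids.
    """
    if visited is not None:
        visited.append(node)
    found = [e for e in node.extents if e.generation >= min_transid]
    for pointer_generation, child in node.pointers:
        # An old pointer means nothing below it changed: skip the read.
        if pointer_generation >= min_transid:
            found.extend(changed_since(child, min_transid, visited))
    return found


# Two leaves: one untouched since transid 5, one rewritten in transid 9.
old_leaf = Node(extents=[Extent("a", 5)])
new_leaf = Node(extents=[Extent("b", 5), Extent("c", 9)])
root = Node(pointers=[(5, old_leaf), (9, new_leaf)])

reads: list = []
changed = changed_since(root, 8, reads)
print([e.name for e in changed], "blocks read:", len(reads))
```

In this model a lower bound alone returns everything written at or after
that transid, which is why the question about writes landing outside the
two transactions matters.
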
> In the subvol branch, we have a new ioctl to do tree searches from
> userland based on these ranges. It can very easily be used to make a
> list of files (and extents in those files) that have been updated since
> a given transid.
>
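
For anyone who wants to experiment from user space, here is a rough sketch
of how the search request could be packed and its results unpacked. It
assumes the btrfs_ioctl_search_key / btrfs_ioctl_search_header layout and
the BTRFS_IOC_TREE_SEARCH number as they appear in later mainline headers;
the subvol branch mentioned above may differ. The helper names
(build_search_args, parse_results, tree_search) are made up for this
example, and the ioctl itself needs a btrfs mount and sufficient privileges.

```python
import fcntl
import struct

# Assumed layout: 7 x u64 (tree_id, min/max objectid, min/max offset,
# min/max transid), 4 x u32 (min/max type, nr_items, unused), 4 x u64 unused.
SEARCH_KEY = struct.Struct("=7Q4I4Q")
# Assumed per-item header: transid, objectid, offset (u64), type, len (u32).
SEARCH_HEADER = struct.Struct("=3Q2I")
SEARCH_ARGS_SIZE = 4096
U64_MAX = 2**64 - 1
U32_MAX = 2**32 - 1
# _IOWR(0x94, 17, 4096-byte struct), using the x86/ARM ioctl encoding.
BTRFS_IOC_TREE_SEARCH = (3 << 30) | (SEARCH_ARGS_SIZE << 16) | (0x94 << 8) | 17


def build_search_args(tree_id: int, min_transid: int,
                      nr_items: int = 4096) -> bytearray:
    """Pack a search key that matches every item newer than min_transid."""
    key = SEARCH_KEY.pack(
        tree_id,
        0, U64_MAX,            # objectid range
        0, U64_MAX,            # offset range
        min_transid, U64_MAX,  # transid range
        0, U32_MAX,            # item type range
        nr_items, 0,
        0, 0, 0, 0,
    )
    return bytearray(key.ljust(SEARCH_ARGS_SIZE, b"\0"))


def parse_results(buf: bytes, nr_items: int) -> list:
    """Unpack nr_items (header, payload) records that follow the key."""
    offset = SEARCH_KEY.size
    items = []
    for _ in range(nr_items):
        transid, objectid, key_offset, item_type, length = \
            SEARCH_HEADER.unpack_from(buf, offset)
        offset += SEARCH_HEADER.size
        payload = bytes(buf[offset:offset + length])
        offset += length
        items.append((transid, objectid, key_offset, item_type, payload))
    return items


def tree_search(fd: int, tree_id: int, min_transid: int) -> list:
    """Run one search; the kernel writes the item count back into the key."""
    buf = build_search_args(tree_id, min_transid)
    fcntl.ioctl(fd, BTRFS_IOC_TREE_SEARCH, buf)
    nr_items = SEARCH_KEY.unpack_from(buf)[9]
    return parse_results(buf, nr_items)
```

A single call returns at most one buffer of items, so a complete listing
would have to repeat the search, restarting from the last key returned.
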
> >
> > But I was never able to get this "version number". There is the ioctl
> > command FS_IOC_GETVERSION, which seems to return this number. But when a
> > directory or one of its children is updated, this number doesn't change.
> >
> > I tried to hack the kernel code in order to test different "version"
> > numbers: I tried inode->i_generation, or btrfs_inode->generation, or
> > btrfs_inode->sequence, or btrfs_inode->{last|last_sub|logged}_trans...
> > But none of the above was useful for my purpose.
>
> Right, I decided instead to store the generation in the file extent
> pointer. We needed it for other things as well, and it makes it
> possible to find individual extents that have changed in a file instead
> of just flagging the file as modified.
>
> This would be a good project if anyone is interested, I'm happy to send
> along full details.
If you are able to provide further details, I am interested in them.
I would appreciate any suggestions on how to extract the transaction ID for
a given file (or directory).
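
For reference, the FS_IOC_GETVERSION experiment mentioned earlier in the
thread can be reproduced from user space with something like the sketch
below. The get_version() helper is a name made up for this example. The
request number is computed with the x86/ARM ioctl encoding, and the code
assumes the kernel stores a C int in the buffer; both are assumptions that
should be checked against the headers of the target system.

```python
import fcntl
import os
import struct

_IOC_READ = 2
# _IOR('v', 1, long) using the x86/ARM ioctl bit layout (assumption).
FS_IOC_GETVERSION = (
    (_IOC_READ << 30) | (struct.calcsize("l") << 16) | (ord("v") << 8) | 1
)


def get_version(path: str):
    """Return the inode version/generation, or None if unsupported."""
    fd = os.open(path, os.O_RDONLY)
    try:
        buf = bytearray(struct.calcsize("l"))
        try:
            fcntl.ioctl(fd, FS_IOC_GETVERSION, buf)
        except OSError:
            # Filesystems without this ioctl typically fail with ENOTTY.
            return None
        # Assumed: the value is written as an int at the start of the buffer.
        return struct.unpack_from("i", buf)[0]
    finally:
        os.close(fd)


if __name__ == "__main__":
    import sys
    for name in sys.argv[1:]:
        print(name, get_version(name))
```

Running it on a directory before and after touching a file inside it is a
quick way to confirm the behaviour described above, namely that the value
does not change when a child is updated.
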
>
> -chris
Goffredo
--
gpg key@ keyserver.linux.it: Goffredo Baroncelli (ghigo) <kreijackATinwind.it>
Key fingerprint = 4769 7E51 5293 D36C 814E C054 BF04 F161 3DC5 0512