Re: "Appending" data to the middle of a file using btrfs-specific features

On Tue, Dec 7, 2010 at 2:12 AM, Freddie Cash <fjwcash@xxxxxxxxx> wrote:
> On Mon, Dec 6, 2010 at 12:30 PM, Nirbheek Chauhan
> <nirbheek.chauhan@xxxxxxxxx> wrote:
>> But the behaviour of --inplace is not entirely to write out *only* the
>> blocks that have changed. From what I could make out, it does the
>> following:
>>
>> (1) Calculate a delta b/w the src and trg files
>> (2) Seek to the first difference in the target file
>> (3) Start writing data
>
> That may be true, I've never looked into the actual algorithm(s) that
> rsync uses. Just played around with CLI options until we found the
> set that works best in our situation (--inplace --delete-during
> --no-whole-file --numeric-ids --hard-links --archive, over SSH with
> HPN patches).
>
>> I'm glossing over the final step because I didn't look deeper, but I
>> think you can safely assume that after the first difference, all data
>> is rewritten. So this is halfway between "rewrite the whole file" and
>> "write only the changed bits into the file". It doesn't actually use
>> any CoW features from what I can see. There is lots of room for btrfs
>> reflinking magic. :)
>>
>> Note that I tested this behaviour on a btrfs partition with a vanilla
>> rsync-3.0.7 tarball; the copy you use with ZFS might be doing some CoW
>> magic.
>
> All the CoW "magic" is handled by the filesystem, and not the tools on
> top. If the tool only updates X bytes, which fit into 1 block on the
> fs, then only that 1 block gets updated via CoW.
>

I'm quite sure that's what happens in btrfs too, but the thing about
updating in-place is that if you have

ABCDXXXEFGH

which needs to change to

ABCDZZZEFGH

you're all good: only the blocks corresponding to XXX will be updated.
But if the change is

ABCDZZZZEFGH

you'll need to rewrite EFGH as well, since (afaik) there's no way to
insert data in the middle of a file with standard syscalls. Everything
after the insertion point shifts by one byte, so it no longer lines up
with what's already on disk. A later change might bring the offsets
back in sync with the file's old contents, but the chances of that
happening in a large file are quite remote. That's why I said it can
be safely assumed that after the first difference, all data is
rewritten.
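
To make the difference concrete, here is a rough Python sketch using
only plain seek/read/write. It is my own illustration, not what rsync
does internally; the helper names (overwrite_in_place,
insert_in_middle, demo) are made up for this example, and each helper
returns how many bytes it had to write.

```python
import os
import tempfile


def overwrite_in_place(path, offset, data):
    """Same-length replacement: only len(data) bytes get written."""
    with open(path, "r+b") as f:
        f.seek(offset)
        f.write(data)
    return len(data)


def insert_in_middle(path, offset, data):
    """Insertion: the tail after `offset` has to be read and rewritten."""
    with open(path, "r+b") as f:
        f.seek(offset)
        tail = f.read()          # everything after the insertion point
        f.seek(offset)
        f.write(data + tail)     # new data plus the shifted tail
    return len(data) + len(tail)


def demo():
    """Run both cases on the ABCDXXXEFGH example from this mail."""
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        with open(path, "wb") as f:
            f.write(b"ABCDXXXEFGH")

        # XXX -> ZZZ: same length, nothing after it moves.
        n_overwrite = overwrite_in_place(path, 4, b"ZZZ")
        with open(path, "rb") as f:
            after_overwrite = f.read()

        # ZZZ -> ZZZZ: one extra byte, so EFGH must be rewritten too.
        n_insert = insert_in_middle(path, 7, b"Z")
        with open(path, "rb") as f:
            after_insert = f.read()
    finally:
        os.remove(path)
    return after_overwrite, n_overwrite, after_insert, n_insert
```

In this toy case the same-length change writes 3 bytes, while the
insertion writes 5 (the new Z plus EFGH). With a large file the
rewritten tail is what dominates.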

The only way to get around this on the filesystem level that I can
think of is data de-duplication; the filesystem doesn't let go of the
blocks for a while, and does reflinking if the same data is written
again. Perhaps that's what ZFS is doing, I have no idea :)
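
Roughly what I mean by de-duplication, as a toy Python model. This is
an assumption-laden sketch and not how btrfs or ZFS actually work: the
DedupStore class, the 4-byte BLOCK size and the SHA-256 keying are all
invented for illustration. Blocks are kept in a table keyed by their
hash, and a write whose content is already in the table just
references the existing block.

```python
import hashlib

BLOCK = 4  # toy block size, chosen only to keep the example readable


class DedupStore:
    """Toy content-addressed block store (illustrative only)."""

    def __init__(self):
        self.blocks = {}             # digest -> block contents
        self.new_blocks_written = 0  # blocks that actually hit "disk"

    def write_file(self, data):
        """Store data block by block; return the list of block digests."""
        refs = []
        for i in range(0, len(data), BLOCK):
            chunk = data[i:i + BLOCK]
            digest = hashlib.sha256(chunk).hexdigest()
            if digest not in self.blocks:
                # Unseen content: this block really has to be written.
                self.blocks[digest] = chunk
                self.new_blocks_written += 1
            refs.append(digest)      # otherwise just reference the old one
        return refs

    def read_file(self, refs):
        """Reassemble a file from its block digests."""
        return b"".join(self.blocks[d] for d in refs)
```

Writing AAAABBBBCCCC and then AAAAZZZZBBBBCCCC through this store only
adds one new block (ZZZZ) for the second file. Note that this only
works out because the inserted data is exactly one block long; a
one-byte insertion would shift every following block boundary and
nothing after it would match.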

-- 
~Nirbheek Chauhan

Gentoo GNOME+Mozilla Team