Incremental send / receive of snapshot fails

Hi all


I have a problem with incremental snapshot send / receive in btrfs. Maybe my google-fu is weak, but I couldn't find any pointers, so here goes.


A few words about my setup first:

I have multiple clients that back up to a central server. All clients (and the server) run (K)Ubuntu 16.10 64-bit on btrfs. Backing up is done with btrfs send / receive, either full or incremental, depending on what's available on the server side. All clients have the usual (Ubuntu) btrfs layout: two subvolumes, one for / and one for /home, mounted via explicit entries in fstab; the btrfs root volume itself is not mounted anywhere. For further details see the P.s. at the end.
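In case it matters, the fstab entries on the clients look roughly like this (a sketch; <fs-uuid> stands in for the real filesystem UUID, and the option list is from memory rather than copied verbatim):

UUID=<fs-uuid>  /      btrfs  defaults,noatime,ssd,subvol=@      0  0
UUID=<fs-uuid>  /home  btrfs  defaults,noatime,ssd,subvol=@home  0  0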


Here's what happens:

In general I stick to the example from https://btrfs.wiki.kernel.org/index.php/Incremental_Backup . Backing up is done daily by a script, and it works successfully on all of my clients except one (called "lab").

I start with the first snapshot on "lab" and do a full send to the server. This works as expected (sending takes some hours, as it is done over wifi+ssh). After that is done, I send an incremental snapshot based on the previous parent. This also works as expected (no errors etc.). Sending deltas then happens once a day, with the script always keeping the last two snapshots on the client and many more on the server. After each run of the script I also do a bit of housekeeping to prevent "disk full" etc. (see the P.s. below for the commands).
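Stripped of logging and name handling, each daily run of the script essentially does this (a sketch: "backupserver" and the receive path are placeholders, and the real script uses timestamped snapshot names instead of new_snap / last_snap_by_script / old_snap):

# btrfs subvol snap -r / /.back/new_snap
# btrfs send -p /.back/last_snap_by_script /.back/new_snap | ssh backupserver "btrfs receive /backup/lab/"
# btrfs subvol del /.back/old_snap

The last command removes the oldest local snapshot, so only the two most recent ones remain on the client.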

I can't say exactly when, but after some time (possibly the next day) sending a snapshot fails with an error on the receiving end:
ERROR: unlink some/file failed. No such file or directory

Some searching around led me to https://bugzilla.kernel.org/show_bug.cgi?id=60673 . So I checked to make sure my script doesn't use the wrong parent, and it does not. But to be really sure I tried a send / receive directly on "lab" without the server:

# btrfs subvol snap -r / /.back/new_snap
Create a readonly snapshot of '/' in '/.back/new_snap'

# btrfs subv show /.back/last_snap_by_script
/.back/last_snap_by_script
        Name:                   last_snap_by_script
        UUID:                   b4634a8b-b74b-154a-9f17-1115f6d07524
        Parent UUID:            b5f9a301-69f7-0646-8cf1-ba29e0c24fac
        Received UUID:          196a0866-cd05-d24e-bac6-84e8e5eb037a
        Creation time:          2016-12-27 17:55:10 +0100
        Subvolume ID:           486
        Generation:             52036
        Gen at creation:        51524
        Parent ID:              257
        Top level ID:           257
        Flags:                  readonly
        Snapshot(s):

# btrfs subv show /.back/new_snap
/.back/new_snap
        Name:                   new_snap
        UUID:                   fca51929-8101-db45-8df6-f25935c04f98
        Parent UUID:            b5f9a301-69f7-0646-8cf1-ba29e0c24fac
        Received UUID:          196a0866-cd05-d24e-bac6-84e8e5eb037a
        Creation time:          2016-12-28 11:51:43 +0100
        Subvolume ID:           506
        Generation:             52271
        Gen at creation:        52271
        Parent ID:              257
        Top level ID:           257
        Flags:                  readonly
        Snapshot(s):

# btrfs send -p /.back/last_snap_by_script /.back/new_snap > delta
At subvol /.back/new_snap

# btrfs subvol del /.back/new_snap
Delete subvolume (no-commit): '/.back/new_snap'

# cat delta | btrfs receive /.back/
At snapshot new_snap
ERROR: unlink some/file failed. No such file or directory

And the receive always fails with some ERROR similar to the above! What I find a bit odd is that both snapshots show the identical "Received UUID", even before new_snap was ever sent / received ... but maybe that's normal?

If, instead of using "last_snap_by_script", I create a second new read-only snapshot and send the delta between these two "new" ones, everything works as expected. But then there are hardly any differences between the two new snaps ...
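For reference, the test that does work looks roughly like this (same placeholder-style names as before):

# btrfs subvol snap -r / /.back/new_snap_a
# btrfs subvol snap -r / /.back/new_snap_b
# btrfs send -p /.back/new_snap_a /.back/new_snap_b > delta2
# btrfs subvol del /.back/new_snap_b
# cat delta2 | btrfs receive /.back/

This receive completes without the unlink error.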

I tried to look for differences between the "lab" client and another one ("navi") where backing up works, but so far I couldn't really find anything. I did create both file systems at different points in time (possibly with different kernels). Both file systems were created as btrfs and not converted from ext. "lab" has an SSD, "navi" a spinning disk. Both systems run on 64-bit Intel CPUs ...


So now I have a snapshot on "lab" which I cannot use as a parent, but why? What did I do wrong? The whole procedure works on my other clients (with the exact same script), so why not on the "lab" client? And this is a recurring problem: I tried deleting all of the snapshots (on both ends) and starting all over again ... eventually it ends up with a "broken" snapshot once more.


Up until now using btrfs has been a great experience and I have always been able to resolve my troubles quite quickly, but this time I don't know what to do. Thanks in advance for any suggestions, and feel free to ask for other / missing details :-)


Regards
Rene


P.s.: here's my system info from the failing client "lab"

$ uname -a
Linux lab 4.8.0-32-generic #34-Ubuntu SMP Tue Dec 13 14:30:43 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux

$ btrfs --version
btrfs-progs v4.7.3

# btrfs fi show
Label: 'SSD'  uuid: 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
        Total devices 1 FS bytes used 37.62GiB
        devid    1 size 55.90GiB used 41.03GiB path /dev/sdb1

# btrfs fi df /
Data, single: total=40.00GiB, used=37.09GiB
System, single: total=32.00MiB, used=16.00KiB
Metadata, single: total=1.00GiB, used=543.08MiB
GlobalReserve, single: total=112.38MiB, used=0.00B

$ mount | grep btrfs
/dev/sdb1 on / type btrfs (rw,noatime,ssd,space_cache,subvolid=257,subvol=/@)
/dev/sdb1 on /home type btrfs (rw,noatime,ssd,space_cache,subvolid=286,subvol=/@home)

# btrfs scrub start -B /
scrub done for 122ecca7-9804-4c8a-b4ed-42fd6c6bbe7a
scrub started at Wed Dec 28 12:05:53 2016 and finished after 00:02:24
        total bytes scrubbed: 37.76GiB with 0 errors

"house keeping" mostly based on suggestions from Marc's Blog (http://marc.merlins.org/perso/btrfs/)
# /bin/btrfs balance start -v -dusage=0 /
# /bin/btrfs balance start -v -dusage=60 -musage=60 -v /

I can add dmesg output on request, but so far I haven't observed anything relevant there ...