Re: backref for an extent not found in send_root (!backref_ctx->found_itself)

Hi Jan,
attached are bash scripts to repro the issue.

Some instructions on how to run them:
- create 2 btrfs filesystems with "mkfs.btrfs /dev/sdXXX". I don't
think that size matters.
- mount them in /mnt/src and /mnt/dst
- mount options: noatime,nodatasum,nodatacow,nospace_cache
- put the 3 scripts into one directory and cd to it
- run btrfs_init_tests.sh (it sets up a small file tree for tests)
- run btrfs_test_first_ref_jan.sh

After about 20-30 seconds, it hits the error I mentioned and the script
stops. It happens on the "for-linus" branch, top commit
1eafa6c73791e4f312324ddad9cbcaf6a1b6052b.
I suspect the issue might be that the test schedules a lot of
subvolumes for deletion, and once the cleaner thread kicks in and also
starts doing backref stuff, the problem happens.

Another small note: there is an issue in btrfs-progs subvolume listing
code (also used by send). When it finds a ROOT_ITEM in the root tree
that is not linked with ROOT_REF/ROOT_BACKREF (i.e., one scheduled for
deletion), it gets confused and exits. Miao sent a patch to fix it
here:
http://www.spinics.net/lists/linux-btrfs/msg19767.html
I don't think it got merged into progs yet (progs are really behind:()

If you want a quick fix, add code like this to the beginning of
__list_subvol_fill_paths (but Miao sent a better patch):
	/*
	 * due to change in __list_subvol_search(), root_lookup
	 * might contain subvolumes with ref_tree==0 (in deletion).
	 */
again:	
	n = rb_first(&root_lookup->root);
	while (n) {
		struct root_info *entry = rb_entry(n, struct root_info, rb_node);
		if (entry->ref_tree == 0) {
			fprintf(stderr, "__list_subvol_fill_paths: drop root_id=%llu, "
				"because it has no ref_tree\n", entry->root_id);
			rb_erase(n, &root_lookup->root);
			free(entry);
			goto again;
		}
		n = rb_next(n);
	}

Otherwise, "btrfs send" might fail, but this is not the failure we are
looking for:)
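
As a side note, the restart via "goto again" in the quick fix could probably be avoided by fetching the successor before the entry is erased (with the rbtree, that would mean calling rb_next(n) before rb_erase()). Here is a minimal standalone sketch of that pattern. It uses a plain singly linked list as a stand-in for the rbtree, and struct demo_root and the demo_* helpers are made-up names for illustration only, not btrfs-progs code:

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct root_info; only the fields used here. */
struct demo_root {
	unsigned long long root_id;
	unsigned long long ref_tree;	/* 0 means no ROOT_REF (in deletion) */
	struct demo_root *next;
};

/* Prepend a new entry and return the new head; aborts if malloc fails. */
static struct demo_root *demo_add(struct demo_root *head,
				  unsigned long long root_id,
				  unsigned long long ref_tree)
{
	struct demo_root *e = malloc(sizeof(*e));

	if (!e)
		abort();
	e->root_id = root_id;
	e->ref_tree = ref_tree;
	e->next = head;
	return e;
}

/*
 * Drop every entry with ref_tree == 0 in a single pass.
 * The successor is linked in before the entry is freed, so the scan
 * never has to restart from the beginning.
 * Returns the number of entries dropped.
 */
static int demo_drop_unreferenced(struct demo_root **head)
{
	struct demo_root **link = head;
	int dropped = 0;

	while (*link) {
		struct demo_root *entry = *link;

		if (entry->ref_tree == 0) {
			*link = entry->next;	/* unlink; successor is now *link */
			free(entry);
			dropped++;
			continue;
		}
		link = &entry->next;
	}
	return dropped;
}

/* Count the entries that are left in the list. */
static int demo_count(const struct demo_root *head)
{
	int n = 0;

	for (; head; head = head->next)
		n++;
	return n;
}

/* Free the whole list. */
static void demo_free(struct demo_root *head)
{
	while (head) {
		struct demo_root *next = head->next;

		free(head);
		head = next;
	}
}
```

With entries 256 and 258 referenced and 257 and 259 unreferenced, demo_drop_unreferenced() should drop two entries and leave the other two in place. This only shows the iteration pattern; Miao's patch remains the proper fix.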

Thanks,
Alex.





On Tue, Jan 29, 2013 at 11:07 AM, Jan Schmidt <list.btrfs@xxxxxxxxxxxxx> wrote:
> Hi Alex,
>
> On Mon, January 28, 2013 at 17:11 (+0100), Alex Lyakas wrote:
>> Hi Jan,
>> I have a set of unit tests (part of the larger system) for the
>> send-receive functionality, with which I am able to hit this error:
>>
>> Jan 28 18:01:00 687-dev kernel: [16968.451358] btrfs: ERROR did not
>> find backref in send_root. inode=259, offset=139264, disk_byte=4263936
>> found extent=4263936
>>
>> As the code states, this could indicate a bug in backref walking. This
>> reproduces with "for-linus" branch.
>>
>> Typically this happens when a snapshot is deleted, immediately a new
>> snap with the same name is created, and then "btrfs send" is issued
>> without parent (i.e., full-send) on this snap.
>>
>> To debug this further, we can do one of two things:
>> # I can apply patches/debug prints & reproduce
>> # I can work to isolate the unit test into a bash script and send you
>> a script that reproduces
>
> I'd prefer #2 of the above. You can also send me the unit tests you've got if I
> can get them running without multiple days of setup.
>
> I'm guessing that this is more likely going to end up in send.c than in
> backref.c, perhaps Alexander would like to trace this one down. But anyway, send
> me a reproducer (in private, if you don't want to publish it) and we'll see
> what's going on.
>
> Thanks,
> -Jan

Attachment: btrfs_functions.sh
Description: Bourne shell script

Attachment: btrfs_init_tests.sh
Description: Bourne shell script

Attachment: btrfs_test_first_ref_jan.sh
Description: Bourne shell script

