Hi Filipi, 2015-06-09 18:36 GMT+08:00 Filipe David Manana <fdmanana@xxxxxxxxx>: > On Tue, Jun 9, 2015 at 11:04 AM, Robbie Ko <robbieko@xxxxxxxxxxxx> wrote: >> Hi Filipe, >> >> 2015-06-08 22:00 GMT+08:00 Filipe David Manana <fdmanana@xxxxxxxxx>: >>> On Mon, Jun 8, 2015 at 4:44 AM, Robbie Ko <robbieko@xxxxxxxxxxxx> wrote: >>>> Hi Filipe, >>> >>> Hi Robbie, >>> >>>> >>>> I've fixed "don't send utimes for non-existing directory" with another solution. >>>> >>>> In apply_dir_move(), the old parent dir. and new parent dir. will be >>>> updated after the current dir. has moved. >>>> >>>> And there's only one entry in old parent dir. (e.g. entry with >>>> smallest ino) will be tagged with rmdir_ino to prevent its parent dir. >>>> is deleted but updated. >>> >>> Can't parse this phrase. What do you mean by tagging an entry with rmdir_ino? >>> rmdir_ino corresponds to the number of a inode that wasn't deleted >>> when it was processed because there was some inode with a lower number >>> that is a child of the directory in the parent snapshot and had its >>> rename/move operation delayed (it happens after the directory we want >>> to delete is processed). >>> >> >> Right , my "tagged with rmdir_ino" is same meaning as you explained here. >> >>>> >>>> However, if we process rename for another entry not tagged with >>>> rmdir_ino first, its old parent dir. which is deleted will be updated >>>> according to apply_dir_move(). >>>> >>>> Therefore, I think we should check the existence of the dir. before >>>> we're going to update it's utime. >>>> >>>> The patch is pasted in the following link, could you give me some comment? >>>> >>>> https://friendpaste.com/h8tZqOS9iAUpp2DvgGI2k >>> >>> Looks better. >>> However I still don't understand your explanation, and just tried the >>> example in your commit message: >>> >>> "Parent snapshot: >>> >>> |---- a/ (ino 259) >>> |---- c (ino 264) >>> |---- b/ (ino 260) >>> |---- d (ino 265) >>> |---- del/ (ino 263) >>> |---- item1/ (ino 261) >>> |---- item2/ (ino 262) >>> >>> Send snapshot: >>> |---- a/ (ino 259) >>> |---- b/ (ino 260) >>> |---- c/ (ino 2) >>> |---- item2 (ino 259) >>> |---- d/ (ino 257) >>> |---- item1/ (ino 258)" >>> >>> So it's confusing after looking at it. >>> First the send snapshot mentions inode number 2, which doesn't exist >>> in the parent snapshot - I assume you meant inode number 264. >>> Then, the send snapshot has two inodes with number 259. Is "item2" in >>> the send snapshot supposed to be inode 262? >>> >> >> Your guess is right. And I correct it as follow. >> >> # Parent snapshot: >> # >> # |---- a/ (ino 259) >> # | |---- c (ino 264) >> # | >> # |---- b/ (ino 260) >> # | |---- d (ino 265) >> # | >> # |---- del/ (ino 263) >> # |---- item1/ (ino 261) >> # |---- item2/ (ino 262) >> >> # Send snapshot: >> # >> # |---- a/ (ino 259) >> # |---- b/ (ino 260) >> # |---- c/ (ino 264) >> # | |---- item2/ (ino 262) >> # | >> # |---- d/ (ino 265) >> # |---- item1/ (ino 261) >> >>> Anyway, assuming those 2 fixes to the example are correct guesses, I >>> tried the following and it didn't fail without your patches (i.e. no >>> attempts to send utimes to a non-existing directory): >>> >> >> Here my mean is : btrfs tries to get utime from non-existing directory and >> apply it on the existing directory. And my patch is attempted to avoid >> this case. >> However, this case is not guaranteed to cause error anytime but it may >> fails somehow >> which is depending on the data on the disk. >> The following are the incremental procedures to send the snapshot. >> >> utimes >> utimes a >> utimes b >> rename del -> o263-259-0 >> utimes >> rename a/c -> c >> utimes >> utimes a >> rename o263-259-0/item2 -> c/item2 >> utimes c/item2 >> utimes o263-259-0 <<---------------------- this step may cause error > > Why may it cause an error? > At that moment the name/path o263-259-0 exists at the destination > (i.e. the receiver, as it applies commands from the send stream > serially). Following is the error message when do receive. ERROR: utimes o263-259-0 failed. Invalid argument The argument of utimes for o263-259-0 is got from inode 263 in send root, But inode 263 is not exist in send root In send_utimes(), we didn't check if btrfs_search_slot returns 1, therefore may encounters this problem. I will try to make a stable reproducer. Thanks. Robbie Ko > >> utimes c >> utimes c >> rename b/d -> d >> utimes >> utimes b >> rename o263-259-0/item1 -> d/item1 >> rmdir o263-259-0 >> utimes d/item1 >> utimes d >> utimes d >> >> As the above pointed procedure, o263-259-0 is not appeared in the send root. > > Well yes, but that doesn't matter. > The oXXX-YYY-ZZZ names are never in any of the roots (send or parent), > they're just temporary names to allow for rename/move operations when > collisions are detected, and exist only in the receiver's filesystem. > >> When utime got from o263-259-0 is invalid (i.e. out of range of time >> format), it will fail. >> I saw the error occurs at utimensat() and got EINVAL. Here's >> explanation from linux man page. >> >> EINVAL Invalid value in one of the tv_nsec fields (value outside >> range 0 to 999,999,999, and not UTIME_NOW or UTIME_OMIT); or >> an invalid value in one of the tv_sec fields. >> >> So if o263-259-0 is not the send root, invalid format of utime may be got. > > Doubly confused now. So the whole change log (and the code changes) > mentions only attempts to send utimes for a directory/path that > doesn't exist in the receiver's fs - inode 263 exists in both the send > and parent roots, so we can always get its utimes values. Now you're > saying that somewhere in the send code we're getting incorrect values > for a utimes operation and then sending such operation to the > receiver? > > Getting the values for utimes is done looking up the inode item, by > its number, in the send root - for this the current name of the inode > in the receiver fs doesn't matter - it matters only for building the > path for the utimes operation. But what I'm understanding from your > reply is that we're getting a wrong utimes value form the inode item > for inode number 263, which would be a totally different issue from > generating incorrect paths. > > > So this still doesn't explain me why, without any of your patches > applied, the reproducer doesn't fail - which is really an important > aspect. > This is the xfstest I made with your reproducer: > https://friendpaste.com/2WyDxPe2FtVhOfECBk1VKF (and > tests/btrfs/999.out is just contains the single line "QA output > created by 999"). > > If you're able to give me a correct reproducer, I can make sense of it > and help getting a better change log. > Or this issue happens only after applying some of the other patches in > the series? > > thanks > > > > >> >> Thanks. >> Robbie Ko >> >>> # Parent snapshot: >>> # >>> # |---- a/ (ino 259) >>> # | |---- c (ino 264) >>> # | >>> # |---- b/ (ino 260) >>> # | |---- d (ino 265) >>> # | >>> # |---- del/ (ino 263) >>> # |---- item1/ (ino 261) >>> # |---- item2/ (ino 262) >>> >>> # Send snapshot: >>> # >>> # |---- a/ (ino 259) >>> # |---- b/ (ino 260) >>> # |---- c/ (ino 264) >>> # | |---- item2/ (ino 262) >>> # | >>> # |---- d/ (ino 265) >>> # |---- item1/ (ino 258) >>> >>> mkdir $SCRATCH_MNT/0 >>> mkdir $SCRATCH_MNT/1 >>> >>> mkdir $SCRATCH_MNT/a # 259 >>> mkdir $SCRATCH_MNT/b # 260 >>> mkdir $SCRATCH_MNT/item1 # 261 >>> mkdir $SCRATCH_MNT/item2 # 262 >>> mkdir $SCRATCH_MNT/del # 263 >>> mv $SCRATCH_MNT/item1 $SCRATCH_MNT/del/item1 >>> mv $SCRATCH_MNT/item2 $SCRATCH_MNT/del/item2 >>> mkdir $SCRATCH_MNT/a/c # 264 >>> mkdir $SCRATCH_MNT/b/d # 265 >>> >>> _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap1 >>> >>> mv $SCRATCH_MNT/a/c $SCRATCH_MNT/c >>> mv $SCRATCH_MNT/b/d $SCRATCH_MNT/d >>> mv $SCRATCH_MNT/del/item2 $SCRATCH_MNT/c >>> mv $SCRATCH_MNT/del/item1 $SCRATCH_MNT/d >>> rmdir $SCRATCH_MNT/del >>> >>> _run_btrfs_util_prog subvolume snapshot -r $SCRATCH_MNT $SCRATCH_MNT/mysnap2 >>> >>> run_check $FSSUM_PROG -A -f -w $tmp/1.fssum $SCRATCH_MNT/mysnap1 >>> run_check $FSSUM_PROG -A -f -w $tmp/2.fssum -x $SCRATCH_MNT/mysnap2/mysnap1 \ >>> $SCRATCH_MNT/mysnap2 >>> >>> _run_btrfs_util_prog send $SCRATCH_MNT/mysnap1 -f $tmp/1.snap >>> _run_btrfs_util_prog send -p $SCRATCH_MNT/mysnap1 $SCRATCH_MNT/mysnap2 \ >>> -f $tmp/2.snap >>> >>> _check_scratch_fs >>> >>> _scratch_unmount >>> _scratch_mkfs >/dev/null 2>&1 >>> _scratch_mount >>> >>> _run_btrfs_util_prog receive $SCRATCH_MNT -f $tmp/1.snap >>> run_check $FSSUM_PROG -r $tmp/1.fssum $SCRATCH_MNT/mysnap1 >>> >>> _run_btrfs_util_prog receive $SCRATCH_MNT -f $tmp/2.snap >>> run_check $FSSUM_PROG -r $tmp/2.fssum $SCRATCH_MNT/mysnap2 >>> >>> >>> >>> I would suggest making those hiearachy diagrams more readable - pipes >>> right below the name of their parent, continuation pipes like and >>> align all inode numbers in the same column, like the following: >>> >>> # Parent snapshot: >>> # >>> # |---- a/ (ino 259) >>> # | |---- c (ino 264) >>> # | >>> # |---- b/ (ino 260) >>> # | |---- d (ino 265) >>> # | >>> # |---- del/ (ino 263) >>> # |---- item1/ (ino 261) >>> # |---- item2/ (ino 262) >>> >>> # Send snapshot: >>> # >>> # |---- a/ (ino 259) >>> # |---- b/ (ino 260) >>> # |---- c/ (ino 264) >>> # | |---- item2/ (ino 262) >>> # | >>> # |---- d/ (ino 265) >>> # |---- item1/ (ino 258) >>> >>> (pasted here in case gmail screws up the indentation/formatting: >>> https://friendpaste.com/12wzqdcfFrlDdd1AiKX0bU) >>> >>> thanks >>> >>>> >>>> Thans! >>>> >>>> Robbie Ko >>>> >>>> 2015-06-05 0:14 GMT+08:00 Filipe David Manana <fdmanana@xxxxxxxxx>: >>>>> On Thu, Jun 4, 2015 at 12:18 PM, Robbie Ko <robbieko@xxxxxxxxxxxx> wrote: >>>>>> There's one case where we can't issue a utimes operation for a directory. >>>>>> When 263 will delete, waiting 261 and set 261 rmdir_ino, but 262 earlier >>>>>> processed and update uime between two parent directory. >>>>>> So fix this by not update non exist utimes for this case. >>>>> >>>>> So you mean that we attempt to update utimes for an inode, >>>>> corresponding to a directory, that exists in the parent snapshot but >>>>> not in the send snapshot. >>>>> >>>>> So the subject should be something like "Btrfs: incremental send, >>>>> don't send utimes for non-existing directory" instead of "Btrfs: >>>>> incremental send, fix rmdir not send utimes" >>>>> >>>>>> >>>>>> Example: >>>>>> >>>>>> Parent snapshot: >>>>>> |---- a/ (ino 259) >>>>>> |---- c (ino 264) >>>>>> |---- b/ (ino 260) >>>>>> |---- d (ino 265) >>>>>> |---- del/ (ino 263) >>>>>> |---- item1/ (ino 261) >>>>>> |---- item2/ (ino 262) >>>>>> >>>>>> Send snapshot: >>>>>> |---- a/ (ino 259) >>>>>> |---- b/ (ino 260) >>>>>> |---- c/ (ino 2) >>>>>> |---- item2 (ino 259) >>>>>> |---- d/ (ino 257) >>>>>> |---- item1/ (ino 258) >>>>>> >>>>>> Signed-off-by: Robbie Ko <robbieko@xxxxxxxxxxxx> >>>>>> --- >>>>>> fs/btrfs/send.c | 2 +- >>>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>>>> >>>>>> diff --git a/fs/btrfs/send.c b/fs/btrfs/send.c >>>>>> index e8eb3ab..46f954c 100644 >>>>>> --- a/fs/btrfs/send.c >>>>>> +++ b/fs/btrfs/send.c >>>>>> @@ -2468,7 +2468,7 @@ verbose_printk("btrfs: send_utimes %llu\n", ino); >>>>>> key.type = BTRFS_INODE_ITEM_KEY; >>>>>> key.offset = 0; >>>>>> ret = btrfs_search_slot(NULL, sctx->send_root, &key, path, 0, 0); >>>>>> - if (ret < 0) >>>>>> + if (ret != 0) >>>>>> goto out; >>>>> >>>>> So I don't think this is a good fix. The problem is in some code that >>>>> calls this function (send_utimes) against the directory that doesn't >>>>> exist - it just shouldn't do that, its logic should be fixed. >>>>> Following this approach, while it works, it's just hiding logic errors >>>>> in one or more code paths, and none of its callers checks for a return >>>>> value of 1 - they only react to values < 0 and introduces the >>>>> possibility of propagating a return value of 1 to user space. >>>>> >>>>> thanks >>>>> >>>>>> >>>>>> eb = path->nodes[0]; >>>>>> -- >>>>>> 1.9.1 >>>>>> >>>>>> -- >>>>>> To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in >>>>>> the body of a message to majordomo@xxxxxxxxxxxxxxx >>>>>> More majordomo info at http://vger.kernel.org/majordomo-info.html >>>>> >>>>> >>>>> >>>>> -- >>>>> Filipe David Manana, >>>>> >>>>> "Reasonable men adapt themselves to the world. >>>>> Unreasonable men adapt the world to themselves. >>>>> That's why all progress depends on unreasonable men." >>> >>> >>> >>> -- >>> Filipe David Manana, >>> >>> "Reasonable men adapt themselves to the world. >>> Unreasonable men adapt the world to themselves. >>> That's why all progress depends on unreasonable men." > > > > -- > Filipe David Manana, > > "Reasonable men adapt themselves to the world. > Unreasonable men adapt the world to themselves. > That's why all progress depends on unreasonable men." -- To unsubscribe from this list: send the line "unsubscribe linux-btrfs" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html
