Re: Please hammer my for-linus branch

On Wed, Jul 04, 2012 at 12:53:54AM -0600, Daniel J Blueman wrote:
> On 4 July 2012 13:19, Liu Bo <liubo2009@xxxxxxxxxxxxxx> wrote:
> > On 07/04/2012 11:37 AM, Daniel J Blueman wrote:
> >>> Hi everyone,
> >>>
> >>> I've got a nice set of fixes from Josef, Jan, Ilya and others in my
> >>> for-linus branch:
> >>>
> >>> git://git.kernel.org/pub/scm/linux/kernel/git/mason/linux-btrfs.git for-linus
> >>>
> >>> Some of the changes are fixes for the tree logging code, so I ran some
> >>> extra crash runs against them Friday night.
> >>>
> >>> I ended up with a new crash in the tree log directory deletion replay
> >>> code, so I didn't send out the pull request to Linus.
> >>>
> >>> It isn't clear yet if the new crash is because I was testing differently
> >>> or if it is a regression.  I'm nailing it down this weekend, but please
> >>> give my for-linus a shot.
> >>
> >> I consistently run into this assertion [1] while running a fio
> >> workload on a fresh RAID10 filesystem with a balance running.
> >>
> >> Let me know if you need steps to reproduce, debug etc.
> >
> > Seems that additional condition does not catch the bug.
> >
> > Plz show us the steps to reproduce, I'll try to reproduce it locally and nail it down.
> 
> The reproducer auto-generated from my test [1] consistently hits the
> spot here; config @ http://quora.org/2012/kconfig-btrfs . You'll need
> the fio workload file [2] in the same dir.
> 

Well, that was a huge pain in the ass.  Arne, you are going to have to tell me
how to fix this or fix it yourself.  The problem was introduced by this commit:

00f04b88791ff49dc64ada18819d40a5b0671709

The problem is that we no longer merge delayed refs on the fs trees, and
somehow we end up with this sequence of events:

alloc block
add backref for some random block
remove implicit backref
add implicit backref back <-- I'm not entirely sure why/how this happens, I just
			assume it's some relocate magic
run refs

Because of the seq comparison, the remove and the add stay separate entries
instead of being merged, so we go to add the implicit backref and panic because
we find there is already one there, and that's not supposed to happen with tree
blocks.  If we had run the remove first we would have been fine, or if we had
just merged the delayed refs they would have cancelled each other out and we
would have been fine.

To test this theory I took the seq comparisons out of comp_entry in
delayed-refs.c.  The test has now been running for about 20 minutes; before, it
would die in less than 30 seconds.  So why is the seq comparison needed?  I
assume you need it for something, but I figure it's easier for you to fix this
than for me to go figure out what it's used for.  Thanks,

Josef
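
The sequence of events described above can be sketched as a toy model.  This is
not the btrfs code: the names merge, run_refs and DuplicateBackref are made up
for illustration, backrefs are reduced to plain strings, and the model assumes
that unmerged adds get run before unmerged drops.

```python
from collections import Counter


class DuplicateBackref(Exception):
    """Stands in for the assertion hit when a backref already exists."""


def merge(refs):
    """Cancel matching add/drop pairs for the same backref (hypothetical helper)."""
    net = Counter()
    for action, key in refs:
        net[key] += 1 if action == "add" else -1
    merged = []
    for key, count in net.items():
        action = "add" if count > 0 else "drop"
        # A net count of zero contributes nothing: the pair cancelled out.
        merged.extend([(action, key)] * abs(count))
    return merged


def run_refs(existing, refs, adds_first=True):
    """Apply queued refs to the set of backrefs that already exist.

    adds_first=True models the assumption that adds run before drops;
    adds_first=False runs the refs in the order they were queued.
    """
    backrefs = set(existing)
    if adds_first:
        # sorted() is stable, so adds keep their relative order.
        refs = sorted(refs, key=lambda ref: ref[0] != "add")
    for action, key in refs:
        if action == "add":
            if key in backrefs:
                raise DuplicateBackref(key)
            backrefs.add(key)
        else:
            backrefs.discard(key)
    return backrefs


# "alloc block" leaves one implicit backref in place, then three refs queue up.
existing = {"implicit"}
queued = [("add", "other"), ("drop", "implicit"), ("add", "implicit")]

# Merged, the drop and add of the implicit backref cancel each other out.
after_merge = run_refs(existing, merge(queued))

# Unmerged but run in queue order, the remove goes first and all is well.
after_in_order = run_refs(existing, queued, adds_first=False)

# Unmerged with adds first, the add finds the backref already there.
try:
    run_refs(existing, queued)
    hit_duplicate = False
except DuplicateBackref:
    hit_duplicate = True
```

In this model both the merged run and the in-order run finish with the implicit
backref plus the new one, while the unmerged adds-first run trips over the
existing implicit backref, which matches the two "we would have been fine"
cases described above.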