On Tue, 2012-03-27 at 09:47 -0700, Ben Greear wrote: > On 03/26/2012 04:39 PM, Eric Dumazet wrote: > > On Mon, 2012-03-26 at 16:06 -0700, Ben Greear wrote: > >> On 03/26/2012 02:53 PM, Ben Greear wrote: > >>> On 03/26/2012 02:49 PM, David Miller wrote: > >>>> > >>>> Looks like all of those strange undiagnosable reported Dave Jones > >>>> has been feeding us. Something in one part of the kernel leaves > >>>> a lock held, and this shows up as a warning elsewhere. > >>> > >>> Every (initial) bug printout fingers ipv6 and the 'ip' tool on my system. > >> > >> I added a patch to convert rcu_read_lock/unlock to macros so > >> that I could automatically grab the call site (_THIS_IP_) > >> and pass it into the lockdep framework instead of the (useless) > >> _THIS_IP_ in the old rcu_read_lock method which at best seems to > >> only indicate which module the issue relates to... > > > > Hi Ben > > > > Is this problem also appears with current tree ? > > (This could be a problem with the backport, as it was full of > > dependencies) > > > > Also, if you use a patch to better track rcu_read_lock()/unlock(), you > > could add new macros as well to track that a particular unlock() matches > > one given lock(). (maybe returning the rcu_preempt_depth at > > rcu_read_lock() time , but maybe a more absolute ref would be better) > > > > So we could have a warning if an unlock() doesnt match the lock() > > > > inet6_dump_fib () was already a suspect but we could not find why. > > > Ok, I tried the patch below, and got the result farther down. Is this > what you were thinking of? (The lockdep warning about rcu lock still > held happened immediately after this..so it appears the depth mis-match > does represent this problem... > > > [greearb@fs3 linux-3.0.dev.y]$ git diff > diff --git a/net/ipv6/ip6_fib.c b/net/ipv6/ip6_fib.c > index 0f9b37a..ae3c7c9 100644 > --- a/net/ipv6/ip6_fib.c > +++ b/net/ipv6/ip6_fib.c > @@ -366,6 +366,7 @@ static int inet6_dump_fib(struct sk_buff *skb, struct netlink_callback *cb) > struct hlist_node *node; > struct hlist_head *head; > int res = 0; > + int depth = current->lockdep_depth; > > s_h = cb->args[0]; > s_e = cb->args[1]; > @@ -410,6 +411,8 @@ next: > } > out: > rcu_read_unlock(); > + WARN(depth != current->lockdep_depth, "depth: %i lockdep-depth: %i\n", > + depth, current->lockdep_depth); > cb->args[1] = e; > cb->args[0] = h; > > > > ------------[ cut here ]------------ > WARNING: at /home/greearb/git/linux-3.0.dev.y/net/ipv6/ip6_fib.c:415 inet6_dump_fib+0x25c/0x292 [ipv6]() > Hardware name: To be filled by O.E.M. > depth: 1 lockdep-depth: 2 > Modules linked in: 8021q garp stp llc fuse macvlan pktgen coretemp hwmon sunrpc ipv6 uinput arc4 ath9k snd_hda_codec_realtek mac80211 snd_hda_intel > snd_hda_codec snd_hwdep snd_seq ath9k_common ath9k_hw snd_seq_device snd_pcm ath snd_timer e1000e cfg80211 snd mei(C) ppdev microcode i2c_i801 iTCO_wdt > soundcore serio_raw pcspkr snd_page_alloc iTCO_vendor_support parport_pc parport i915 drm_kms_helper drm i2c_algo_bit i2c_core video [last unloaded: scsi_wait_scan] > Pid: 6563, comm: ip Tainted: G C 3.0.25+ #16 > Call Trace: > [<ffffffff81046866>] warn_slowpath_common+0x80/0x98 > [<ffffffff81046912>] warn_slowpath_fmt+0x41/0x43 > [<ffffffffa0251a3a>] inet6_dump_fib+0x25c/0x292 [ipv6] > [<ffffffff813af450>] netlink_dump+0x5b/0x19b > [<ffffffff81385da2>] ? consume_skb+0x28/0x2a > [<ffffffff813af7bf>] netlink_recvmsg+0x1c7/0x2f8 > [<ffffffff8137c6cf>] __sock_recvmsg_nosec+0x65/0x6e > [<ffffffff8137dde0>] __sock_recvmsg+0x49/0x54 > [<ffffffff8137e349>] sock_recvmsg+0xa6/0xbf > [<ffffffff81072bf8>] ? lock_release_non_nested+0x9d/0x227 > [<ffffffff810ca002>] ? might_fault+0x4e/0x9e > [<ffffffff810ca04b>] ? might_fault+0x97/0x9e > [<ffffffff81387cae>] ? copy_from_user+0x2a/0x2c > [<ffffffff810ca002>] ? might_fault+0x4e/0x9e > [<ffffffff81388080>] ? verify_iovec+0x4f/0xa3 > [<ffffffff8137e0c4>] __sys_recvmsg+0x147/0x21e > [<ffffffff81063868>] ? up_read+0x1e/0x36 > [<ffffffff810fc9fb>] ? fcheck_files+0xb7/0xee > [<ffffffff810fcb30>] ? fget_light+0x3b/0xbc > [<ffffffff8137e8a0>] sys_recvmsg+0x3d/0x5b > [<ffffffff81450e92>] system_call_fastpath+0x16/0x1b > ---[ end trace 5232c09c4fb31d15 ]--- > > > I found the bug in rt6_fill_node() will send a patch in a couple of minutes -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html