On 03/03/2014 03:40 PM, Florian Westphal wrote: > Nikolay Aleksandrov <nikolay@xxxxxxxxxx> wrote: >> I stumbled upon this very serious bug while hunting for another one, >> it's a very subtle race condition between inet_frag_evictor, >> inet_frag_intern and the IPv4/6 frag_queue and expire functions (basically >> the users of inet_frag_kill/inet_frag_put). >> What happens is that after a fragment has been added to the hash chain but >> before it's been added to the lru_list (inet_frag_lru_add), it may get >> deleted (either by an expired timer if the system load is high or the >> timer sufficiently low, or by the fraq_queue function for different >> reasons) before it's added to the lru_list > > Sorry. Not following here, see below. > >> diff --git a/net/ipv4/inet_fragment.c b/net/ipv4/inet_fragment.c >> index bb075fc9a14f..322dcebfc588 100644 >> --- a/net/ipv4/inet_fragment.c >> +++ b/net/ipv4/inet_fragment.c >> @@ -278,9 +278,10 @@ static struct inet_frag_queue *inet_frag_intern(struct netns_frags *nf, >> >> atomic_inc(&qp->refcnt); >> hlist_add_head(&qp->list, &hb->chain); >> + inet_frag_lru_add(nf, qp); >> spin_unlock(&hb->chain_lock); >> read_unlock(&f->lock); > > If I understand correctly your're saying that qp can be free'd on > another/cpu timer right after dropping the locks. But how is it > possible? > > ->refcnt is bumped above when arming the timer (before dropping chain > lock), so even if the frag_expire timer fires instantly it should not > free qp. > > What am I missing? > > Thanks, > Florian > inet_frag_kill when called from the IPv4/6 frag_queue function will remove the timer refcount, then inet_frag_put afterwards will drop it to 0 and free it and all of this could happen before the frag was ever added to the LRU list, then it gets added. This happens much easier for IPv6 because of the dropping of overlapping fragments in its frag_queue function, the point is we need to have the timer's refcount removed in any way (it could be the timer itself - there's an inet_frag_put in the end, or much easier by the frag_queue function). I think I've explained it badly, I hope this makes it clearer :-) Cheers, Nik -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html