Re: [PATCH net-next] net: take care of cloned skbs in tcp_try_coalesce()

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On 05/02/2012 09:46 AM, Eric Dumazet wrote:
> On Wed, 2012-05-02 at 09:27 -0700, Alexander Duyck wrote:
[...]
>>>>> @@ -4515,7 +4521,12 @@ copyfrags:
>>>>>  		offset = from->data - (unsigned char *)page_address(page);
>>>>>  		skb_fill_page_desc(to, skb_shinfo(to)->nr_frags,
>>>>>  				   page, offset, skb_headlen(from));
>>>>> -		*fragstolen = true;
>>>>> +
>>>>> +		if (skb_cloned(from))
>>>>> +			get_page(page);
>>>>> +		else
>>>>> +			*fragstolen = true;
>>>>> +
>>>>>  		delta = len; /* we dont know real truesize... */
>>>>>  		goto copyfrags;
>>>>>  	}
>>>>>
>>>>>
>>>> I don't see where we are now addressing the put_page call to release the
>>>> page afterwards.  By calling get_page you are incrementing the page
>>>> count by one, but where are you decrementing dataref in the shared
>>>> info?  Without that we are looking at a memory leak because __kfree_skb
>>>> will decrement the dataref but it will never reach 0 so it will never
>>>> call put_page on the head frag.
>>> really the dataref was already incremented at skb_clone() time
>>>
>>> It will be properly decremented since we call __kfree_skb()
>>>
>>> Only the last decrement will perform the put_page()
>>>
>>> Think about splice() is doing, its the same get_page() game.
>> I think you are missing the point.  So skb_clone will bump the dataref
>> to 2, calling get_page will bump the page count to 2.  After this
>> function you don't call __kfree_skb(skb) instead you call
>> kmem_cache_free(skbuff_head_cache, skb).  This will free the sk_buff,
>> but not decrement dataref leaving it at 2.  Eventually the raw socket
>> will call kfree_skb(skb) on the clone dropping the dataref to 1 and you
>> will call put_page dropping the page count to 1, but I don't see where
>> the last __kfree_skb call will come from that will drop dataref and the
>> page count to 0.
> No, you miss that _if_ we added one to page count, then we wont call
> kmem_cache_free(skbuff_head_cache, skb) but call __kfree_skb(skb)
> instead because fragstolen will be false.
>
> if (fragstolen)
> 	kmem_cache_free(...)
> else
> 	__kfree_skb(...)
>
> In future patch (addressing tcp coalescing in tcp_queue_rcv() as well),
> I'll add a helper to make this more clear.
You're correct about the fragstolen case, I actually was thinking of the
first patch you sent, not this second one.

However we still have a problem.  What we end up with now is a case of
sharing in which the clone skb no longer knows that it is sharing the
head with another skb.  The dataref will drop to 1 when we call
__kfree_skb.  This means that any other function out there that tries to
see if the skb is shared would return false.  This could lead to issues
if there is anything out there that manipulates the data in head based
on the false assumption that it is not cloned.  What we would probably
need to do in this case is tweak the logic for skb_cloned.  If you are
using a head_frag you should probably add a check that returns true if
cloned is true and page_count is greater than 1.  We should be safe in
the case of skb_header_cloned since we already dropped are dataref when
we stole the page and freed the skb.

Thanks,

Alex


--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[Index of Archives]     [Linux Kernel Discussion]     [TCP Instrumentation]     [Ethernet Bridging]     [Linux Wireless Networking]     [Linux WPAN Networking]     [Linux Host AP]     [Linux WPAN Networking]     [Linux Bluetooth Networking]     [Linux ATH6KL Networking]     [Linux Networking Users]     [Linux Coverity]     [VLAN]     [Git]     [IETF Annouce]     [Linux Assembly]     [Security]     [Bugtraq]     [Yosemite Information]     [MIPS Linux]     [ARM Linux Kernel]     [ARM Linux]     [Linux Virtualization]     [Linux IDE]     [Linux RAID]     [Linux SCSI]