[PATCH v2 0/2] netpoll: Use skb_irq_freeable to make zap_completion_queue safe.

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



Resending with the requested typo fix in the commit message and pruned
down to just the bare minimal bug fix, that this patchset is.

In some case such as trigger_all_cpu_backtrace netpoll can wind up
generating a lot of packets in hard irq context.  My rough estimate is
perhaps 1500 packets.  That is larger than any driver tx ring, which
makes netpoll_poll_dev necessary to transmit all of the netconsole
packets immediately.  Those 1500+ packets can take up a couple megabytes
of memory if we aren't careful.  On some machines that is enough to
start depleting the polls GFP_ATOMIC can dig into, so netpoll needs to
at a minimum to be able to reuse the memory for the skbs it has
transmitted.

Today this reclamation of transmitted packets happens in
zap_completetion_queue as dev_kfree_skb_irq places all packets to be
freed on a completion queue.  netpoll then searches this queue for
packets it thinks are freeable, and frees them.  Unfortunately
the current logic netpoll uses to decided a packet is freeable
is incorrect and thus unsafe :(

The logic netpoll uses to determine if a packet is freeable is to verify
a skb does not have a destructor.  Which works most of the time.  But in
pathological cases it can report that a packet is freeable in hard irq
context when it is not.

This set of changes adds a function skb_irq_freeable and uses that
function in zap_completion_queue to remove the bug.

While I don't expect this will allow anything except skbs sent by
netpoll to be freed, finding all packets that are freeable instead
of just find packets generated by netpoll that are guaranteed to be
freeable seems like the most robust and maintainable way of handling
this proble.

Eric W. Biederman (2):
      net: Add a test to see if a skb is freeable in irq context
      netpoll: Use skb_irq_freeable to make zap_completion_queue safe.

 include/linux/skbuff.h | 13 +++++++++++++
 net/core/netpoll.c     |  2 +-
 2 files changed, 14 insertions(+), 1 deletion(-)

---
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 18ef0224fb6a..113fee1b7b63 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb)
 { }
 #endif
 
+static inline bool skb_irq_freeable(struct sk_buff *skb)
+{
+       return !skb->destructor &&
+#if IS_ENABLED(CONFIG_XFRM)
+               !skb->sp &&
+#endif
+#if IS_ENABLED(CONFIG_NF_CONNTRACK)
+               !skb->nfct &&
+#endif
+               !skb->_skb_refdst &&
+               !skb_has_frag_list(skb);
+}
+
 static inline void skb_set_queue_mapping(struct sk_buff *skb, u16 queue_mapping)
 {
        skb->queue_mapping = queue_mapping;
diff --git a/net/core/netpoll.c b/net/core/netpoll.c
index ed7740f7a94d..e33937fb32a0 100644
--- a/net/core/netpoll.c
+++ b/net/core/netpoll.c
@@ -270,7 +270,7 @@ static void zap_completion_queue(void)
                while (clist != NULL) {
                        struct sk_buff *skb = clist;
                        clist = clist->next;
-                       if (skb->destructor) {
+                       if (!skb_irq_freeable(skb)) {
                                atomic_inc(&skb->users);
                                dev_kfree_skb_any(skb); /* put this one back */
                        } else {
--
To unsubscribe from this list: send the line "unsubscribe netdev" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at  http://vger.kernel.org/majordomo-info.html




[Index of Archives]     [Linux Kernel Discussion]     [TCP Instrumentation]     [Ethernet Bridging]     [Linux Wireless Networking]     [Linux WPAN Networking]     [Linux Host AP]     [Linux WPAN Networking]     [Linux Bluetooth Networking]     [Linux ATH6KL Networking]     [Linux Networking Users]     [Linux Coverity]     [VLAN]     [Git]     [IETF Annouce]     [Linux Assembly]     [Security]     [Bugtraq]     [Yosemite Information]     [MIPS Linux]     [ARM Linux Kernel]     [ARM Linux]     [Linux Virtualization]     [Linux IDE]     [Linux RAID]     [Linux SCSI]