Resending with the requested typo fix in the commit message and pruned down to just the bare minimal bug fix, that this patchset is. In some case such as trigger_all_cpu_backtrace netpoll can wind up generating a lot of packets in hard irq context. My rough estimate is perhaps 1500 packets. That is larger than any driver tx ring, which makes netpoll_poll_dev necessary to transmit all of the netconsole packets immediately. Those 1500+ packets can take up a couple megabytes of memory if we aren't careful. On some machines that is enough to start depleting the polls GFP_ATOMIC can dig into, so netpoll needs to at a minimum to be able to reuse the memory for the skbs it has transmitted. Today this reclamation of transmitted packets happens in zap_completetion_queue as dev_kfree_skb_irq places all packets to be freed on a completion queue. netpoll then searches this queue for packets it thinks are freeable, and frees them. Unfortunately the current logic netpoll uses to decided a packet is freeable is incorrect and thus unsafe :( The logic netpoll uses to determine if a packet is freeable is to verify a skb does not have a destructor. Which works most of the time. But in pathological cases it can report that a packet is freeable in hard irq context when it is not. This set of changes adds a function skb_irq_freeable and uses that function in zap_completion_queue to remove the bug. While I don't expect this will allow anything except skbs sent by netpoll to be freed, finding all packets that are freeable instead of just find packets generated by netpoll that are guaranteed to be freeable seems like the most robust and maintainable way of handling this proble. Eric W. Biederman (2): net: Add a test to see if a skb is freeable in irq context netpoll: Use skb_irq_freeable to make zap_completion_queue safe. include/linux/skbuff.h | 13 +++++++++++++ net/core/netpoll.c | 2 +- 2 files changed, 14 insertions(+), 1 deletion(-) --- diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h index 18ef0224fb6a..113fee1b7b63 100644 --- a/include/linux/skbuff.h +++ b/include/linux/skbuff.h @@ -2833,6 +2833,19 @@ static inline void skb_init_secmark(struct sk_buff *skb) { } #endif +static inline bool skb_irq_freeable(struct sk_buff *skb) +{ + return !skb->destructor && +#if IS_ENABLED(CONFIG_XFRM) + !skb->sp && +#endif +#if IS_ENABLED(CONFIG_NF_CONNTRACK) + !skb->nfct && +#endif + !skb->_skb_refdst && + !skb_has_frag_list(skb); +} + static inline void skb_set_queue_mapping(struct sk_buff *skb, u16 queue_mapping) { skb->queue_mapping = queue_mapping; diff --git a/net/core/netpoll.c b/net/core/netpoll.c index ed7740f7a94d..e33937fb32a0 100644 --- a/net/core/netpoll.c +++ b/net/core/netpoll.c @@ -270,7 +270,7 @@ static void zap_completion_queue(void) while (clist != NULL) { struct sk_buff *skb = clist; clist = clist->next; - if (skb->destructor) { + if (!skb_irq_freeable(skb)) { atomic_inc(&skb->users); dev_kfree_skb_any(skb); /* put this one back */ } else { -- To unsubscribe from this list: send the line "unsubscribe netdev" in the body of a message to majordomo@xxxxxxxxxxxxxxx More majordomo info at http://vger.kernel.org/majordomo-info.html