Re: [PATCH] netpoll: Don't call driver methods from interrupt context.

David Miller <davem@xxxxxxxxxxxxx> writes:

> From: ebiederm@xxxxxxxxxxxx (Eric W. Biederman)
> Date: Tue, 04 Mar 2014 16:03:43 -0800
>
>> So I would like some clear guidance.  Will you accept patches to make
>> it safe to call the napi poll routines from hard irq context, or should
>> we simply defer messages presented with netconsole in hard irq context
>> into another context where we can run the napi code?
>> 
>> If there is not a clear way to fix the problems that crop up we should
>> just delete all of the netpoll code altogether, as it seems deadly in
>> its current form.
>
> Clearly to make netconsole most useful we should synchronously emit
> log messages.
>
> Because what if the system hangs right after this event, but before
> we get back to a "safe" context.
>
> That's one bug that will be a billion times harder to diagnose if
> we defer.

In general I agree.  

The gripping hand for me is kernel/rcu/tree.c:print_cpu_stall() that
generates a warning from irq context on every cpu simultaneously.

Without netpoll I can debug that by just logging into the machine and
dumping dmesg, but with netpoll the machine dies when the warning is
generated, because after the first few messages each additional
message generates a new message.

Now that I have looked closer, the printk-generating-a-printk problem
seems to be something that is best solved at the printk level.  So if
you will accept the patches, I will proceed to shore up the existing
netpoll implementations.
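
As a rough illustration of that feedback loop, and of one way it could
be cut off in the logging layer, here is a toy user-space model in C.
It is not kernel code and not a proposed patch: log_msg(), transmit(),
run_model() and MAX_DEPTH are hypothetical names, and the simple
reentrancy flag is only an assumption about what a printk-level fix
might look like.

```c
#include <stdbool.h>
#include <stdio.h>

/* Toy user-space model; every name here is hypothetical. */

#define MAX_DEPTH 8	/* cap so the unguarded model terminates */

static int emitted;	/* messages that reached the "wire" */
static int dropped;	/* nested messages suppressed by the guard */
static bool in_emit;	/* true while a message is being transmitted */
static bool use_guard;	/* whether the reentrancy guard is active */

static void log_msg(const char *msg, int depth);

/* Stand-in for the transmit path: sending a message raises a warning. */
static void transmit(const char *msg, int depth)
{
	(void)msg;
	emitted++;
	if (depth < MAX_DEPTH)
		log_msg("warning raised while transmitting", depth + 1);
}

/* Stand-in for the logging entry point. */
static void log_msg(const char *msg, int depth)
{
	/* With the guard on, a message logged from inside transmit()
	 * is dropped instead of being transmitted recursively. */
	if (use_guard && in_emit) {
		dropped++;
		return;
	}
	in_emit = true;
	transmit(msg, depth);
	in_emit = false;
}

/* Log one initial warning; return how many messages were transmitted. */
int run_model(bool guard)
{
	emitted = 0;
	dropped = 0;
	in_emit = false;
	use_guard = guard;
	log_msg("stall warning", 0);
	return emitted;
}
```

In this model run_model(false) transmits MAX_DEPTH + 1 messages for a
single original warning, while run_model(true) transmits one message
and drops the nested one.  A real fix would also have to deal with
per-cpu state and with messages that must not be lost, which this
sketch ignores.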

I am thinking pretty seriously about forcing hard irq context during
netconsole's use of netpoll to ensure that the hard irq context case
actually gets tested.  I need to do some audits to see if that would
cause any side effects beyond leaving irqs disabled.

diff --git a/drivers/net/netconsole.c b/drivers/net/netconsole.c
index ba2f5e710af1..aaa9062061c8 100644
--- a/drivers/net/netconsole.c
+++ b/drivers/net/netconsole.c
@@ -734,6 +734,7 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
        unsigned long flags;
        struct netconsole_target *nt;
        const char *tmp;
+       bool hard_irq;
 
        if (oops_only && !oops_in_progress)
                return;
@@ -742,6 +743,9 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                return;
 
        spin_lock_irqsave(&target_list_lock, flags);
+       hard_irq = in_irq();
+       if (!hard_irq)
+               irq_enter();
        list_for_each_entry(nt, &target_list, list) {
                netconsole_target_get(nt);
                if (nt->enabled && netif_running(nt->np.dev)) {
@@ -761,6 +765,8 @@ static void write_msg(struct console *con, const char *msg, unsigned int len)
                }
                netconsole_target_put(nt);
        }
+       if (!hard_irq)
+               irq_exit();
        spin_unlock_irqrestore(&target_list_lock, flags);
 }


Eric

