Re: kernel panic in latest vanilla stable, while using nameif with "alive" pppoe interfaces

Michal Ostrowski wrote:
> Here's my theory on this after an initial look...
> Looking at the oops report and disassembly of the actual module binary
> that caused the oops, one can deduce that:
> Execution was in pppoe_flush_dev().  %ebx contained the pointer "struct
> pppox_sock *po", which is what we faulted on, executing "cmp %eax, 0x190(%ebx)".
> %ebx value was 0xffffffff (hence we got "NULL pointer dereference at 0x18f").
> At this point "i" (stored in %esi) is 15 (valid), meaning that we got a value
> of 0xffffffff in pn->hash_table[i].
> From this I'd hypothesize that the combination of dev_put() and release_sock()
> may have allowed us to free "pn".  At the bottom of the loop we already
> recognize that since locks are dropped we're responsible for handling
> invalidation of objects, and perhaps that should be extended to "pn" as well.
> --
> Michal Ostrowski
> mostrows@xxxxxxxxx
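
The address in the oops follows from 32-bit wraparound: a base of 0xffffffff
plus the 0x190 displacement wraps to 0x18f. A tiny sketch of that arithmetic
(fault_address() is a hypothetical helper for illustration, not anything in
the driver):

```c
#include <stdint.h>

/* Hypothetical helper: effective address of "disp(%reg)" on a 32-bit CPU.
 * Unsigned 32-bit addition wraps modulo 2^32, as the hardware does. */
static uint32_t fault_address(uint32_t base, uint32_t disp)
{
    return base + disp;
}
```

With base 0xffffffff and displacement 0x190 this yields 0x18f, which is why
the kernel reported the fault as a NULL pointer dereference even though %ebx
was not zero.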

Looking at this stuff, I do believe flush_lock protection is not
properly done.

At the end of pppoe_connect(), for example, we find:

        if (po->pppoe_dev) {
                po->pppoe_dev = NULL;

This is done without any protection, and can therefore race with
pppoe_flush_dev():

	po->pppoe_dev = NULL; /* pppoe_dev can already be NULL before this point */

	dev_put(dev);    /* oops */
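
One possible way to close this window, sketched in userspace and not meant as
the actual fix: read and clear the pointer under a single lock, so that only
the path that really detached the device drops the reference. Everything
below is an assumption for illustration. struct fake_dev, struct fake_sock,
fake_dev_put(), sock_detach_dev() and sock_release_dev() are hypothetical
names, and a pthread mutex stands in for whatever lock the driver would use.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-in for a refcounted net device. */
struct fake_dev {
    int refcnt;
};

/* Hypothetical stand-in for the socket holding a device reference. */
struct fake_sock {
    pthread_mutex_t lock;
    struct fake_dev *dev;
};

/* Drop one reference; the assert catches a double put. */
static void fake_dev_put(struct fake_dev *dev)
{
    assert(dev->refcnt > 0);
    dev->refcnt--;
}

/* Read and clear the pointer in one critical section. Only one caller
 * can ever get a non-NULL result for a given attachment. */
static struct fake_dev *sock_detach_dev(struct fake_sock *po)
{
    struct fake_dev *dev;

    pthread_mutex_lock(&po->lock);
    dev = po->dev;
    po->dev = NULL;
    pthread_mutex_unlock(&po->lock);
    return dev;
}

/* Both the connect path and the flush path would go through here.
 * Returns 1 if this caller dropped the reference, 0 if someone else
 * had already detached the device. */
static int sock_release_dev(struct fake_sock *po)
{
    struct fake_dev *dev = sock_detach_dev(po);

    if (dev == NULL)
        return 0;
    fake_dev_put(dev);
    return 1;
}
```

Calling sock_release_dev() twice on the same socket drops the reference once;
the second call sees NULL and does nothing, instead of putting the device a
second time.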
