- To: NeilBrown <neilb@xxxxxxx>
- Subject: Re: linux-image-2.6.32-5-686: kernel BUG at ... build/source_i386_none/drivers/md/raid5.c:2764!
- From: Christian Balzer <chibi@xxxxxxx>
- Date: Mon, 25 Jun 2012 15:55:14 +0900
- Cc: linux-raid@xxxxxxxxxxxxxxx, Jose Manuel dos Santos Calhariz <jose.spam@xxxxxxxxxxx>
- In-reply-to: <20120625164230.2ba8f72c@notabene.brown>
- Organization: FusionGOL
On Mon, 25 Jun 2012 16:42:30 +1000 NeilBrown wrote:
> On Mon, 25 Jun 2012 11:58:33 +0900 Christian Balzer <chibi@xxxxxxx>
> wrote:
>
> > On Mon, 25 Jun 2012 12:39:06 +1000 NeilBrown wrote:
> >
> > > On Sun, 24 Jun 2012 18:02:34 +0100 Jose Manuel dos Santos Calhariz
> > > <jose.spam@xxxxxxxxxxx> wrote:
> > >
> > > > On Sun, Jun 24, 2012 at 06:21:46PM +1000, NeilBrown wrote:
> > > > > On Fri, 22 Jun 2012 13:19:53 +0100 Jose Manuel dos Santos
> > > > > Calhariz <jose.spam@xxxxxxxxxxx> wrote:
> > > > >
> > > > > >
> > > > > > In another day during the periodic mdadm RAID check:
> > > > > > - the linux kernel gave a kernel BUG,
> > > > > > - tried to kick out a failed disk and
> > > > > > - stopped accepting I/O to the affected raid.
> > > > > >
> > > > > > The affected programs were in state D. The only way to recover
> > > > > > was to do a reboot. After reboot the problematic disk was
> > > > > > replaced.
> > > > > >
> > > > > > > I reported the bug to Debian, and all the information about it
> > > > > > > is there:
> > > > > >
> > > > > > http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=675969
> > > > > >
> > > > > > I was asked to report the BUG here in case someone knows what
> > > > > > happened.
> > > > > >
> > > > > > Here is a summary of the more relevant information:
> > > > > >
> > > > > > > This machine has 2 x RAID6 with 6 disks each, for a total of
> > > > > > > 12 disks.
> > > > > > >
> > > > > > > I have 5 systems with a similar setup and only one failed,
> > > > > > > maybe because of the failing disk. I will use one of the
> > > > > > > systems to try to reproduce the bug, before trying a new
> > > > > > > kernel.
> > > > > >
> > > > > >
> > > > > > The proprietary module is the openafs filesystem v1.6.1
> > > > > > backported from Debian testing.
> > > > > >
> > > > > > The kernel bug is:
> > > > > >
> > > > > >
> > > > > > build/source_i386_none/drivers/md/raid5.c:2764!
> > > >
> > > > >
> > > > > This bug was fixed in 2.6.32.49 and 3.2
> > > > >
> > > > > http://git.kernel.org/?p=linux/kernel/git/stable/linux-stable.git;a=commitdiff;h=61d433c479a6ccfed6a7e73e6111ca8fa0348c63
> > > > >
> > > > > http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=9a3f530f39f4490eaa18b02719fb74ce5f4d2d86
> > > > >
> > > > > NeilBrown
> > > >
> > > > The failing kernel had that fix already. The machine was running
> > > > the Debian kernel 2.6.32-41squeeze2. Looking at the changelog,
> > > > this kernel has all the fixes up to 2.6.32.51 plus other fixes.
> > > >
> > > > Jose Calhariz
> > > >
> > >
> > > The oops report said:
> > >
> > > (2.6.32-5-686 #1)
> > >
> > > is "5" the same as "41squeeze2" ??? This is a genuine question - I
> > > have little idea about Debian versioning so maybe these are the same
> > > thing somehow. But they look different.
> > >
> > Yes, the "name" of the kernel and its actual detailed version are
> > disjunct like that in Debian; the current kernel of that vintage is:
> > ---
> > Package: linux-image-2.6.32-5-amd64
> > Source: linux-2.6
> > Version: 2.6.32-44
> > ---
>
> Ok.
> So the version number reported by "uname -a" doesn't change when you
> upgrade a Debian kernel? That's rather sad.
It kinda does: the -5 part is the version bit that will increase for
each significant release, but it doesn't reflect the more detailed
package version:
---
engtest01:~# uname -a
Linux engtest01 2.6.32-5-686 #1 SMP Mon Oct 3 04:15:24 UTC 2011 i686 GNU/Linux
engtest01:~# cat /proc/version
Linux version 2.6.32-5-686 (Debian 2.6.32-38) (ben@xxxxxxxxxxxxxxx) (gcc version 4.3.5 (Debian 4.3.5-4) ) #1 SMP Mon Oct 3 04:15:24 UTC 2011
---
So while I have the 2.6.32-45 kernel installed on the machine above, it
has not been rebooted for 220 days and still runs the -38 incarnation of
the -5 kernel, as uname calls it. And yes, that can be confusing.
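
As a rough sketch of how to tell the two apart on a running box, the
following pulls both the uname-style name and the package version out of a
/proc/version line. It assumes the "(Debian <version>)" tag has the format
shown in the output above, which may not hold for other distributions or
custom builds, and parse_proc_version is just a name I made up for this:

```python
import re

def parse_proc_version(text):
    """Return (kernel_name, package_version) from a /proc/version line.

    kernel_name is what uname reports (e.g. 2.6.32-5-686).
    package_version is the first "(Debian X)" tag, or None if absent.
    Assumption: the tag format matches the sample output quoted above.
    """
    name = re.search(r"Linux version (\S+)", text)
    # Require a leading digit so only version-like tags match; the first
    # such tag is the kernel package, the later one belongs to gcc.
    pkg = re.search(r"\(Debian (\d[^)\s]*)\)", text)
    return (name.group(1) if name else None,
            pkg.group(1) if pkg else None)

sample = ("Linux version 2.6.32-5-686 (Debian 2.6.32-38) "
          "(ben@xxxxxxxxxxxxxxx) (gcc version 4.3.5 (Debian 4.3.5-4) ) "
          "#1 SMP Mon Oct 3 04:15:24 UTC 2011")
print(parse_proc_version(sample))
```

On a live system one would feed it the contents of /proc/version instead of
the sample string and compare the second value with what the package
manager lists as installed.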
Regards,
Christian
> It means that one has to take the reporter's word for which kernel was
> running rather than looking in the oops message, where the kernel
> tells me what version it was.
>
> Given the report, it is entirely possible that an older kernel was
> running while a newer kernel was installed.
>
> Jose: how certain are you that the kernel that was running at the time
> was exactly the kernel that was installed at the time, i.e. that you had
> not performed a software update since the last reboot?
>
> However even if you can confirm that a new kernel was running I doubt I
> could find an answer. There isn't really much info to go on. So unless
> you can reproduce the problem, I doubt I'll even start looking.
>
> NeilBrown
--
Christian Balzer Network/Systems Engineer
chibi@xxxxxxx Global OnLine Japan/Fusion Communications
http://www.gol.com/