- Subject: Re: [PATCH v3 21/22] netoops: Add user-programmable boot_id
- From: Matt Mackall <mpm@xxxxxxxxxxx>
- Date: Tue, 14 Dec 2010 16:47:44 -0600
- Cc: simon.kagstrom@xxxxxxxxxxxxxx, davem@xxxxxxxxxxxxx, nhorman@xxxxxxxxxxxxx, adurbin@xxxxxxxxxx, linux-kernel@xxxxxxxxxxxxxxx, chavey@xxxxxxxxxx, Greg KH <greg@xxxxxxxxx>, netdev@xxxxxxxxxxxxxxx, Américo Wang <xiyou.wangcong@xxxxxxxxx>, akpm@xxxxxxxxxxxxxxxxxxxx, linux-api@xxxxxxxxxxxxxxx
- In-reply-to: <AANLkTineeFcSNx09P=2SZGcBHeq5p_LQ54nU=uGVv_Ck@xxxxxxxxxxxxxx>
- References: <20101214212846.17022.64836.stgit@xxxxxxxxxxxxxxxxxxxxxxxx> <20101214213048.17022.58746.stgit@xxxxxxxxxxxxxxxxxxxxxxxx> <1292362957.3446.851.camel@calx> <AANLkTi=Q+LaS7yaU_4ThkhMUM-q8S3JTuwMWqAikQPgH@xxxxxxxxxxxxxx> <1292364378.3446.854.camel@calx> <AANLkTineeFcSNx09P=2SZGcBHeq5p_LQ54nU=uGVv_Ck@xxxxxxxxxxxxxx>
On Tue, 2010-12-14 at 14:33 -0800, Mike Waychison wrote:
> On Tue, Dec 14, 2010 at 2:06 PM, Matt Mackall <mpm@xxxxxxxxxxx> wrote:
> > On Tue, 2010-12-14 at 13:59 -0800, Mike Waychison wrote:
> >> On Tue, Dec 14, 2010 at 1:42 PM, Matt Mackall <mpm@xxxxxxxxxxx> wrote:
> >> > On Tue, 2010-12-14 at 13:30 -0800, Mike Waychison wrote:
> >> >> Add support for letting userland define a 32bit boot id. This is useful
> >> >> for users to be able to correlate netoops reports to specific boot
> >> >> instances offline.
> >> >
> >> > This sounds a lot like the pre-existing /proc/sys/kernel/random/boot_id
> >> > that's used by kerneloops.org.
> >>
> >> Could be. I'm looking at it now... There is no documentation for this
> >> boot_id field?
> >
> > Probably not. It's just a random number generated at boot.
> >
> >> Reusing this guy would work, except that it doesn't appear to allow
> >> arbitrary values to be set. We need to inject our boot sequence
> >> number (which is figured out in userland) in the packet somehow as we
> >> need to correlate it to our other monitoring systems.
> >
> > What happens if you oops before userspace is available?
> >
>
> Either one of two general cases:
> - The crash is a one-off and the machine comes back. The boot
> number sequence will see a hole in it, which is a clue that something
> bad happened.
> - The machine is in a crash loop. This has the same failure mode
> for us as if the machine never made it onto the network due to
> whatever reason: bad cables, bad firmware, bad ram, ...
>
> In both cases, we can detect that something is wrong and handle it.
> Note that our firmware is responsible for incrementing the boot
> sequence at bootup, which is why the above works. In general though,
> our machines do make it up to userland -- staying alive once booted is
> the hard part ;)

Interesting. Is this Google-specific firmware magic? I'd probably accept
a hook in random.c to fold a number into the UUID, which would unify
things.
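
To make the idea concrete, here is a rough userland sketch of what "folding a number into the UUID" could look like. It is only an illustration: fold_sequence() and its XOR-into-the-low-32-bits rule are assumptions made up for this example, not an existing random.c interface, and the real hook might well combine the values differently. The only thing taken as given is the existing /proc/sys/kernel/random/boot_id file.

```python
import uuid

# Existing procfs file holding the random per-boot UUID.
BOOT_ID_PATH = "/proc/sys/kernel/random/boot_id"


def read_boot_id(path=BOOT_ID_PATH):
    """Return the kernel's per-boot random UUID as a uuid.UUID."""
    with open(path) as f:
        return uuid.UUID(f.read().strip())


def fold_sequence(boot_uuid, boot_seq):
    """Hypothetical fold: XOR a 32-bit boot sequence number into the
    low 32 bits of the UUID. The operation is its own inverse, so a
    collector that knows the sequence number can undo it."""
    if not 0 <= boot_seq < 2**32:
        raise ValueError("boot_seq must fit in 32 bits")
    return uuid.UUID(int=boot_uuid.int ^ boot_seq)
```

On a running Linux box, fold_sequence(read_boot_id(), 1234) would give a single identifier carrying both the random boot_id and the firmware-maintained sequence number (1234 is a placeholder value here).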
--
Mathematics is the supreme nostalgia of our time.