kexec / kdump support for powerpc


 



On 2/14/07, Luke Browning <LukeBrowning at us.ibm.com> wrote:
>
> Hello,
>
> I have spent the last two weeks debugging kexec and kdump on the cell
> platform.   I got my kernels to boot on the Cell system and the Cell
> simulator, but I am baffled as the changes that I made are in the common
> powerpc code.   If I am right, it doesn't work on powerpc.   I must have
> gone astray...

Hi Luke,

The good news is you're right, you did go astray :)

> First, I am loading my kernel using
>
> kexec -l vmlinux         or          kexec -l vmlinux --append="maxcpus=0"

That's OK, except for the maxcpus=0 part. I'd suggest you don't use
maxcpus at all; it's not well tested, and I'm pretty sure it doesn't
work on the IBM cell blade, for example.

> This results in a bad start pointer.  Specifically,
>
> image->start = image->memory[1].mem
>
> The entire kernel is loaded into image->memory[0].mem and I don't know what
> is supposed to be in second and third memory segments, perhaps glue code,
> but whatever it is it doesn't work.   The system hangs executing code in the
> second region.
>
> I changed the kernel, so that
>
> image->start = image->memory[0] + KERNELBASE.

This is no good. You can't jump straight from the first kernel into
the second, you have to go through purgatory.

If you grab a copy of the kexec-tools source
(git://git.kernel.org/pub/scm/linux/kernel/git/horms/kexec-tools-testing.git)
and look in purgatory/arch/ppc64/, you'll find v2wrap.S.

v2wrap.S is the kexec boot wrapper (version 2), a.k.a. purgatory. It
starts with the following comments:

# calling convention:
#   r3 = physical number of this cpu (all cpus)
#   r4 = address of this chunk (master only)

# Invokes ppc64 kernel with the expected arguments
# of kernel(device-tree, phys-offset, 0)

Which is exactly what you discovered needs to be done :D
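
To make the difference concrete, here is a rough C model of the two
paths. It is only a sketch, not code from kexec-tools: the struct and
function names are made up for illustration, and it assumes the first
kernel always enters image->start with the convention quoted above and
that the loader recorded the device-tree address and phys-offset
somewhere purgatory can find them.

```c
#include <assert.h>
#include <stdint.h>

/* Registers on entry to image->start, per the v2wrap.S comment. */
struct purgatory_entry {
    uint64_t r3;   /* physical number of this cpu (all cpus) */
    uint64_t r4;   /* address of this chunk (master only) */
};

/* What the ppc64 kernel expects: kernel(device-tree, phys-offset, 0). */
struct kernel_args {
    uint64_t device_tree;
    uint64_t phys_offset;
    uint64_t zero;
};

/* Hypothetical: values assumed to have been recorded at load time. */
struct load_info {
    uint64_t device_tree_addr;
    uint64_t kernel_phys_offset;
};

/* Through purgatory: the master cpu's entry registers are replaced by
 * the arguments the kernel expects. This simplified model does not
 * consult the entry registers at all. */
static struct kernel_args via_purgatory(struct purgatory_entry in,
                                        const struct load_info *info)
{
    struct kernel_args out;

    (void)in;
    assert(info != 0);
    out.device_tree = info->device_tree_addr;
    out.phys_offset = info->kernel_phys_offset;
    out.zero = 0;
    return out;
}

/* Direct jump (image->start pointing into the kernel): the kernel
 * sees the entry registers unchanged. The third argument is really
 * unspecified; it is set to 0 here only so the model is deterministic. */
static struct kernel_args direct_jump(struct purgatory_entry in)
{
    struct kernel_args out;

    out.device_tree = in.r3;   /* a cpu number, not a device tree */
    out.phys_offset = in.r4;   /* a chunk address, not a phys-offset */
    out.zero = 0;
    return out;
}
```

In this model, direct_jump() hands the kernel a cpu number where it
expects a device-tree pointer, which is the mismatch that purgatory
exists to fix.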

Anyway, I'm glad someone's looking at kexec on cell; I haven't had the
time to look closely at it. You said that your kexec was hanging in
the second region. How were you debugging it? Can you give us any more
info?

Another problem with the existing kexec code is that it doesn't cope
properly with 64k pages on non-hypervisor machines, like the cell
blade. We need to fix the FIXME in native_hpte_clear()
(arch/powerpc/mm/hash_native_64.c).

cheers

