- Subject: Re: kernel crash when using libnuma
- From: Andi Kleen <andi@xxxxxxxxxxxxxx>
- Date: Mon, 6 Feb 2012 23:23:18 +0100
- Cc: linux-numa@xxxxxxxxxxxxxxx, aarcange@xxxxxxxxxx
- In-reply-to: <CABZZ3EtgJvMbzdfaCCTqeC+LNLH=wHDvEmQX+8qV16tzdvepyw@mail.gmail.com>
- References: <CABZZ3EtgJvMbzdfaCCTqeC+LNLH=wHDvEmQX+8qV16tzdvepyw@mail.gmail.com>
- User-agent: Mutt/1.4.2.2i
On Mon, Feb 06, 2012 at 03:57:52AM -0500, Trevor Kramer wrote:
> I have a program which can use libnuma to allocate memory using
> numa_alloc_onnode() or using malloc. When running in malloc mode
> everything works fine but when running under libnuma mode I get
> consistent kernel panics with the following traces. This only occurs
> when multiple threads are running. Has anyone seen this before, or does
> anyone have any recommendations on how to debug further?
Looks like a THP (transparent huge page) problem: the trace dies in
split_huge_page.
For RHEL issues you normally need to talk to Red Hat; these lists
are more for mainline.
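To confirm whether THP is active on the affected box, something like the
following could be used. It is only a minimal sketch: the helper names are
made up for illustration, and the sysfs paths are assumptions
(/sys/kernel/mm/transparent_hugepage/enabled on mainline,
/sys/kernel/mm/redhat_transparent_hugepage/enabled on RHEL 6 kernels). The
file holds one line such as "[always] madvise never", with the active mode
in brackets.

```c
#include <stdio.h>
#include <string.h>

/* Copy the bracketed (active) token from a line such as
 * "[always] madvise never" into out.
 * Returns 0 on success, -1 if no bracketed token fits in out. */
static int thp_active_mode(const char *line, char *out, size_t outlen)
{
    const char *open = strchr(line, '[');
    const char *close = open ? strchr(open, ']') : NULL;
    size_t len;

    if (!open || !close || outlen == 0)
        return -1;
    len = (size_t)(close - open - 1);
    if (len >= outlen)
        return -1;
    memcpy(out, open + 1, len);
    out[len] = '\0';
    return 0;
}

/* Read the first line of a THP "enabled" file (path is an assumption,
 * see above) and extract the active mode from it. */
static int thp_mode_from_file(const char *path, char *out, size_t outlen)
{
    char line[128];
    FILE *f = fopen(path, "r");

    if (!f)
        return -1;
    if (!fgets(line, sizeof line, f)) {
        fclose(f);
        return -1;
    }
    fclose(f);
    return thp_active_mode(line, out, outlen);
}
```

Calling thp_mode_from_file() with one of the paths above and a small buffer
yields "always", "madvise" or "never". If the crash goes away with THP set
to "never", that would support the THP theory, though it does not fix the
underlying bug.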
-Andi
>
> crash> bt
> PID: 62333 TASK: ffff883ff5698b40 CPU: 17 COMMAND: "test"
> #0 [ffff883ff58378f0] machine_kexec at ffffffff810310cb
> #1 [ffff883ff5837950] crash_kexec at ffffffff810b6392
> #2 [ffff883ff5837a20] oops_end at ffffffff814de670
> #3 [ffff883ff5837a50] die at ffffffff8100f2eb
> #4 [ffff883ff5837a80] do_trap at ffffffff814ddf64
> #5 [ffff883ff5837ae0] do_invalid_op at ffffffff8100ceb5
> #6 [ffff883ff5837b80] invalid_op at ffffffff8100bf5b
> [exception RIP: split_huge_page+2021]
> RIP: ffffffff8116c605 RSP: ffff883ff5837c38 RFLAGS: 00010297
> RAX: 0000000000000001 RBX: ffff880ff704bc38 RCX: 000000000000fe9e
> RDX: 0000000000000000 RSI: 0000000000000046 RDI: 0000000000000246
> RBP: ffff883ff5837d08 R8: 0000000000000000 R9: 0000000000000004
> R10: 0000000000000001 R11: ffff880ff6fb7906 R12: ffff880ff84b7aa8
> R13: fffffffffffffff2 R14: ffffea006c34c000 R15: ffffea006c34c000
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #7 [ffff883ff5837c30] split_huge_page at ffffffff8116c5aa
> #8 [ffff883ff5837d10] __split_huge_page_pmd at ffffffff8116c6d1
> #9 [ffff883ff5837d40] unmap_vmas at ffffffff8113559e
> #10 [ffff883ff5837e80] unmap_region at ffffffff8113cce1
> #11 [ffff883ff5837ef0] do_munmap at ffffffff8113d3a6
> #12 [ffff883ff5837f50] sys_munmap at ffffffff8113d4e6
> #13 [ffff883ff5837f80] system_call_fastpath at ffffffff8100b172
> RIP: 00007f12d33154d2 RSP: 00007f12884731f8 RFLAGS: 00010283
> RAX: 000000000000000b RBX: ffffffff8100b172 RCX: 0000000000000020
> RDX: 0000000000000000 RSI: 00000000003fe560 RDI: 00007f129f460000
> RBP: 00000000003fe560 R8: 00007f1288475300 R9: 00007f1288475300
> R10: 0000003d9c0eb3b0 R11: 0000000000000246 R12: 0000003d9c0f1fc0
> R13: 0000003d9c0f0e00 R14: 00007f129f460000 R15: 00007f129f460000
> ORIG_RAX: 000000000000000b CS: 0033 SS: 002b
>
> The machine is running Red Hat Enterprise Linux Server 6 with kernel
> 2.6.32-220.4.1.el6.x86_64.
>
> Thanks,
>
> Trevor
> --
> To unsubscribe from this list: send the line "unsubscribe linux-numa" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
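A side note on the user-mode registers in the trace, assuming they follow
the usual x86_64 syscall convention (RAX = 0xb is munmap, RDI = address,
RSI = length): the call would be munmap(0x7f129f460000, 0x3fe560). Neither
end of that range sits on a 2 MB boundary, so if the region is backed by
huge pages the kernel has to split them, which matches the
__split_huge_page_pmd frame. The sketch below only redoes that arithmetic;
the helper names and the 4 KB / 2 MB sizes are assumptions for x86_64, and
it says nothing about why the split itself hits a BUG.

```c
#include <stdint.h>

#define SMALL_PAGE_SIZE 0x1000ULL    /* 4 KB base page (assumed) */
#define HUGE_PAGE_SIZE  0x200000ULL  /* 2 MB huge page (assumed) */

/* Round v up to the next multiple of align (align is a power of two). */
static uint64_t round_up_pow2(uint64_t v, uint64_t align)
{
    return (v + align - 1) & ~(align - 1);
}

/* Offset of addr inside its 2 MB huge page; 0 means huge-page aligned. */
static uint64_t hpage_offset(uint64_t addr)
{
    return addr & (HUGE_PAGE_SIZE - 1);
}

/* End of the unmapped range, with the length rounded up to whole
 * base pages the way munmap treats it. */
static uint64_t munmap_end(uint64_t addr, uint64_t len)
{
    return addr + round_up_pow2(len, SMALL_PAGE_SIZE);
}
```

With the values from the trace, the start address has an offset of 0x60000
into its huge page and the end address (0x7f129f85f000) an offset of
0x5f000, so both edges of the range cut through a 2 MB page.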
--
ak@xxxxxxxxxxxxxxx -- Speaking for myself only.