Re: FC4 crashes repeatedly on Supermicro AS1020A-T dual-core Opterons, SMP

> Michal Szymanski wrote:
>
> >All systems crash (either hang with some "machine check exception"
> >kernel messages or reset) when loaded with repeating runs of 1.3gb, CPU
> >intensive with some I/O. I run 2 or 4 jobs simultaneously and they had
> >never survived more than a few hours.

Let's try the easy stuff first -- if it's crashing with a machine check
exception, then let's disable machine check exceptions, and see if things
still break.

Try booting with the kernel parameter "nomce".  Be aware that MCE is the
mechanism the processor uses to inform the kernel of hardware errors such as
thermal issues or component failure.  You'll only want to disable it if you
aren't having thermal problems.

Of course, if you are having thermal problems, it's probably a good idea to
resolve those before cranking up the other three quarters of your system.  ; )
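
After rebooting, it may be worth confirming that "nomce" really made it onto
the kernel command line.  A rough sketch in Python, assuming the running
kernel exposes its boot parameters in /proc/cmdline; the helper names
read_cmdline and has_boot_param are made up for illustration:

```python
def read_cmdline(path="/proc/cmdline"):
    """Return the command line the running kernel was booted with."""
    with open(path) as f:
        return f.read().strip()


def has_boot_param(cmdline, param):
    """True if param appears as a whole, space-separated word in cmdline.

    Splitting on whitespace avoids false matches on longer words that
    merely contain the parameter name.
    """
    return param in cmdline.split()
```

On the affected box, has_boot_param(read_cmdline(), "nomce") should come back
True once the parameter has been added to the boot loader entry.  This only
shows that the parameter was passed, not that the crashes are gone.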

Hope that helps!

-Phil/CERisE

P.S.  I came a little late to this party -- I didn't see the original message.
Did you include the text of the kernel crash?
