
Re: FC4 crashes repeatedly on Supermicro AS1020A-T dual-core Opterons, SMP



On Fri, May 05, 2006 at 10:18:36AM -0500, Robert M. Hyatt wrote:
> 
> One note.  I am running on a quad 875 system, but am using Suse rather 
> than FC4.  It is running perfectly reliable (this is a 4 cpu, dual-core, 
> 2.2ghz box, 8 processors total).  I had problems with FC4 myself, 
> although it runs perfectly on my normal dual xeon boxes...
> 
> On Fri, 5 May 2006, Bill Davidsen wrote:
> 
> >Michal Szymanski wrote:
> >
> >>Hi all,
> >>
> >>I have recently purchased three Supermicro AS1020A-T servers equipped
> >>with two dual-core Opterons 280 each. H8DAR-T motherboards, 8 or 12 GB
> >>RAM. The systems carry FC4 x86_64 with proprietary driver (made by
> >>Adaptec) for the onboard Marvell 88SX6041 SATA Controller. Original
> >>(install) kernel 2.6.11-1.1369_FC4smp - unfortunately not upgradable due
> >>to the lack of the SATA driver for other kernel versions.
> >>
> >>All systems crash (either hang with some "machine check exception"
> >>kernel messages or reset) when loaded with repeating runs of 1.3gb, CPU
> >>intensive with some I/O. I run 2 or 4 jobs simultaneously and they had
> >>never survived more than a few hours.
> >> ...
> >>2. I ran non-SMP 2.6.11 kernel (with Adaptec driver) on another machine.
> >>There have been two test repeating 1.3g jobs running on it (each
> >>getting 50% of the single CPU used by the system) for over 50 hours
> >>now, no crashes.
> >>Also, a single test job running on SMP kernel gave no crashes in 24 hours.
> >>
> >What happens if you use only one CPU? Either with a uni kernel (you should 
> >have gotten one) or "maxcpus=1" in the boot commands. You are running a 
> >custom kernel with custom drivers, so you really should be asking the 
> >supplier, all we can do is suggest things which might provide extra 
> >information.

Hi all,

I got 3 copies of Robert's message but none of Bill's :-)

Still, I don't quite understand Bill's question ("What happens if you
use only one CPU?"). The answer is quoted just above the question!
There were no crashes with the system running on a non-SMP kernel.

In the meantime I got Kingston 1GB modules from my dealer for testing.
Strange as it may seem, I could not crash the machine with the Kingston
memory, even with tests running as long as 72 hours. So it looks like a
memory issue, although I do not understand why the same memory crashes
the machine under SMP and not under non-SMP, with similar load. Also,
the Patriot 2GB memory modules (which seem to crash the machines) are on
Supermicro's list of memory recommended for the H8DAR-T mobo.

One of the machines went back to the dealer (actually to their memory
supplier) for tests. The memory guys do not seem to trust our reports of
the crashes. We'll see what happens. I am afraid, however, that they will
say "the memory is OK".

regards, Michal.

-- 
  Michal Szymanski (msz at astrouw dot edu dot pl)
  Warsaw University Observatory, Warszawa, POLAND
