Re: FC4 crashes repeatedly on Supermicro AS1020A-T dual-core Opterons, SMP
On Fri, May 05, 2006 at 10:18:36AM -0500, Robert M. Hyatt wrote:
>
> One note. I am running on a quad 875 system, but am using Suse rather
> than FC4. It is running perfectly reliably (this is a 4-CPU, dual-core,
> 2.2 GHz box, 8 processors total). I had problems with FC4 myself,
> although it runs perfectly on my normal dual Xeon boxes...
>
> On Fri, 5 May 2006, Bill Davidsen wrote:
>
> >Michal Szymanski wrote:
> >
> >>Hi all,
> >>
> >>I have recently purchased three Supermicro AS1020A-T servers equipped
> >>with two dual-core Opteron 280s each. H8DAR-T motherboards, 8 or 12 GB
> >>RAM. The systems run FC4 x86_64 with a proprietary driver (made by
> >>Adaptec) for the onboard Marvell 88SX6041 SATA controller. Original
> >>(install) kernel 2.6.11-1.1369_FC4smp, unfortunately not upgradable due
> >>to the lack of the SATA driver for other kernel versions.
> >>
> >>All systems crash (either hang with some "machine check exception"
> >>kernel messages or reset) when loaded with repeated runs of 1.3 GB,
> >>CPU-intensive jobs with some I/O. I run 2 or 4 jobs simultaneously and
> >>they have never survived more than a few hours.
> >> ...
> >>2. I ran a non-SMP 2.6.11 kernel (with the Adaptec driver) on another
> >>machine. Two repeating 1.3 GB test jobs have been running on it (each
> >>getting 50% of the single CPU used by the system) for over 50 hours
> >>now, no crashes. Also, a single test job running on the SMP kernel
> >>gave no crashes in 24 hours.
> >>
> >What happens if you use only one CPU? Either with a uni kernel (you should
> >have gotten one) or "maxcpus=1" in the boot commands. You are running a
> >custom kernel with custom drivers, so you really should be asking the
> >supplier; all we can do is suggest things which might provide extra
> >information.
Hi all,

I got 3 copies of Robert's message but none of Bill's :-)

Still, I don't quite understand Bill's question ("What happens if you
use only one CPU?"). The answer is quoted just above this question!
There were no crashes with the system running on a non-SMP kernel.

In the meantime I got Kingston 1 GB modules from my dealer, for testing.
Strange as it seems, I could not crash the machine with the Kingston
memory, even with tests running as long as 72 hours. It seems, then,
that it is a memory issue, although I do not understand why the same
memory crashes the machine in SMP and does not in non-SMP, under similar
load. Also, the Patriot 2 GB memory modules (which seem to crash the
machines) are on Supermicro's list of memory recommended for the
H8DAR-T mobo.

One of the machines went back to the dealer (actually to their memory
supplier) for tests. The memory guys seem not to trust our crash
reports. We'll see what happens. I am afraid, however, that they will
say "the memory is OK".
regards, Michal.
--
Michal Szymanski (msz at astrouw dot edu dot pl)
Warsaw University Observatory, Warszawa, POLAND