
SUMMARY: Split-tail clustering with IBM xSeries servers?



Hello,

Sorry about the lateness of the summary; some things came up. Don't they
always?

First: It turns out that it is *not* possible to use the ServeRAID 5i
controller of the x345 model for split-tail clustering, since that
controller simply is not cluster-capable. We would need an appropriate
controller from the ServeRAID 4 series.

Hence, we are currently exploring some other alternatives; most likely
we'll build a cheap box with software-RAID on huge IDE disks, and
(r?)sync our main array over periodically.
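
A minimal sketch of what such a periodic sync could look like, assuming
plain rsync is used. The function names, the paths, and the host name
"backupbox" are hypothetical; only the rsync options -a and --delete are
real.

```python
import subprocess

def build_rsync_command(source, destination, delete=True):
    """Return the rsync argument list for mirroring source to destination."""
    cmd = ["rsync", "-a"]          # -a: archive mode (recursive, keeps perms/times)
    if delete:
        cmd.append("--delete")     # drop files on the mirror that are gone at the source
    # A trailing slash on the source means "copy the contents of this directory".
    cmd += [source.rstrip("/") + "/", destination]
    return cmd

def sync(source, destination):
    """Run one sync pass; raises CalledProcessError if rsync fails."""
    subprocess.run(build_rsync_command(source, destination), check=True)
```

Calling sync("/export/data", "backupbox:/export/data") from a cron job
would cover the "periodically" part; both arguments are placeholders.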

I received replies from Joseph Bueno and Jochen Schmidt -- my thanks to 
both of you!

Regarding clustering:

"I've done this in our test center but not with IBM-Servers. You can use
the Failover functionality of RedHat Advanced Server or Mission Critical
Linux. With both systems you must write scripts to start/stop the needed
Services on the machines."
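
The start/stop scripts mentioned above might be structured roughly as
follows. This is only a sketch: the init script paths are assumptions,
and neither RedHat Advanced Server nor Mission Critical Linux is known
to require this exact shape. The point it illustrates is ordering:
start in a fixed order, stop in the reverse order.

```python
import subprocess

# Hypothetical init scripts, listed in start order.
SERVICES = ["/etc/init.d/nfs", "/etc/init.d/smb"]

def run_init_script(script, action):
    """Invoke an init script with 'start' or 'stop'; True on exit status 0."""
    return subprocess.run([script, action]).returncode == 0

def start_services(services=SERVICES, run=run_init_script):
    """Start services in order; give up at the first failure."""
    for script in services:
        if not run(script, "start"):
            return False
    return True

def stop_services(services=SERVICES, run=run_init_script):
    """Stop services in reverse start order; keep going even if one fails."""
    ok = True
    for script in reversed(services):
        ok = run(script, "stop") and ok
    return ok
```

The run parameter exists so the ordering can be exercised without
touching real services.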

Regarding hard reset/power cycling:

"I suggest you use the peppercon Rol/F for this. See
http://www.peppercon.com/products/rolf.html for details. This is a
network card with an onboard reset controller. You can send it a
special (password-protected) UDP packet and the card does a real
hard reset of the system.

You can use a crossover cable to connect both systems and also use this
interface as heartbeat.

The second way is a LAN power switch from APCC. With this device you can
control 5 devices over the LAN."

And:

"[...] we have 10 colocated servers with dual power supplies and we use
4 APC MasterSwitch 9212. Each provide 8 power outlets and you control
them remotely with telnet, HTTP or SNMP (we use telnet through an SSH
tunnel). They also have a serial port."
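
To show where such a power switch fits in, here is a rough sketch of the
passive node's decision logic: it takes over only after the peer has
been fenced. This is not how Heartbeat is implemented. The class and
the fence_peer and take_over callbacks are hypothetical; fence_peer
stands in for whatever talks to the power switch or reset card.

```python
import time

class FailoverMonitor:
    """Passive-node logic: take over only after the peer has been fenced."""

    def __init__(self, deadtime, fence_peer, take_over, clock=time.monotonic):
        self.deadtime = deadtime        # seconds of silence before the peer is declared dead
        self.fence_peer = fence_peer    # callable: power-cycle the peer, True on success
        self.take_over = take_over      # callable: mount the array, start services
        self.clock = clock
        self.last_beat = clock()
        self.active = False

    def heartbeat_received(self):
        """Record that the peer is still alive."""
        self.last_beat = self.clock()

    def check(self):
        """Call periodically; returns True once this node has taken over."""
        if self.active:
            return True
        if self.clock() - self.last_beat < self.deadtime:
            return False                # peer still considered alive
        # Peer is silent: fence first, and only mount if fencing succeeded.
        if self.fence_peer():
            self.take_over()
            self.active = True
        return self.active
```

If fencing fails, the node stays passive and retries on the next check,
which is what keeps both nodes from mounting the array read/write at
the same time.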


/Martin Eskildsen
IT Administrator, Tpack A/S

> Hello,
> 
> We have an IBM x345 xSeries server here and an EXP300 disk array
> (running RAID5) that runs our central file server functions: Samba, CVS
> repository, and NFS for other Linux servers.
> 
> That machine has grown in importance to us, and even though the hardware
> has a lot of redundancy internally (dual this-and-that; can disable
> failed CPU or RAM slot and reboot automatically) thus providing low
> down-time, some parts are still single-points-of-failure, in particular
> the motherboard and the RAID controller. If they die, we might be down
> for a day or two until spare parts arrive. I don't trust service
> contracts; IMHO they are usually a statement of intent, not a guarantee.
> 
> Thus, I'm thinking about setting up active-passive redundancy using a
> second x345, the Heartbeat module from Linux-HA, and then share the
> EXP300 disk array in a so-called split-tail configuration: One system
> mounts read-write; the other system doesn't until the first system dies.
> 
> I'm considering such an expensive, complex measure on an internal server
> simply because we would be in some pain if that server is down for a day
> due to service. We, of course, have backups and all that, but it still
> takes time bringing a replacement machine online if you need to restore
> and configure it. We'd rather avoid that pain, if we reliably could.
> 
> Thus, I have two questions:
> 
> 1. Has anybody tried setting up this kind of clustering using similar
> hardware (IBM xSeries server and an external IBM EXP SCSI disk array)?
> If yes, any experiences and hints you'd like to offer?
> 
> 2. I would end up with two servers, both of which have dual power
> supplies. In my setup, I run each "side" of the box on separate UPSes,
> but those devices are "dumb" in the sense that they can't be controlled
> directly (I can't programmatically turn on/off one outlet on the UPS
> device). That's a problem, since I want to run STONITH (*) as part of my
> Heartbeat setup, to avoid "split-brain" problems (both nodes mounting
> the array read/write simultaneously). Any experience with power cycling
> devices that can be controlled remotely, preferably via USB or RS232?
> 
> Thank you for your time,
> 
> 
> /Martin Eskildsen
> IT Administrator, Tpack A/S
> 
> (*) STONITH: "Shoot The Other Node In The Head". Yet another effect of
> Open Source and volunteer efforts: Weird acronyms -- that name would
> never have gotten through, say, the Microsoft Marketing Dept.   :-)
_______________________________________________
LinuxManagers mailing list - http://www.linuxmanagers.org
submissions: LinuxManagers@linuxmanagers.org
subscribe/unsubscribe: http://www.linuxmanagers.org/mailman/listinfo/linuxmanagers
