Hello,

Sorry about the lateness of the summary; some things came up. Don't they always?

First: It turns out that it is *not* possible to use the ServeRAID 5i controller of the x345 model for split-tail clustering, since that controller simply is not cluster-capable. We would need an appropriate controller from the ServeRAID 4 series. Hence, we are currently exploring some other alternatives; most likely we'll build a cheap box with software RAID on huge IDE disks, and (r?)sync our main array over periodically.

I received replies from Joseph Bueno and Jochen Schmidt -- my thanks to both of you!

Regarding clustering:

"I've done this in our test center but not with IBM-Servers. You can use the Failover functionality of RedHat Advanced Server or Mission Critical Linux. With both systems you must write scripts to start/stop the needed Services on the machines."

Regarding hard reset/power cycling:

"I'm suggest you to use the peppercon Rol/F for this. See http://www.peppercon.com/products/rolf.html for details. This is a networkcard with an onboard Reset Controller. You can send them an special (password-protected) udp-packet and the card does a really hard-reset of the system. You can use a crosscable to connect both systems and also use this interface as heartbeat. The second way is a lan-powerswitch from APCC. With this device you can control 5 devices over lan."

And:

"[...] we have 10 colocated servers with dual power supplies and we use 4 APC MasterSwitch 9212. Each provide 8 power outlets and you control them remotely with telnet, HTTP or SNMP (we use telnet through an SSH tunnel). They also have a serial port."

/Martin Eskildsen
IT Administrator, Tpack A/S

> Hello,
>
> We have an IBM x345 xSeries server here and an EXP300 disk array
> (running RAID5) that runs our central file server functions: Samba, CVS
> repository, and NFS for other Linux servers.
>
> That machine has grown in importance to us, and even though the hardware
> has a lot of redundancy internally (dual this-and-that; can disable
> failed CPU or RAM slot and reboot automatically) thus providing low
> down-time, some parts are still single-points-of-failure, in particular
> the motherboard and the RAID controller. If they die, we might be down
> for a day or two until spare parts arrive. I don't trust service
> contracts; IMHO they are usually a statement of intent, not a guarantee.
>
> Thus, I'm thinking about setting up active-passive redundancy using a
> second x345, the Heartbeat module from Linux-HA, and then share the
> EXP300 disk array in a so-called split-tail configuration: One system
> mounts read-write; the other system doesn't until the first system dies.
>
> I'm considering such an expensive, complex measure on an internal server
> simply because we would be in some pain if that server is down for a day
> due to service. We, of course, have backups and all that, but it still
> takes time bringing a replacement machine online if you need to restore
> and configure it. We'd rather avoid that pain, if we reliably could.
>
> Thus, I have two questions:
>
> 1. Has anybody tried setting up this kind of clustering using similar
> hardware (IBM xSeries server and an external IBM EXP SCSI disk array)?
> If yes, any experiences and hints you'd like to offer?
>
> 2. I would end up with two servers, both of which having dual power
> supplies. In my setup, I run each "side" of the box on separate UPSes,
> but those devices are "dumb" in the sense that they can't be controlled
> directly (I can't programmatically turn on/off one outlet on the UPS
> device). That's a problem, since I want to run STONITH (*) as part of my
> Heartbeat setup, to avoid "split-brain" problems (both nodes mounting
> the array read/write simultaneously). Any experience with power cycling
> devices that can be controlled remotely, preferably via USB or RS232?
>
> Thank you for your time,
>
>
> /Martin Eskildsen
> IT Administrator, Tpack A/S
>
> (*) STONITH: "Shoot The Other Node In The Head". Yet another effect of
> Open Source and volunteer efforts: Weird acronyms -- that name would
> never have gotten through, say, the Microsoft Marketing Dept. :-)

_______________________________________________
LinuxManagers mailing list - http://www.linuxmanagers.org
submissions: LinuxManagers@linuxmanagers.org
subscribe/unsubscribe: http://www.linuxmanagers.org/mailman/listinfo/linuxmanagers
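
For archive readers: the reason STONITH matters in the split-tail setup described in the quoted message is that a silent peer is not necessarily a dead peer. The passive node should power-cycle the other node first and mount the array read-write only if that succeeded. Below is a minimal sketch of that rule in Python. It is not Heartbeat's actual implementation or configuration; the function name, the missed-beat threshold and the fence_peer callback are all hypothetical and stand in for whatever remote power switch or reset card is used.

```python
from typing import Callable


def should_take_over(missed_beats: int, max_missed: int,
                     fence_peer: Callable[[], bool]) -> bool:
    """Decide whether the passive node may mount the shared array.

    missed_beats: consecutive heartbeats not received from the peer.
    max_missed:   threshold (an assumption) after which the peer is presumed dead.
    fence_peer:   hypothetical callback that power-cycles the peer and
                  returns True only if the power cycle was confirmed.
    """
    if missed_beats < max_missed:
        # Peer is still considered alive: stay passive, do not fence.
        return False
    # The peer looks dead, but only the heartbeat link may have failed.
    # Fence first; take over only if fencing is confirmed, so that both
    # nodes can never have the array mounted read-write at once.
    return fence_peer()


if __name__ == "__main__":
    # Peer silent for 3 beats and the power switch confirms the reset.
    print(should_take_over(3, 3, lambda: True))
    # Peer silent, but fencing failed: stay passive rather than risk split-brain.
    print(should_take_over(3, 3, lambda: False))
```

The point of the design is the ordering: a failed or unconfirmed fence leaves the array unmounted on the passive node, which costs availability but cannot corrupt the file system.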