Re: Failover after partial failure because of SAN?
On Fri, Nov 4, 2011 at 4:03 PM, Jochen Schneider <jochen.schneider@xxxxxxxxx> wrote:
> Hi,
>
> We are setting up a cluster for a storage application with SAN disks managed
> through HA-LVM and connected through multipath. There are actually two
> applications which have to run on the same node,

HAVE to run on the same node? Why? Can't they communicate via TCP/IP?

> but only one of them needs
> the disk. Both of them have clients.
>
> The question I have is what should happen when the SAN fails: Should both
> applications failover to another machine (possibly after a retry) or should
> the application which doesn't need the disk keep running while the other is
> shut down?

You're not giving yourself many options. Since you say both applications HAVE to run on the same node, I assume they are related (e.g. one needs the other). In that case, the only viable option is to fail over both of them together.

Having said that, I'm curious what you mean by "SAN fails". It's rare for one cluster node to suddenly be unable to access the SAN while the other nodes can access it just fine. Usually either the SAN is completely inaccessible (e.g. a broken SAN or broken switches) or a server node fails.

> I'm not sure how much recovery can come out of a failover in case
> of a SAN failure, if it's not both network cards of the node which are
> defective or whatever.

Exactly :) If no node can access the SAN, then there is nowhere to fail over to.

--
Fajar

--
Linux-cluster mailing list
Linux-cluster@xxxxxxxxxx
https://www.redhat.com/mailman/listinfo/linux-cluster
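To make the reasoning in this reply concrete, here is a minimal Python sketch of the decision, assuming the two applications are treated as one unit that always moves together. The function name `plan_failover`, the node names, and the `san_ok` reachability map are hypothetical and only for illustration; this is not how rgmanager or HA-LVM implement failover.

```python
from typing import Dict, Optional, Tuple


def plan_failover(current: str, san_ok: Dict[str, bool]) -> Tuple[str, Optional[str]]:
    """Decide what to do with the colocated application pair.

    current: node the pair is running on now (hypothetical name).
    san_ok:  maps each cluster node to whether it can reach the SAN
             (an assumed input; how it is measured is out of scope).
    """
    # The current node still sees the SAN: nothing to do.
    if san_ok.get(current, False):
        return ("stay", current)

    # Otherwise move BOTH applications to the first node that can reach the SAN.
    for node in sorted(san_ok):
        if node != current and san_ok[node]:
            return ("failover", node)

    # No node can reach the SAN: failing over would not help.
    return ("no_target", None)


if __name__ == "__main__":
    print(plan_failover("node1", {"node1": False, "node2": True}))
    print(plan_failover("node1", {"node1": False, "node2": False}))
```

The last branch is the case described above: when the SAN itself or the switches are down, every node reports the SAN as unreachable and there is no useful target.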