Re: Failover on public network failure?

From: Peter Othen (petero_at_c-and-o.co.uk)
Date: 06/21/04


Date: Mon, 21 Jun 2004 18:11:22 +0100

Hi Alex,

The cluster service checks the state of the network interfaces it is using
on a node in several ways. It can be notified via Plug and Play that the
interface is unavailable (e.g. when the cable is pulled out), and it also
pings the other nodes in the cluster and the default gateway configured on
the node. If the network interface is determined to have failed, any cluster
IP address resources on that cluster network (i.e. IP address resources with
the same IP subnet ID as the failed network interface) that are currently
online on that node will also go into a failed state. Any dependent
resources will therefore go offline and, after three attempts to restart the
IP address on the current node, the group(s) will fail over to another node.

So the cluster can detect local NIC/cable issues and cause a failover to a
different node. However, it won't cause a failover just because the router R1
fails. If the node can still get a ping reply from another node over that
interface, the interface is not considered to have failed, so no resources
will fail on the cluster node. Multi-homing isn't an option for you, as
the cluster will only use one IP subnet on a network, so you need multiple
NICs if you want to have multiple public networks with different IP subnet
IDs. You definitely shouldn't use a single link for the public and private
traffic.
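
To make the point about R1 concrete, here is a toy model of the decision as
I've described it. It is only a sketch of the reasoning, not the cluster
service's actual algorithm: the class and field names are made up for
illustration, and treating a gateway reply on its own as "healthy" is my
assumption.

```python
from dataclasses import dataclass


@dataclass
class InterfaceChecks:
    """Hypothetical snapshot of the checks described above (illustrative names)."""
    media_connected: bool    # Plug and Play reports the link as up
    peer_node_replies: bool  # another cluster node answered a ping on this interface
    gateway_replies: bool    # the default gateway answered a ping


def interface_failed(checks: InterfaceChecks) -> bool:
    """Return True if this toy model would mark the interface as failed."""
    # A pulled cable is reported straight away, regardless of ping results.
    if not checks.media_connected:
        return True
    # Assumption: any ping reply is enough to keep the interface "up".
    return not (checks.peer_node_replies or checks.gateway_replies)


# Router down, but the other node still answers: no failure, so no failover.
router_down = InterfaceChecks(media_connected=True,
                              peer_node_replies=True,
                              gateway_replies=False)
# Cable pulled on the local NIC: the interface is failed.
cable_pulled = InterfaceChecks(media_connected=False,
                               peer_node_replies=False,
                               gateway_replies=False)
```

In this model `interface_failed(router_down)` is False, which is the case
you are worried about, while `interface_failed(cable_pulled)` is True.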

One option you could consider is to create a generic application (or perhaps
a generic script for w2k3) which tests the connectivity to the router and
fails if the router can't be contacted, thus causing a group failover. It
might not be appropriate, but it's worth a thought.
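
A rough sketch of what such a check could look like, written in Python purely
for illustration (any language that can run as a long-lived process would
do). The router address, the thresholds and the function names are all
placeholders of mine. It assumes the process is run as a generic application
resource, so that the process exiting is what marks the resource as failed;
whether the group then moves depends on its restart and failover settings.

```python
import subprocess
import time
from typing import Callable

ROUTER_ADDRESS = "192.0.2.1"  # hypothetical placeholder; use the router's real address
MAX_FAILURES = 3              # consecutive misses tolerated before giving up (assumed value)
INTERVAL_SECONDS = 5.0        # pause between probes (assumed value)


def router_reachable(address: str = ROUTER_ADDRESS) -> bool:
    """Send one ICMP echo using the Windows ping command (-n count, -w timeout in ms)."""
    result = subprocess.run(
        ["ping", "-n", "1", "-w", "1000", address],
        capture_output=True,
        text=True,
    )
    # Windows ping can exit with 0 on "Destination host unreachable",
    # so also require a TTL= field, which only appears in a real reply.
    return result.returncode == 0 and "TTL=" in result.stdout


def monitor(probe: Callable[[], bool],
            max_failures: int = MAX_FAILURES,
            interval: float = INTERVAL_SECONDS,
            sleep: Callable[[float], None] = time.sleep) -> int:
    """Block until probe fails max_failures times in a row; return the probe count."""
    failures = 0
    probes = 0
    while failures < max_failures:
        ok = probe()
        probes += 1
        # A single good reply resets the count, so one dropped ping is harmless.
        failures = 0 if ok else failures + 1
        if failures < max_failures:
            sleep(interval)
    return probes


# Entry point when run as the resource's process (left commented out so the
# sketch can be imported without starting the loop; needs "import sys"):
# monitor(router_reachable)
# sys.exit(1)
```

Requiring several consecutive misses is deliberate: you don't want a single
lost ping to bounce the whole group to the other node.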

Best wishes,

Peter Othen
MCSE, MCDBA, MCT

"Arc" <arc__@microman.tv> wrote in message
news:eUuk%23neVEHA.2988@TK2MSFTNGP10.phx.gbl...
> The plan (and a chance to display my l33t ASCII art skills, apologies to
> those not using a fixed font);
>
> C C C
> \ | /
> S S S
> \ | /
> R1--R2
> | |
> N1---N2
> \ /
> SAN
>
> C = Client
> S = Switch
> R = Router/Switch
> Nx = Cluster Node
>
> Having read Microsoft's various articles it would seem that if a node loses
> public network connectivity it should initiate a failover of any resource
> groups that have a dependency on that interface.
>
> So, in the event that N1 loses access to R1 due to the router failing, the
> cable going faulty(?) or a local NIC problem, dependent resources should
> failover to N2 which would still have an active public interface. Now, while
> this is the impression I have from the articles I've read, I still have an
> uncomfortable feeling because I haven't seen it plainly stated. Can anyone
> confirm this is the case?
>
> Ideally, what I had wanted to do was this;
>
> R1--R2
> | X |
> N1---N2
>
> By using two NICs on the public side in a fault tolerant team.
> Unfortunately I only have two NICs available and no room to add more. I was
> thinking of using the one link to combine public and private traffic but
> that seems a no-no, particularly with teaming. The other alternative was
> multi-homing, but I've got too little knowledge at the moment on that
> option.
>
> If the failover works as I described above then all should be fine, if
> not...
>
> Thanks in advance.
>
> Alex
>
>


