Re: Cluster could not fail over
- From: "Ryan Sokolowski [MVP - Avanade]" <ryan@xxxxxxxxxxxxxxxxxxxxxx>
- Date: Mon, 1 Aug 2005 15:01:51 -0700
Wouldn't this be normal? The failure of, or lack of signal over, the
heartbeat network is what initiates a node resource failover. The nodes
should fail over resources when heartbeats are missed for a particular
period of time...
"The network connections must be able to provide a maximum guaranteed round
trip latency between nodes of no more than 500 milliseconds. The cluster
uses heartbeat to detect whether a node is alive or not responding. These
heartbeats are sent out on a periodic basis. If a node takes too long to
respond to heartbeat packets, the cluster service starts a heavy-weight
protocol to figure out which nodes are really still alive and which ones are
dead; this is known as a cluster re-group. The heartbeat interval is not a
configurable parameter for the cluster service (there are many reasons for
this, but the bottom line is that changing this parameter can have a
significant impact on the stability of the cluster and the failover time).
500ms round-trip is significantly below any threshold to ensure that
artificial re-group operations are not triggered."
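
As a rough illustration of the detection logic the quoted text describes,
here is a minimal sketch. It is not the cluster service's actual
implementation: HeartbeatMonitor is a hypothetical name, and the interval and
miss limit are made-up values (the real interval is not configurable). It only
shows the idea of flagging a node as suspect once its heartbeats have been
missing for too long, which is the point at which a re-group would start.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Tracks the last heartbeat seen from each node (illustrative only)."""

    interval: float = 1.0   # hypothetical seconds between heartbeats
    missed_limit: int = 3   # hypothetical number of misses tolerated
    last_seen: dict = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        """Note that a heartbeat arrived from `node` at time `now`."""
        self.last_seen[node] = now

    def suspects(self, now: float) -> list:
        """Return nodes silent for longer than interval * missed_limit."""
        deadline = self.interval * self.missed_limit
        return sorted(n for n, t in self.last_seen.items() if now - t > deadline)


monitor = HeartbeatMonitor()
monitor.record("node1", now=0.0)
monitor.record("node2", now=0.0)
monitor.record("node1", now=3.0)           # node1 keeps answering
suspect_nodes = monitor.suspects(now=4.0)  # node2 has been silent for 4.0 s
print(suspect_nodes)
```

In this sketch only node2 is reported, because it is the only node whose
silence exceeds the assumed 3-second deadline.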
--
Ryan Sokolowski
MVP - Windows Server - Clustering
MCSE, CCNA, CCDA, BCFP
Avanade
http://www.Avanade.com
"A troubleshooter's best tool is the Event Viewer and understanding the
events and messages contained therein."
This posting is provided "AS IS" with no warranties, and confers no rights.
"Philip" <Philip@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:35BE4DB5-D199-47FC-B2DB-6CD0862FE8D3@xxxxxxxxxxxxxxxx
> Hi everyone,
> Recently we set up 2 Windows 2003 Enterprise Servers with MS Clustering.
> The cluster was set up using the standard guide from Microsoft.
>
> The configuration is:
>
> 1) 3 NICs for each server: 2 onboard NICs using Network Teaming for
> network redundancy, and 1 for Heartbeat using a crossover cable.
>
> 2) 2 HBA cards for each server for SAN redundancy.
>
> After the cluster was set up, we ran failover tests by removing the 2
> Public network cables from each server, and by rebooting and shutting
> down the servers. The failover tests completed without problems. Each
> server was able to take over resources whenever the other was down.
>
> However, when we tried the test again today by removing the 2 Public
> network cables from the server that holds the resources, the other server
> was not able to take over the resources. Only removing the Private
> network cable or completely shutting down the server allows the failover
> to complete successfully.
>
> Would appreciate any advice on the above peculiar behaviour.
>
> Thanks.
>