Re: Cluster could not fail over



Wouldn't this be normal, since the failure or loss of signal over the
heartbeat network is what initiates a node resource failover? The nodes
should fail over resources when heartbeats are missed for a particular
period of time...

"The network connections must be able to provide a maximum guaranteed round
trip latency between nodes of no more than 500 milliseconds. The cluster
uses heartbeat to detect whether a node is alive or not responding. These
heartbeats are sent out on a periodic basis. If a node takes too long to
respond to heartbeat packets, the cluster service starts a heavy-weight
protocol to figure out which nodes are really still alive and which ones are
dead; this is known as a cluster re-group. The heartbeat interval is not a
configurable parameter for the cluster service (there are many reasons for
this, but the bottom line is that changing this parameter can have a
significant impact on the stability of the cluster and the failover time).
500ms round-trip is significantly below any threshold to ensure that
artificial re-group operations are not triggered."
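
As a rough illustration of the detection logic described in the quote above,
here is a minimal Python sketch. It is not the cluster service's actual
algorithm: the class name HeartbeatMonitor, the interval_s and max_missed
values, and the node names are all hypothetical, chosen only to show how
"missed heartbeats for a period of time" turns into "this node is suspect".
It only models the heartbeat path, which is the point of my question above.

```python
from dataclasses import dataclass, field


@dataclass
class HeartbeatMonitor:
    """Toy failure detector; the values are assumptions, not real cluster settings."""
    interval_s: float = 1.0   # hypothetical heartbeat period, in seconds
    max_missed: int = 2       # hypothetical number of missed beats tolerated
    last_seen: dict = field(default_factory=dict)

    def record(self, node: str, now: float) -> None:
        """Note that a heartbeat from `node` arrived at time `now`."""
        self.last_seen[node] = now

    def suspects(self, now: float) -> list:
        """Return nodes whose last heartbeat is older than the allowed window."""
        limit = self.interval_s * self.max_missed
        return sorted(n for n, t in self.last_seen.items() if now - t > limit)


monitor = HeartbeatMonitor()
monitor.record("NODE-A", 0.0)
monitor.record("NODE-B", 0.0)
monitor.record("NODE-A", 2.5)   # NODE-A keeps answering, NODE-B goes silent
print(monitor.suspects(3.0))    # NODE-B is now a candidate for a re-group
```

In this sketch a node becomes suspect only when its heartbeats stop arriving;
nothing here watches the public network.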

--
Ryan Sokolowski
MVP - Windows Server - Clustering
MCSE, CCNA, CCDA, BCFP
Avanade
http://www.Avanade.com

"A troubleshooter's best tool is the Event Viewer and understanding the
events and messages contained therein."

This posting is provided "AS IS" with no warranties, and confers no rights.

"Philip" <Philip@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:35BE4DB5-D199-47FC-B2DB-6CD0862FE8D3@xxxxxxxxxxxxxxxx
> Hi everyone,
> Recently we set up 2 Windows 2003 Enterprise Servers with MS Clustering.
> The cluster was set up using the standard guide from Microsoft.
>
> The configurations are :
>
> 1) 3 NIC cards for each server: 2 onboard NIC cards using Network Teaming
> for network redundancy, and 1 for Heartbeat using a crossover cable.
>
> 2) 2 HBA cards for each server for SAN redundancy.
>
> After the cluster was set up, we tested failover by removing the 2 Public
> network cables from each server, and by rebooting and shutting down the
> servers. The failover tests completed without problems. Each server was
> able to take over resources whenever the other was down.
>
> However, when we tried the test again today by removing the 2 Public
> Network cables from the server that holds the resources, the other server
> was not able to take over the resources. Only removing the Private network
> cable or completely shutting down the server allows the failover to
> complete successfully.
>
> Would appreciate any advice on the above peculiar behaviour.
>
> Thanks.
>

