Re: Minimize failover time




Keep in mind what has to happen when a cluster fails over or is moved to the
other node. First the cluster has to recognize the failure or the move request.
It then sends shutdown signals to SQL Server and the other resources and waits
for all of them to respond. If a resource doesn't respond, the cluster has to
wait for the request to time out before killing that resource. Once all the
resources are offline, the cluster sends start signals to the resources on the
other node and again waits for a response. The resources most likely have to
start in a particular order, so each resource has to start and respond before
the next one can be sent its start signal. All of this signaling takes time;
15 to 20 seconds is actually a pretty good response. I suspect you were testing
the failover manually, so your 15 to 20 seconds isn't based on an actual
failure, where timeouts will most likely be encountered and the response will
be much slower as a result.
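
To put rough numbers on the sequence above, here is a small Python sketch of
the arithmetic. It is only a model, not anything the cluster service exposes:
the Resource class, the estimate_failover_seconds function, and every timing
in it are made up for illustration. It assumes the offline requests are sent
together and the cluster waits for the slowest one, that a resource which
never answers costs the full timeout, and that the resources come online
strictly one after another.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    """Hypothetical description of one cluster resource (illustrative only)."""
    name: str
    stop_seconds: float    # time needed to go offline cleanly
    start_seconds: float   # time needed to come online on the other node
    responds: bool = True  # False models a hung resource that never answers


def estimate_failover_seconds(resources, detect_seconds, pending_timeout):
    """Estimate = detection + offline phase + ordered online phase."""
    # Offline phase: the cluster waits for every resource to respond, so the
    # slowest one sets the pace. A hung or too-slow resource costs the full
    # timeout before it is killed.
    offline = max(
        r.stop_seconds if r.responds and r.stop_seconds <= pending_timeout
        else pending_timeout
        for r in resources
    )
    # Online phase: resources start one after another in list order, so the
    # start times simply add up.
    online = sum(r.start_seconds for r in resources)
    return detect_seconds + offline + online


# Made-up timings for a typical SQL Server group, in start order.
group = [
    Resource("Disk", stop_seconds=1, start_seconds=2),
    Resource("IP Address", stop_seconds=1, start_seconds=1),
    Resource("Network Name", stop_seconds=1, start_seconds=2),
    Resource("SQL Server", stop_seconds=5, start_seconds=10),
]

# A clean manual move: everything answers promptly.
clean_move = estimate_failover_seconds(group, detect_seconds=1, pending_timeout=180)

# A real failure: SQL Server hangs and has to be killed after the timeout.
hung_group = group[:3] + [
    Resource("SQL Server", stop_seconds=5, start_seconds=10, responds=False)
]
real_failure = estimate_failover_seconds(hung_group, detect_seconds=1, pending_timeout=180)

print(f"clean move:   {clean_move:.0f} s")
print(f"real failure: {real_failure:.0f} s")
```

With these invented figures the clean move lands in the same range you
measured, while a single hung resource pushes the total to several minutes.
The point is that the timeouts, not the signaling, dominate in a real failure.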

This is what clusters do: they don't guarantee that you won't have a service
interruption, just that the interruption will be shorter than if you had to
respond manually. Highly reliable and highly available are not the same.





"Pasquale" <Pasquale@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:9C239FA3-ACC2-476B-AA1F-2EA208D056F9@xxxxxxxxxxxxxxxx
I have a two-node cluster (active/active).
When I test failover with the Cluster Administrator tool, I see that
it takes 15-20 seconds to recover the SQL Server resource.
Is it possible to reduce the failover time for the SQL Server resource?
How?
Thanks

