Re: Is Alive = Failing over
- From: v-xuwen@xxxxxxxxxxxxxxxxxxxx (Vincent Xu [MSFT])
- Date: Mon, 10 Jul 2006 02:32:00 GMT
Hi,
Exactly as Geoff said, the Is-Alive check is a lightweight test. From a SQL
Server perspective, the node hosting the SQL Server resource does a
LooksAlive check every 5 seconds. This is a lightweight check to see
whether the service is running, and it may succeed even if the instance of
SQL Server is not operational. The IsAlive check goes further: by default it
runs every 60 seconds and issues a "select @@servername" query against the
instance. If this query fails, the IsAlive check retries five times and then
attempts to reconnect to the instance of SQL Server. If all five retries
fail, the SQL Server resource fails. The intervals can be changed in Cluster
Administrator on the Advanced tab of the SQL Server resource properties; by
default the LooksAlive interval is 5000 milliseconds and the IsAlive
interval is 60000 milliseconds.
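The retry behaviour described above can be sketched in a few lines of Python. This is only an illustration of the logic as stated in this post, not the actual cluster resource code; `is_alive`, `make_scripted_query`, and the constant names are hypothetical, and the scripted query merely stands in for the real "select @@servername" call.

```python
from typing import Callable, Iterable

# Defaults quoted in this post (milliseconds).
LOOKS_ALIVE_INTERVAL_MS = 5000
IS_ALIVE_INTERVAL_MS = 60000
# Retry count quoted in this post.
IS_ALIVE_RETRIES = 5


def is_alive(run_query: Callable[[], bool], retries: int = IS_ALIVE_RETRIES) -> bool:
    """Return True if the query succeeds on the first attempt or on any retry.

    Assumption: the resource is reported as failed only when the first
    attempt and every retry have all failed.
    """
    if run_query():
        return True
    for _ in range(retries):
        if run_query():
            return True
    return False


def make_scripted_query(outcomes: Iterable[bool]) -> Callable[[], bool]:
    """Hypothetical stand-in for the health query: replays scripted results.

    Once the scripted outcomes are exhausted, every further call fails.
    """
    remaining = iter(outcomes)
    return lambda: next(remaining, False)


if __name__ == "__main__":
    # Two slow responses followed by a good one: the resource stays online.
    print(is_alive(make_scripted_query([False, False, True])))
    # The first attempt and all five retries fail: the resource is failed.
    print(is_alive(make_scripted_query([False] * 6)))
```

Under these assumptions a single slow response does not fail the resource; only the first query plus all five retries failing in a row does.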
You can check the following article:
<http://www.microsoft.com/technet/prodtechnol/sql/2000/maintain/failclus.mspx#XSLTsection125121120120>
Best regards,
Vincent Xu
Microsoft Online Partner Support
--------------------
From: "Geoff N. Hiten" <SQLCraftsman@xxxxxxxxx>
References: <lFvrg.199$Js2.156@xxxxxxxxxxxxxxxx>
Subject: Re: Is Alive = Failing over
Date: Fri, 7 Jul 2006 12:44:53 -0400
Message-ID: <O0HLtSeoGHA.4816@xxxxxxxxxxxxxxxxxxxx>
Newsgroups: microsoft.public.sqlserver.clustering
There is actually an algorithm for the Looks-Alive and Is-Alive failure
sequence that the cluster service goes through before marking a service as
"failed" and attempting a cluster recovery. It takes multiple failures to
complete the sequence. The sequence was devised in an attempt to balance Type
I and Type II errors. That is, avoiding a failure when one was not necessary
(Type I) while not missing any real failures (Type II). The checks are
performed as a client would view the server, because that is the most
representative way of examining the system.
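To put a rough number on that balance, here is a small Python sketch. It assumes, purely for illustration, that each attempt fails independently with the same probability and that a failure is declared only after six consecutive failed attempts (one query plus five retries, the figure given elsewhere in this thread). The function name is hypothetical, and independence is unlikely to hold on a genuinely overloaded server, where failures tend to cluster.

```python
def false_failure_probability(p_single_failure: float, attempts: int = 6) -> float:
    """Probability that every attempt fails, assuming independent attempts.

    p_single_failure: chance that one check fails on a healthy but busy server.
    attempts: consecutive failures needed before the resource is marked failed
              (assumed here to be 1 query + 5 retries).
    """
    if not 0.0 <= p_single_failure <= 1.0:
        raise ValueError("p_single_failure must be between 0 and 1")
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    return p_single_failure ** attempts


if __name__ == "__main__":
    # Compare a single-attempt check with the multi-attempt sequence.
    for p in (0.1, 0.5, 0.9):
        print(p, false_failure_probability(p, attempts=1), false_failure_probability(p))
```

Requiring several failures in a row sharply reduces unnecessary failovers (Type I) when individual checks fail only occasionally, but gives little protection once nearly every check is failing, which is the case where a failover is arguably warranted.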
If your cluster is failing over due to non-responsiveness to the Looks-Alive
and Is-Alive checks, then by definition it is failing correctly. You can
adjust the timing using the cluster tool, but I highly recommend not doing
so. You will likely cause your system to not fail when you actually need it
to. You would be better off finding out why the server cannot respond to
such a lightweight test as the cluster heartbeat tests.
If you insist on shooting yourself in the foot, here is exactly how you do
it: Open the cluster administrator tool. Open the Resources folder.
Right-click on the SQL Server resource. Select the 'Advanced' tab. You can
override the "Looks Alive" and "Is Alive" polling intervals here.
--
Geoff N. Hiten
Senior Database Administrator
Microsoft SQL Server MVP
"Tim" <tim@xxxxxxxxxx> wrote in message
news:lFvrg.199$Js2.156@xxxxxxxxxxxxxxxxxxx
This is what is happening:
The cluster service is running the "Is Alive" check every 1 minute on SQL Server.
I validated this by profiling the SQL Server and seeing the "select
@@servername" command being executed by the cluster service every minute.
There are times when the server is stressed, so I believe connections
are getting refused/delayed, and some of these connections might be the "Is
Alive" check.
Thus, the cluster service thinks there is something wrong with SQL and
either restarts SQL or fails it over.
Is there a threshold setting that can be set, like: after 10 failed "Is Alive"
checks within 1 hour, then fail over or restart? Or what other
options do I have? We are in the process of trying to performance tune
the server, but this might take weeks to complete. In the meantime this
affects our production cluster.