Re: Time of failover of Microsoft SQL 2000
From: Patrice (krakowpat_at_yahoo.com)
Date: 02/25/05
- Next message: Hans de Bruin: "Having fun with moint points"
- Previous message: Donna Lambert [MS]: "RE: Is there necessary remove named instance alias after..."
- In reply to: Joe Yong: "Re: Time of failover of Microsoft SQL 2000"
- Messages sorted by: [ date ] [ thread ]
Date: 25 Feb 2005 07:08:57 -0800
Hello Joe,
After Mike's 10-32 seconds, you now report < 10 seconds. We must be
doing something very wrong to end up with ~2 min on an unloaded SQL
cluster.
But I also think that, at this stage, we should make sure that we are
measuring the failover duration the same way ;-) Last weekend we had to
install some MS patches, and we performed two complete failovers. Here
are the details of the 2nd failover (the faster one), which took
2 min 9 sec:
- NODE01 17:25:30 The Cluster Service is attempting to offline the
Resource Group "Cluster Group".
- NODE01 17:25:30 The Cluster Service brought the Resource Group
"Cluster Group" offline.
- NODE02 17:25:55 The Cluster Service is attempting to bring online
the Resource Group "Cluster Group".
- NODE02 17:25:59 The Cluster Service brought the Resource Group
"Cluster Group" online.
- NODE01 17:26:11 The Cluster Service is attempting to offline the
Resource Group "MSDTC".
- NODE01 17:26:12 The Cluster Service brought the Resource Group
"MSDTC" offline.
- NODE01 17:26:32 The Cluster Service is attempting to offline the
Resource Group "SQL01".
- NODE02 17:26:34 The Cluster Service is attempting to bring online
the Resource Group "MSDTC".
- NODE01 17:26:39 The Cluster Service brought the Resource Group
"SQL01" offline.
- NODE02 17:26:47 The Cluster Service brought the Resource Group
"MSDTC" online.
- NODE01 17:27:00 The Cluster Service is attempting to offline the
Resource Group "SQL02".
- NODE02 17:27:00 The Cluster Service is attempting to bring online
the Resource Group "SQL01".
- NODE01 17:27:08 The Cluster Service brought the Resource Group
"SQL02" offline.
- NODE02 17:27:11 The Cluster Service brought the Resource Group
"SQL01" online.
- NODE02 17:27:28 The Cluster Service is attempting to bring online
the Resource Group "SQL02".
- NODE02 17:27:39 The Cluster Service brought the Resource Group
"SQL02" online.
[00:02:09]
I must admit that the administrator moved the four groups one by one,
which is probably not the most efficient way; any advice on this topic
is welcome! But even if we take the four moves independently, we can
see that each one still takes > 10 sec:
Move of the "Cluster Group" group:
- NODE01 17:25:30 The Cluster Service is attempting to offline the
Resource Group "Cluster Group".
- NODE01 17:25:30 The Cluster Service brought the Resource Group
"Cluster Group" offline.
- NODE02 17:25:55 The Cluster Service is attempting to bring online
the Resource Group "Cluster Group".
- NODE02 17:25:59 The Cluster Service brought the Resource Group
"Cluster Group" online.
[00:00:29]
Move of the "MSDTC" group:
- NODE01 17:26:11 The Cluster Service is attempting to offline the
Resource Group "MSDTC".
- NODE01 17:26:12 The Cluster Service brought the Resource Group
"MSDTC" offline.
- NODE02 17:26:34 The Cluster Service is attempting to bring online
the Resource Group "MSDTC".
- NODE02 17:26:47 The Cluster Service brought the Resource Group
"MSDTC" online.
[00:00:36]
Move of the "SQL01" group:
- NODE01 17:26:32 The Cluster Service is attempting to offline the
Resource Group "SQL01".
- NODE01 17:26:39 The Cluster Service brought the Resource Group
"SQL01" offline.
- NODE02 17:27:00 The Cluster Service is attempting to bring online
the Resource Group "SQL01".
- NODE02 17:27:11 The Cluster Service brought the Resource Group
"SQL01" online.
[00:00:39]
Move of the "SQL02" group:
- NODE01 17:27:00 The Cluster Service is attempting to offline the
Resource Group "SQL02".
- NODE01 17:27:08 The Cluster Service brought the Resource Group
"SQL02" offline.
- NODE02 17:27:28 The Cluster Service is attempting to bring online
the Resource Group "SQL02".
- NODE02 17:27:39 The Cluster Service brought the Resource Group
"SQL02" online.
[00:00:39]
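The bracketed durations above can be cross-checked with a few lines of Python. This is only a sketch: the timestamps are transcribed by hand from the Event Log entries in this message, and the names EVENTS, seconds_between, phase_breakdown and overall_seconds are made up for the illustration. It also splits each move into three phases (offline, gap between "brought offline" and "attempting to bring online", online), assuming all events fall on the same day.

```python
from datetime import datetime

# Timestamps transcribed from the Event Log entries in this message.
# Per group: (offline attempt, brought offline, online attempt, brought online).
EVENTS = {
    "Cluster Group": ("17:25:30", "17:25:30", "17:25:55", "17:25:59"),
    "MSDTC":         ("17:26:11", "17:26:12", "17:26:34", "17:26:47"),
    "SQL01":         ("17:26:32", "17:26:39", "17:27:00", "17:27:11"),
    "SQL02":         ("17:27:00", "17:27:08", "17:27:28", "17:27:39"),
}

FMT = "%H:%M:%S"


def seconds_between(start, end):
    """Whole seconds from start to end (both HH:MM:SS, same day assumed)."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return int(delta.total_seconds())


def phase_breakdown(events):
    """Return {group: (offline_s, gap_s, online_s, total_s)}."""
    result = {}
    for group, (off_start, off_end, on_start, on_end) in events.items():
        result[group] = (
            seconds_between(off_start, off_end),  # time to go offline on NODE01
            seconds_between(off_end, on_start),   # gap before NODE02 starts
            seconds_between(on_start, on_end),    # time to come online on NODE02
            seconds_between(off_start, on_end),   # whole move
        )
    return result


def overall_seconds(events):
    """Seconds from the first offline attempt to the last 'brought online'."""
    # Zero-padded HH:MM:SS strings sort chronologically within one day.
    first = min(times[0] for times in events.values())
    last = max(times[3] for times in events.values())
    return seconds_between(first, last)


if __name__ == "__main__":
    for group, (off, gap, on, total) in phase_breakdown(EVENTS).items():
        print(f"{group:14s} offline={off:2d}s gap={gap:2d}s "
              f"online={on:2d}s total={total:2d}s")
    print("overall:", overall_seconds(EVENTS), "s")
```

With these timestamps the totals match the figures above (29, 36, 39 and 39 sec, 129 sec overall), and the gap between "brought offline" and "attempting to bring online" is 20 to 25 sec for each group.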
By the way, you can see here why I was talking about 6 moves: the 4
moves above should be followed by 2 more moves in order to distribute
the groups equally between the two servers. We did not perform the last
2 moves because of our lack of confidence in the NODE01 server, which
has crashed 2 times since the beginning of the year :-(
In summary, I am looking for:
(1) Advice on the most efficient way to move all the groups of a
cluster;
(2) Similar Event Log analyses.
Finally, I need to emphasize that I have no clue about the SQL "load"
during the failover. I guess it would be very interesting to have a
graph of failover duration versus load :-))
Many thanks in advance and best regards,
Patrice