Re: Failed cluster node confusion!
- From: "Edwin vMierlo [MVP]" <EdwinvMierlo@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 11 Dec 2007 13:27:29 -0000
Normally the second node would take ownership of all your groups and
resources.
To me this means that something is not working properly.
I would start by analyzing the following on the surviving node, around the
time the first node failed:
1) any clues / errors in the system event log
2) any clues / errors in the cluster.log (note: cluster.log timestamps are
in GMT, not "host" local time)
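
Because of the GMT offset it is easy to look at the wrong part of
cluster.log. As a rough sketch only (not an official tool), the Python
below converts the host-local failure time to GMT and keeps the log lines
within a few minutes of it. The line layout
"pid.tid::YYYY/MM/DD-HH:MM:SS.mmm text" is an assumption about how your
cluster.log looks, and the function name and parameters are made up for
illustration; adjust the pattern if your log differs.

```python
import re
from datetime import datetime, timedelta, timezone

# Assumed cluster.log line shape: "pid.tid::YYYY/MM/DD-HH:MM:SS.mmm text"
STAMP = re.compile(r"::(\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2})\.\d{3}")


def lines_near_failure(lines, local_failure, utc_offset_hours, minutes=5):
    """Return log lines stamped within +/- `minutes` of the failure time.

    local_failure    -- naive datetime in host-local time
    utc_offset_hours -- host offset from GMT (e.g. 1 for GMT+1)
    """
    local_tz = timezone(timedelta(hours=utc_offset_hours))
    # Convert the host-local failure time to GMT, which the log uses.
    failure_utc = local_failure.replace(tzinfo=local_tz).astimezone(timezone.utc)
    window = timedelta(minutes=minutes)

    hits = []
    for line in lines:
        match = STAMP.search(line)
        if not match:
            continue  # continuation line or unexpected format
        stamp = datetime.strptime(match.group(1), "%Y/%m/%d-%H:%M:%S")
        stamp = stamp.replace(tzinfo=timezone.utc)
        if abs(stamp - failure_utc) <= window:
            hits.append(line)
    return hits
```

For example, on a host running at GMT+2 with a failure at 15:27 local
time, call lines_near_failure(open(path), datetime(2007, 12, 11, 15, 27), 2)
where path points at your copy of cluster.log; it returns the entries
stamped around 13:27 in the log.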
Blue exclamation marks usually mean that the cluster service has terminated
(on node 2).
That in turn usually means it cannot arbitrate for the Quorum disk.
But please note the word "usually" ... only analysis will tell us for sure.
rgds,
edwin.
"SW" <siwilson@xxxxxxxxx> wrote in message
news:bd6d5576-5e6b-450f-b514-d56e5278faf2@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hello All!
Hope someone can help me with an issue I have with 3 shared disk
clusters I have...
I was doing some tests with a Windows Server 2003 SP1 two-node cluster
which is connected to an HP EVA SAN for storage. Everything is
redundant, including the HBA cards, NICs (they are teamed), etc. The
heartbeat is configured to go over a crossover cable between the two
nodes. If this fails then the heartbeat will go over the teamed NIC
connection.
The client asked me to do some testing to see how resilient the
solution was. All went really well when doing the following tests:
1) Shutting down one of the nodes
2) Unplugging one or both of the ethernet cables from the teamed NIC
When doing the above tests, everything failed over to the remaining
working node correctly.
The problem came when simulating a catastrophic failure by literally
unplugging one node's power supplies (both of them). When I did
this, the second node did NOT fail over the resources. Can someone
explain why this is the case? We discovered this by accident when one
of our nodes blue screened and the resources didn't fail over. Is this
a design limitation? When I opened Cluster Administrator, all the nodes
and resources had a blue exclamation point through them. How does one
get the resources to automatically (or even manually) fail over to the
working node when one node has completely died (blue screen, hardware
failure, etc.)? What is the correct procedure to follow when one of
the nodes in a cluster completely fails? Does the remaining working
node have to "seize" the resources?
I can't seem to find anything regarding this issue on the net.
Any help will be much appreciated! ;-)