RE: Help Again! Cluster Resources Moving Takes Ages & unreliable
- From: MarkFox <MarkFox@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Wed, 17 May 2006 04:55:02 -0700
Simon, have you checked whether the delay in coming online is due to a
chkdsk running? You can determine this by looking in the event logs at the
time of the delay: on the server where the resource is online pending, you
will see a chkdsk process running. You can also look in C:\Windows\Cluster\
for a file called ChkDsk_Disk10_Sig....Log. This could explain the long
delay in a resource coming online; chkdsk can take quite a while to run,
depending on the amount of space in use etc. The cluster checks for
corruption every time a resource is moved.
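If you would rather not browse the folder by hand, here is a rough sketch of
a script that lists any chkdsk logs in the cluster directory, newest first,
so you can compare their timestamps with the times of the slow moves. It
assumes Python is available on the node and that the logs all start with
"ChkDsk_" and end in ".log"; the function name find_chkdsk_logs is just
something made up for this example, not part of any cluster tool.

```python
from datetime import datetime
from pathlib import Path


def find_chkdsk_logs(cluster_dir):
    """Return (path, modified_time) pairs for ChkDsk_*.log files, newest first.

    Assumption: the logs follow the ChkDsk_<something>.log naming pattern.
    The match is case-insensitive because the extension may appear as .Log.
    """
    root = Path(cluster_dir)
    if not root.is_dir():
        # Directory is missing or unreadable as a folder: nothing to report.
        return []

    hits = []
    for entry in root.iterdir():
        name = entry.name.lower()
        if entry.is_file() and name.startswith("chkdsk_") and name.endswith(".log"):
            modified = datetime.fromtimestamp(entry.stat().st_mtime)
            hits.append((entry, modified))

    # Newest first, so the most recent chkdsk run is at the top.
    hits.sort(key=lambda item: item[1], reverse=True)
    return hits


if __name__ == "__main__":
    # Path taken from the post above; adjust if Windows is installed elsewhere.
    for path, modified in find_chkdsk_logs(r"C:\Windows\Cluster"):
        print(f"{modified:%Y-%m-%d %H:%M:%S}  {path.name}")
```

If a log's modified time lines up with one of the 5-7 minute moves, that is a
good hint that chkdsk is what the resource was waiting on.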
Hope that helps.
--
Mark
"Simon" wrote:
Hi again everyone :-).
I am in need of assistance - happy to post on here, or directly if anyone is
willing to help. I am more concerned about our cluster misbehaving, and not
having any good resources in the United Kingdom (1 book I found, which I am
still waiting on!!), and the White Papers are not fully helping me
understand what is going on.
Spec is as follows:
2 x HP ProLiant DL380 G3 servers, dual processor 3 GHz and 2 GB RAM
1 x HP EVA SAN Hardware Solution
OS installed on C:
The Quorum is on a separate disk, but from my understanding, it's configured
as Shared. In other words, it can move to another node.
All hardware is fibre channel using HBAs; all drivers and firmware are up to
date. Secure Path has been installed and configured, and we also use Veritas
Enterprise Administrator as the disk administration application (this
replaces the Microsoft Disk Admin tool which is part of the OS).
1st Node hosts 9 VDisks of 250GB each. On this node, there are 65
Resources configured in Cluster Administrator.
2nd Node hosts 8 VDisks of 250GB each. On this node, there are 31
Resources configured in Cluster Administrator.
The main problem I have is the uptime, which currently stands at 14 days
(Max has only been 22 so far!!) and the Resources taking a LONG time to move,
5-7 mins in most cases.
I have monitored this and noticed that some resources stay in a pending
state for some time. When they move over, it takes a long time for the
resources to move onto another node.
Also, if the resources have not come back online CORRECTLY, it takes the
entire group down and moves them back. I think I may have resolved this by
disabling the option "Affect the Group", which was ticked. A lesson learned
here: a Shared Resource was removed, the cluster tool could not find it,
and it took the ENTIRE group down!
I am not too sure where to start - it's in production, so taking it down is
not easy. But I want to help the company with the limited skills I have. I'm
not sure if it's a permissions problem with resources that is causing the
issue, or hardware, but if anyone is able to share any additional info, this
would truly help.
I'm the first to admit, I am NOT a cluster expert - but I want to be, and I
want to know what is happening, so I can understand and correct it.
I also appreciate everyone is busy, but in times like this, I am willing to
do almost anything to help sort this out.
Thank you
Simon