Re: Cluster Freezes
- From: Rog <Rog@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Thu, 21 Feb 2008 23:42:00 -0800
Edwin,
Thanks for your reply.
I apologize if I wasn't clear earlier. I just inherited the cluster and
didn't have all the detail that I should have.
When testing the offline times, I notice that the file shares go offline in a
matter of seconds, which is expected, and the same goes for the IP and Network
Name resources. However, the Physical Disk resources seem to take at least
30-45 seconds to go offline.
The failover time for all the resources is really good, maybe 20 seconds tops
for the whole group.
But the real problem is coming back online. IP and Network Name are fast,
and then I have a period of about 30 minutes waiting for the Physical Disk to
come online. Either the file shares come online quickly after that, or the
cluster freezes before it ever finishes bringing the disk online.
At this point you have to shut down one node and reboot the other in order to
get the cluster back, and then bring back the second node.
The only configuration that seems out of place to me is that the physical
disks have a dependency on the Network Name. I have always made it a practice
to have no dependencies on the physical disk and just make the disk and
network name dependencies of the file share. My manager's idea was that by
putting the network name as a dependency on the disk, it would always be
available.
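To make the two dependency layouts concrete, here is a small illustrative
sketch (not part of the cluster tooling). The resource names and both
dependency maps are assumptions based on the description above, and it needs
Python 3.9+ for graphlib. It only shows what each resource has to wait for
before it can come online; it does not prove that the dependency causes the
freeze.

```python
from graphlib import TopologicalSorter
from typing import Dict, List, Set

# Hypothetical resource names. Each key maps to the resources it depends on.
# Layout as described: the disk depends on the Network Name.
CURRENT: Dict[str, Set[str]] = {
    "IP Address": set(),
    "Network Name": {"IP Address"},
    "Physical Disk": {"Network Name"},
    "File Share": {"Physical Disk", "Network Name"},
}

# Layout with no dependencies on the disk; only the share depends on things.
CONVENTIONAL: Dict[str, Set[str]] = {
    "IP Address": set(),
    "Network Name": {"IP Address"},
    "Physical Disk": set(),
    "File Share": {"Physical Disk", "Network Name"},
}


def online_order(dependencies: Dict[str, Set[str]]) -> List[str]:
    """Return one valid bring-online order (dependencies come first)."""
    return list(TopologicalSorter(dependencies).static_order())


def transitive_deps(resource: str, dependencies: Dict[str, Set[str]]) -> Set[str]:
    """Return everything that must be online before `resource` can start."""
    seen: Set[str] = set()
    stack = list(dependencies.get(resource, ()))
    while stack:
        dep = stack.pop()
        if dep not in seen:
            seen.add(dep)
            stack.extend(dependencies.get(dep, ()))
    return seen
```

Calling transitive_deps("Physical Disk", CURRENT) shows the disk waiting on
both the Network Name and the IP Address, while the same call with
CONVENTIONAL returns an empty set, so the disk can start right away.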
I haven't changed this yet, simply because there are so many file shares that
the dependency would have to be added to if I changed it. However, I found an
article that says you should never add a dependency on a physical disk. Do
you think this could be my problem?
I can't find any article that explains why it would be set up like this. Can
you explain?
Thanks again!
"Edwin vMierlo [MVP]" wrote:
I think you need to time it more carefully, so you know where the actual
time is taken.
There are 3 times involved:
1) the offline time of the group
2) the move of ownership of the group to the other node
3) the online time of the group
In regards to 1) the offline time of the group:
you need to determine the offline time of each of the resources, following
their dependency structure, e.g. how long does it take to take *just* the
File Share resources offline, and leave IP / Name and Disk online, just to see
if offlining the shares is taking the time or whether it is something else.
Once you know the offline timings, do the same for 3) the online timings.
Usually the 2) move of ownership is reasonably fast and does not take a long
time.
In other words, you need to be more detailed than "group move is slow" and
determine which resource or resources are slow and during which action
(off/on/move).
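As a rough sketch of that kind of per-resource timing (an illustration, not
something from the original post): the helper below times each named action
separately and reports the slowest one. The cluster_action helper and its
command line are assumptions; the assumed syntax is
cluster resource "<name>" /offline (or /online), so check
`cluster resource /?` on the node before relying on it.

```python
import subprocess
import time
from typing import Callable, Dict, Iterable, Tuple

Action = Callable[[], None]


def time_actions(
    actions: Iterable[Tuple[str, Action]],
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, float]:
    """Run each named action in order; return elapsed seconds per label."""
    timings: Dict[str, float] = {}
    for label, action in actions:
        start = clock()
        action()
        timings[label] = clock() - start
    return timings


def slowest(timings: Dict[str, float]) -> str:
    """Return the label of the action that took the longest."""
    return max(timings, key=timings.get)


def cluster_action(resource: str, switch: str) -> Action:
    """Hypothetical helper: build an action that shells out to cluster.exe.

    Assumed syntax: cluster resource "<name>" /offline or /online.
    Nothing is executed until the returned action is called.
    """
    def run() -> None:
        subprocess.run(["cluster", "resource", resource, switch], check=True)
    return run
```

For example, passing a list such as
[("share offline", cluster_action("Share1", "/offline")), ...] to
time_actions (with "Share1" standing in for a real resource name) gives one
timing per step, and slowest() then names the step to look at first.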
Rgds,
edwin.
"Rog" <Rog@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:BA9BA565-5E11-4C48-B6D5-FFCDE3584FD3@xxxxxxxxxxxxxxxx
All,
I have a W2k3 R2 cluster that has about 10 disk and 150 file share
resources. When I attempt to fail over the group, it takes about 15 minutes
minimum to fail over, if successful. But most recently it completely locks up
both servers, and you have to shut them both down and bring one up at a time
to get the cluster back online.
If I take about a third of the file share resources offline, then I can
fail over in about 10 minutes. I was first thinking that I might be dealing
with a timeout issue, but I didn't think it would cause the server to freeze.
Is there a limit to the number of disk and file share resources you should
have in a single cluster group?
Anyone have any suggestions?
Thanks!