Re: Physical disk hangs at "offline pending"



Answer me this: when you look at the properties for this disk resource and inspect the 'Dependencies' tab, is there anything listed there?

Chuck Timon, Jr.
Microsoft Corporation
Longhorn Readiness Team
This posting is provided "AS IS" with no warranties, and confers no rights.

"Henry" <Henry@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message news:056FAD68-9EF5-4877-93B1-841B20EED633@xxxxxxxxxxxxxxxx
Hi,

1) No. The file system is NTFS.
2) Event ID 1145 - Cluster resource OracleDB timed out. (Physical disk name)
Event ID 1205 - The cluster service failed to bring the resource group
"OracleDB" completely online or offline.
3) 00000f7c.00000810::2007/03/01-15:51:35.805 INFO [FM] FmpRmOfflineResource:
RmOffline() for 5fa5cc41-66f4-4b14-9d9c-32c7f67347a5 returned error 997.
---
0000c20.00000f18::2007/03/01-17:56:22.860 INFO [FM] FmpRmOfflineResource:
RmOffline() for 667e7691-4049-44fe-9380-c620cd79971d returned error 997

The following entry is repeated:
00000c20.00000a60::2007/03/01-17:58:25.379 INFO [FM] FmpCompleteMoveGroup:
Exit, status = 997
00000c20.00000a60::2007/03/01-17:58:25.875 INFO [FM] FmpCompleteMoveGroup:
Completing the move for group BANCTEC to node 1 (1)
00000c20.00000a60::2007/03/01-17:58:25.875 INFO [FM] FmpOfflineResource:
Offline resource <OracleDB> returned pending

until finally:
00000c20.00000a60::2007/03/01-17:59:40.276 INFO [FM] FmpCompleteMoveGroup:
Exit, status = 997
000002f4.00000388::2007/03/01-17:59:40.757 WARN [RM] RmpTimerThread:
Resource OracleDB pending timed out, CP 3 - setting state to failed.

This last message may be the result of us getting fed up and shutting down
the server that will not release the physical drive.
4) The other resources are offline. On the odd occasion another physical
disk displays the offline pending symptoms as well.
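
For anyone reading along: status 997 is the Win32 code ERROR_IO_PENDING, so these entries only say the offline request is still in progress. The repeated entries in item 3 can be counted mechanically rather than by eye. Below is a rough Python sketch, not a supported tool. It assumes each cluster.log entry sits on a single line (the excerpts above are wrapped by the news client) and follows the `pid.tid::date-time LEVEL [COMPONENT] message` layout seen in this post. The function names `parse_entries` and `summarize_pending` are made up for illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Layout assumed from the excerpts in this thread:
#   <pid>.<tid>::<YYYY/MM/DD-HH:MM:SS.mmm> <LEVEL> [<COMPONENT>] <message>
ENTRY_RE = re.compile(
    r"^(?P<pid>[0-9a-f]+)\.(?P<tid>[0-9a-f]+)::"
    r"(?P<ts>\d{4}/\d{2}/\d{2}-\d{2}:\d{2}:\d{2}\.\d{3})\s+"
    r"(?P<level>[A-Z]+)\s+\[(?P<component>[^\]]+)\]\s+(?P<message>.*)$"
)
# Message patterns taken from the two kinds of entries quoted above.
PENDING_RE = re.compile(r"Offline resource <(?P<name>[^>]+)>\s*returned pending")
TIMEOUT_RE = re.compile(r"Resource (?P<name>.+?) pending timed out")


def parse_entries(text):
    """Return one dict per log line that matches the assumed layout."""
    entries = []
    for line in text.splitlines():
        match = ENTRY_RE.match(line.strip())
        if not match:
            continue  # skip wrapped fragments and anything unrecognized
        entry = match.groupdict()
        entry["ts"] = datetime.strptime(entry["ts"], "%Y/%m/%d-%H:%M:%S.%f")
        entries.append(entry)
    return entries


def summarize_pending(entries):
    """Count 'returned pending' entries per resource and collect timeouts."""
    pending = Counter()
    timeouts = []
    for entry in entries:
        match = PENDING_RE.search(entry["message"])
        if match:
            pending[match.group("name")] += 1
            continue
        match = TIMEOUT_RE.search(entry["message"])
        if match:
            timeouts.append((entry["ts"], match.group("name")))
    return pending, timeouts


if __name__ == "__main__":
    # Two entries from this post, joined back onto single lines.
    sample = (
        "00000c20.00000a60::2007/03/01-17:58:25.875 INFO [FM] "
        "FmpOfflineResource: Offline resource <OracleDB> returned pending\n"
        "000002f4.00000388::2007/03/01-17:59:40.757 WARN [RM] "
        "RmpTimerThread: Resource OracleDB pending timed out, CP 3 - "
        "setting state to failed.\n"
    )
    pending, timeouts = summarize_pending(parse_entries(sample))
    for name, count in pending.items():
        print(f"{name}: {count} 'returned pending' entries")
    for ts, name in timeouts:
        print(f"{name}: pending timed out at {ts}")
```

Running it over the full log from the node that will not release the disk should show which resources keep returning pending and when each one finally times out, which makes it easier to line the timestamps up against the system event log.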

Thanks in Advance

--
Henry


"Edwin vMierlo" wrote:

Henry,

just a few questions:
- is this Oracle FileSystem (ocfs.sys) ?
- what errors do you see in the system event log (please post) ?
- what errors do you see in the cluster.log (please post) ?
- once the disk is in off-line pending state... what other cluster resources
are off-line pending ?

thnx,
edwin.




"Henry" <Henry@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:CE4E6FBE-DAE2-425C-B2F5-526812E37245@xxxxxxxxxxxxxxxx
> Hi,
>
> We have installed Oracle failsafe on this cluster and the drive in question
> is part of the "Cluster Group" set of resources. The oracle database resides
> on this SAN drive. I have stopped all oracle services on the server giving me
> the problems and the disk still does not go offline to enable a failover
> unless the server is shut down.
> I suppose there must be something else preventing the failover and am trying
> to determine what could be preventing this disk from being released. The
> server in question does have exclusive rights to this physical disk when it
> is the active member.
> If anyone has any idea as to how I might determine if some process is
> refusing to release its resources please make a suggestion.
> Is there a way to increase the logging level of the cluster and should that
> give me a better indication of what may be the problem? (the logs are fairly
> hard to decipher even at the default logging level).
>
> Thanks in Advance,
> --
> Henry
>
>
> "Chuck Timon [MSFT]" wrote:
>
> > Sounds like something has a handle to the drive that is preventing cluster
> > from completing the Offline process. What kind of group is this disk
> > resource in?
> >
> > Chuck Timon, Jr.
> > Microsoft Corporation
> > Longhorn Readiness Team
> > This posting is provided "AS IS" with no warranties, and confers no rights.
> >
> > "Henry" <Henry@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
> > news:4429C77B-C125-4677-8F00-C2D96D014716@xxxxxxxxxxxxxxxx
> > > Hi,
> > >
> > > I have a 2 node cluster that works correctly when the active server goes
> > > down.
> > > All resources are taken over by the passive member.
> > >
> > > When I try to move resources from node 1 to node 2 everything works fine
> > > as well.
> > > The problem is that when I try to move the resources back to the
> > > original node all resources move except for one physical disk. This
> > > physical disk status remains as "offline pending". The cluster log
> > > contains many entries similar to what follows:
> > > "FmpofflineResource: offline resource <drivex>returned pending"
> > > until finally
> > > "RmpTimerThread: Resource drivex pending timed out, CP 3 - setting state to
> > > failed."
> > >
> > > The only way for us to get the offline resource available for the other
> > > cluster member is to reboot the server that failed to put the physical
> > > drive offline.
> > >
> > > Any ideas would be appreciated.
> > > --
> > > Thanks in Advance,
> > >
> > > Henry
> >
> >



