Re: Disk failure and event id 1146
- From: "Edwin vMierlo" <EdwinvMierlo@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Mon, 8 Jan 2007 10:17:53 -0000
I haven't see this sequence of events before, but for troubleshooting
purposes, you might want to run your two disk resources in a seperate
resource monitor (you check this box in the properties of the resource in
cluadmin), and then see if this changes behaviour, in regards of what
resource monitor is failing.
Also, the log sniplet below details the following:
00000918.000016f0::2007/01/05-17:09:31.750 INFO [RM] RmpSetResourceStatus,
Posting state 3 notification for resource <Disk U:>
state 3 is not a failed state, it is an offline state, which follows after
the flushing and dismounting the disk, so this looks quite normal, and
doesn't look like a failure of the disk resource.
What is suspect is that you get the error about flushing buffers, you might
want to disable any hardware monitoring software which might have a handle
open to disk, and therefore preventing the exclusive volume lock for
flushing. Monitoring software both from your server manufacturer or
third-party could (not saying it is....) cause such behaviour.
HTH,
Edwin.
"Franco" <Franco@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in message
news:3382D545-C8BF-4617-BEA2-091BF7B4A493@xxxxxxxxxxxxxxxx
Hello all, this is my environment:n-1.
2 nodes MSCS with Windows Server 2003 SP1.
Node 1 host SQL Server, Cluster group, File system and Lotus Domino
Node 2 host Oracle 9i in a failsafe confoguration.
IBM SAN DS4400 with firmware code and device drivers for all devices at
scenario
This is not always happening but I would like to share this strange
with you:I
When I decide to "take one group offline", in this case the Oracle group,
end up in this situation:to
the 2 disks of the Oracle group (T:\ and U:\) gets failed.
In the system event log I receive the following errors:
"The cluster resource monitor died unexpectedly, an attempt will be made
restart it."queuing
This error is followed by a series of errors like this one:
"The system failed to flush data to the transaction log. Corruption may
occur."
In the cluster log I have:
"00000918.000016f0::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
Offline, ResourceEntry @ 000C4A90 Valid 1
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [FM] FmpRmOfflineResource:
RmOffline() for 85f5dcef-4d92-4734-b330-01c0c9ba46ac returned error 997
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
update type 0 context 8queuing
00000918.000013c8::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
[PnP] Stop watching PnP events for disk a184bf5a
00000918.000013c8::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
[PnP] Stop watching disk a184bf5a - processed
00000918.000016f0::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
Offline, Processing disk number 7.
00000918.000016f0::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
Offline, Using handle array.
00000918.000016f0::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
Offline, Opening device \Device\Harddisk7\Partition1.
00000918.000016f0::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
Offline, FlushFileBuffers for \Device\Harddisk7\Partition1.
00000918.000016f0::2007/01/05-17:09:30.516 INFO Physical Disk <Disk U:>:
Offline, Locking volume for \Device\Harddisk7\Partition1.
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
Dispatching seq 160634 type 0 context 8 to node 2
00000918.000013c8::2007/01/05-17:09:30.516 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK for U (Partition1) - Received
00000918.000013c8::2007/01/05-17:09:30.516 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK for U (Partition1) - Processed
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
completed update seq 160634 type 0 context 8
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [FM]
FmpPropagateResourceState: resource 85f5dcef-4d92-4734-b330-01c0c9ba46ac
pending event.
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [FM]
FmpSetResourcePersistentState: Setting persistent state for resource
85f5dcef-4d92-4734-b330-01c0c9ba46ac...
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
update type 1 context 4098queuing
00000918.000013c8::2007/01/05-17:09:30.516 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK_FAILED for U (Partition1) - Received
00000918.000013c8::2007/01/05-17:09:30.516 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK_FAILED for U (Partition1) - Processed
00000918.000016f0::2007/01/05-17:09:30.516 WARN Physical Disk <Disk U:>:
Offline, Locking volume failed, error 5.
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
Dispatching seq 160635 type 1 context 4098 to node 2
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [DM] DmWriteToQuorumLog
Entry Seq#=160635 Type=4098 Size=152
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [DM] DmUpdateSetValue
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [DM] Setting value of
PersistentState for key Resources\85f5dcef-4d92-4734-b330-01c0c9ba46ac to
0x00000000
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [DM] DmWriteToQuorumLog
Entry Seq#=160635 Type=4098 Size=152
000006c4.00000a34::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
completed update seq 160635 type 1 context 4098
000006c4.0000084c::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
update type 0 context 11<85f5dcef-4d92-4734-b330-01c0c9ba46ac>
000006c4.0000084c::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
Dispatching seq 160636 type 0 context 11 to node 2
000006c4.0000084c::2007/01/05-17:09:30.516 INFO [GUM] GumSendUpdate:
completed update seq 160636 type 0 context 11
000006c4.0000084c::2007/01/05-17:09:30.516 INFO [FM]
FmpRmChangeResourceParams for resource
called...NotifyCallBackRoutine:
00000918.000013c8::2007/01/05-17:09:31.125 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK for U (Partition1) - Received
00000918.000013c8::2007/01/05-17:09:31.125 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK for U (Partition1) - Processed
00000918.000013c8::2007/01/05-17:09:31.125 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK_FAILED for U (Partition1) - Received
00000918.000013c8::2007/01/05-17:09:31.125 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_LOCK_FAILED for U (Partition1) - Processed
00000918.000016f0::2007/01/05-17:09:31.125 WARN Physical Disk <Disk U:>:
Offline, Locking volume failed, error 5.
00000918.000016f0::2007/01/05-17:09:31.719 INFO Physical Disk <Disk U:>:
Offline, Dismounting volume \Device\Harddisk7\Partition1.
00000918.000013c8::2007/01/05-17:09:31.719 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_DISMOUNT for U (Partition1) - Received
00000918.000013c8::2007/01/05-17:09:31.719 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_DISMOUNT for U (Partition1) - Processed
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
Offline, Dismount complete, volume \Device\Harddisk7\Partition1.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
DiskCleanup started.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[DiskArb] StopPersistentReservations is called.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[DiskArb] Stopping reservation thread.
00000918.00000cc4::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[DiskArb] CompletionRoutine, status 0.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[ArbCleanup] Verifying sector size.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[ArbCleanup] Reading arbitration block.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[DiskArb] Successful read (sector 12) [SRV0002B:120604]
(0,5f5f2dea:01c72da1).
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[ArbCleanup] Writing arbitration block.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[DiskArb] Successful write (sector 12) [:0] (0,00000000:00000000).
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[ArbCleanup] Returning status 0.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
[DiskArb] StopPersistentReservations is complete.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
DisksDismountDrives: letter mask is 00100000.
00000918.000016f0::2007/01/05-17:09:31.735 INFO Physical Disk <Disk U:>:
DiskCleanup returning final error 0
00000918.000013c8::2007/01/05-17:09:31.735 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_UNLOCK for U (Partition1) - Received
00000918.000013c8::2007/01/05-17:09:31.735 INFO Physical Disk: [PnP] Event
GUID_IO_VOLUME_UNLOCK for U (Partition1) - Processed
00000918.000016f0::2007/01/05-17:09:31.750 INFO Physical Disk <Disk U:>:
Offline, Returning final error 0.
00000918.000016f0::2007/01/05-17:09:31.750 INFO [RM] RmpSetResourceStatus,
Posting state 3 notification for resource <Disk U:>
000006c4.00001798::2007/01/05-17:09:31.750 INFO [FM]
enqueuing eventRmNotifyChanges
000006c4.00001798::2007/01/05-17:09:31.750 INFO [FM] Calling
in monitor 0918.(http://support.microsoft.com/kb/314753),
000006c4.00000848::2007/01/05-17:09:31.750 INFO [CP] CppResourceNotify for
resource Disk U:"
I do not have more than 100 disks,
so this is not my situation.
I do not find in the cluster log the following note:
"WARN Network Name Enterprise Network Name: Failed to obtain
IDirectoryObject pointer to computer account ServerName, status 80005000"
so this is not also my case (http://support.microsoft.com/kb/886652)
Any clue?
Thank you very much and kind regards.
Franco
.
- References:
- Disk failure and event id 1146
- From: Franco
- Disk failure and event id 1146
- Prev by Date: Disk failure and event id 1146
- Next by Date: Re: Disk failure and event id 1146
- Previous by thread: Disk failure and event id 1146
- Next by thread: Re: Disk failure and event id 1146
- Index(es):
Relevant Pages
|