Re: Physical Disk goes offline when cluster node reboots
- From: "Darrek" <Darrek.Kay1@xxxxxxxx>
- Date: 15 Dec 2006 08:58:00 -0800
Edwin vMierlo wrote:
Darrek,
Just to confirm that we have the symptom right
- All groups are online on Node 1 (therefore all disks are online on Node 1)
- you reboot Node 2
- All disks on Node 1 go offline on Node 1 during reboot/Post of Node 2
Please confirm that this is what you are experiencing
and two questions:
Q: The disks that go offline on Node 2: do they fail, or do they go
offline? (Please specify, as there is a difference.)
Q: Do you see any "reservation lost" messages/events in the system event log?
rgds,
Edwin.
Yes. All groups are online and running fine on Node 1. During Node 2's
POST, Node 1 reports errors like this in the event log
(one for each LUN on the SAN):
Event Type: Error
Event Source: Disk
Event Category: None
Event ID: 15
Description:
The device, \Device\Harddisk1, is not ready for access yet.
And then...one of these...
Event Type: Error
Event Source: ClusSvc
Event Category: Physical Disk Resource
Event ID: 1038
Description:
Reservation of cluster disk 'Disk T - QASQLBTmp' has been lost. Please
check your system and disk configuration.
And then...several of these...
Event Type: Warning
Event Source: Ntfs
Event Category: None
Event ID: 50
Description:
{Delayed Write Failed} Windows was unable to save all the data for the
file . The data has been lost. This error may be caused by a failure of
your computer hardware or network connection. Please try to save this
file elsewhere.
More Event 15s and 1038s for other LUNs.
A couple of these mixed in...
Event Type: Information
Event Source: Application Popup
Event Category: None
Event ID: 26
Description:
Application popup: Windows - Delayed Write Failed : Windows was unable
to save all the data for the file Q:\$Mft. The data has been lost. This
error may be caused by a failure of your computer hardware or network
connection. Please try to save this file elsewhere.
One of these:
Event Type: Warning
Event Source: Ftdisk
Event Category: Disk
Event ID: 57
Description:
The system failed to flush data to the transaction log. Corruption may
occur.
At this point Cluster Admin begins sending service stop commands to
SQL.
And I get these:
Event Type: Error
Event Source: ClusSvc
Event Category: Physical Disk Resource
Event ID: 1036
Description:
Cluster disk resource '' did not respond to a SCSI maintenance command.
Followed by several more 57s.
I even managed one of these:
Event Type: Error
Event Source: ClusSvc
Event Category: Physical Disk Resource
Event ID: 1034
Description:
The disk associated with cluster disk resource 'Disk Q:' could not be
found. The expected signature of the disk was BED1F8F9. If the disk was
removed from the server cluster, the resource should be deleted. If the
disk was replaced, the resource must be deleted and created again in
order to bring the disk online. If the disk has not been removed or
replaced, it may be inaccessible at this time because it is reserved by
another server cluster node.
Followed by one of these:
Event Type: Error
Event Source: ClusSvc
Event Category: Startup/Shutdown
Event ID: 1009
Description:
Cluster service could not join an existing server cluster and could not
form a new server cluster. Cluster service has terminated.
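For anyone trying to line up a similar run of events, here is a rough sketch in Python that tallies entries by source and event ID from text pasted in the same "Field: value" layout as the entries above. The function name summarize_events and the shortened SAMPLE text are only illustrative; this is not a cluster or Windows tool, just a text counter.

```python
import re
from collections import Counter

# Shortened sample in the same layout as the event entries in this post.
SAMPLE = """\
Event Type: Error
Event Source: Disk
Event Category: None
Event ID: 15
Description:
The device, \\Device\\Harddisk1, is not ready for access yet.
Event Type: Error
Event Source: ClusSvc
Event Category: Physical Disk Resource
Event ID: 1038
Description:
Reservation of cluster disk 'Disk T - QASQLBTmp' has been lost.
Event Type: Warning
Event Source: Ntfs
Event Category: None
Event ID: 50
Description:
{Delayed Write Failed} Windows was unable to save all the data.
"""


def summarize_events(text):
    """Count (source, event ID) pairs in pasted event-log text."""
    counts = Counter()
    source = None
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"Event Source:\s*(.+)$", line)
        if m:
            # Remember the source until its matching "Event ID:" line.
            source = m.group(1)
            continue
        m = re.match(r"Event ID:\s*(\d+)$", line)
        if m and source is not None:
            counts[(source, int(m.group(1)))] += 1
            source = None
    return counts


if __name__ == "__main__":
    for (source, event_id), n in sorted(summarize_events(SAMPLE).items()):
        print(f"{source} {event_id}: {n}")
```

Running it against a full paste of the log would show at a glance how many Disk 15, ClusSvc 1038, and Ftdisk 57 entries there are, which makes the ordering questions easier to answer.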
The drivers I'm using are Emulex Storport FC2243
5-1.11X1 11/07/2005 WS2K3 32 bit (elxadjct.sys & elxstor.sys)
5.1.3.2 (elxstod.dll)
The MSA 1000 is on firmware 4.48.
Thanks for your help!
-DK