Building hit by lightning, server repaired, suspect data corruptio
From: matt (matt_at_discussions.microsoft.com)
Date: 08/11/04
- Next message: MIGUEL: "Please help me to solve this stop message"
- Previous message: Mark Elsen: "RoboCopy for one file"
- Next in thread: Yor Suiris: "Re: Building hit by lightning, server repaired, suspect data corruptio"
- Reply: Yor Suiris: "Re: Building hit by lightning, server repaired, suspect data corruptio"
Date: Tue, 10 Aug 2004 17:17:01 -0700
Ok folks, please do not laugh. I am serious about everything in this post. I
have been on a new job for two days and have been handed one of the biggest
headaches I have ever faced.
The building took a lightning strike. Two UPSs gave their lives in the
service of the company. The server took a BIG hit, and one of the drives
involved here died or was damaged. The server was resurrected and appears stable.
Here is the setup:
Two RAID 5 arrays, mirrored
Dynamic disks
Windows 2000 Server
Active Directory
Here is the symptom:
Several users (2-3 of 100) are having various data problems.
Example one: A user has a directory on a volume that normally contains over
500 Word .doc files, but the user can only *see* 210 of them. Restoring the
missing files via Veritas fails; redirecting the restore to another directory
succeeds. When I then copy the files from the redirect directory to the user's
directory, Windows warns "file already exists, do you want to replace?"
Whether I answer yes or no makes no difference: the file is still not visible
after the copy completes.
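
To pin down exactly which files are invisible, the redirected restore can be
compared against what the user's directory actually lists. The following is
only a minimal sketch in Python, assuming Python is available on a machine
that can reach both folders; the function name missing_names and the paths in
the usage note are hypothetical, not anything from the server in question.

```python
import os

def missing_names(restored_dir, user_dir):
    """Return names found in restored_dir that user_dir does not list.

    Names are compared case-insensitively, as Windows does.
    """
    restored = {name.lower(): name for name in os.listdir(restored_dir)}
    visible = {name.lower() for name in os.listdir(user_dir)}
    return sorted(orig for low, orig in restored.items() if low not in visible)
```

Calling missing_names(r"D:\redirect\userdocs", r"E:\users\userdocs") with
the real (here invented) paths would return the file names that exist in the
restored copy but do not show up in the user's directory listing.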
Example two: A user has an .xls file on the same volume, in a different folder.
When the user opens the file, Excel says it is damaged and must be repaired.
After the repair, most of the data in the workbook is missing. If I restore the
file to another folder, it is fine and all the data is there. If I then copy the
restored file to the correct folder and open it in Excel, it is corrupt again,
gets repaired, and ends up exactly like the first version (the same data is
present and the same data is missing as on the first go-round).
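
One way to confirm that the copy in the correct folder really differs from the
good restored file, independently of Excel, is to compare checksums of the two
files. This is a rough sketch in Python, again assuming Python can be run
against both locations; sha256_of and same_content are hypothetical helper
names introduced only for illustration.

```python
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of the file at path, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def same_content(path_a, path_b):
    """True if both files have identical bytes, judged by SHA-256."""
    return sha256_of(path_a) == sha256_of(path_b)
```

If same_content reports a difference between the restored file and the copy
immediately after copying, the bytes read back from the original folder are
not the bytes that were written, which would point at the volume rather than
at Excel.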
Logs, diagnostics, and Event Viewer give no indication of failure or
bad files.
Here is my theory (I know that this is not supposed to be able to happen,
but it is the only explanation I can think of):
In both cases I believe that data on one of the two mirrored volumes is
corrupted so badly that it cannot be manipulated. The file may exist healthy
on one of the mirrors and be corrupt on the other. The system does not allow
overwrites because the file exists on one mirror, while on the other it is so
corrupt that it is unreadable. This would cause the problem I am seeing. I
have seen this before on a Novell 3.x mirror and an NT4 mirror. The only fix
was to delete the folder that contained the files and restore it from backup.
This deleted the directory, subdirectories, and files and allowed them to be
rewritten to both sides of the mirror.
Unless anyone else has ideas on how to fix this, deleting and restoring the
affected folders is the best I can come up with. Breaking the mirrors and
re-mirroring has been done twice since the lightning strike with no fix.
Corrupt data is the only answer I can find.
Thanks for any assistance or ideas.