Re: Servers dropping like flies
- From: "Scott Rymer" <tsrymer/at/hotmail/dot/com>
- Date: Wed, 28 Oct 2009 09:01:51 -0400
Allan... you hit the nail on the head. The PERC cache memory was failing.
We finally came to this conclusion after 6 hours on the phone with Dell. I reseated the memory and it's been fine for a week now.
I've got a spare stick ordered just in case.
Thanks for the help.
-Scott
"Al Williams" <donotreplydirect@xxxxxxxxxxxxxxxx> wrote in message news:un2JmkJMKHA.4064@xxxxxxxxxxxxxxxxxxxxxxx
The only time I've seen "file system structure on the disk is corrupt and unusable" is when a file system cratered or somehow dropped offline. Is the D: volume on the array you are testing? What was the timing of this event, did it coincide with anything?
Sounds like you'll need to call Dell or something to verify your tests. Perhaps the array's cache memory has failed? I'd start a new thread with any questions regarding the PERC controller and your array setup & tests; I'm not familiar with those. I gather these tests are done with the OS offline. If so, it may not be Trend after all...
--
Allan Williams
Scott Rymer wrote:
I came in on Saturday to do some more in-depth diagnostics on the
server. I ran complete memory tests and system tests with all
passing except for the Buffer Test on SCSI disc 8 which failed with
error: "Error Code: 4400:117f - Read/Write Buffer Test data
mismatch". I ran the test over again and it passed. I then ran it
for 20 iterations and it failed twice. I ran 20 iterations on the
other 4 drives. They too failed about 10-15% of the time so what can
I conclude from that? I didn't see any diagnostics that would let me
test the PERC controllers though.
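For what it's worth, a quick back-of-the-envelope check suggests why similar failure rates on every drive point at something shared (controller, cache, cabling) rather than five bad disks. This is only a sketch: the per-run failure rate for a healthy drive (`baseline`) is an assumed illustrative value, not a Dell figure, and the calculation treats the drives as independent.

```python
from math import comb

def prob_at_least(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return 1.0 - sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k))

runs = 20          # iterations per drive, from the post
min_failures = 2   # disk 8 failed twice; the others failed roughly 10-15%
drives = 5         # disk 8 plus the other 4 drives
baseline = 0.001   # ASSUMPTION: chance a healthy drive fails a single run

p_one = prob_at_least(min_failures, runs, baseline)
p_all = p_one ** drives  # assumes drives fail independently of each other

print(f"one drive: {p_one:.2e}, all {drives} drives: {p_all:.2e}")
```

If the chance of all five drives misbehaving independently comes out vanishingly small, a component they all share is the more plausible suspect.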
The other server (SQL) hasn't had any issues since the initial
reboots but this SBS server restarted about 8 times over the weekend.
I did see 1 error in the event logs this morning that I haven't seen
up to this point:
NTFS:: The file system structure on the disk is corrupt and unusable.
Please run the chkdsk utility on the volume D:.
Dell PowerEdge 2800
-Scott
"Scott Rymer" <tsrymer/at/hotmail/dot/com> wrote in message
news:B97B152A-E584-4CC1-8CF7-CF9665D58949@xxxxxxxxxxxxxxxx
Back from lunch... I booted to the Dell Utility partition to run
memory and system tests. All came back fine. I've disabled Trend
so I'll wait to see what that brings...
-Scott
"Al Williams" <donotreplydirect@xxxxxxxxxxxxxxxx> wrote in message
news:O8DCwdXLKHA.3708@xxxxxxxxxxxxxxxxxxxxxxx
Hit F8 on boot. Check your hardware - do a full RAM test (memtest86
boot CD), check fans, temperatures, etc.
---
That sounds like a hardware problem but if it is happening on two
different servers at once that's pretty suspicious (although check any
shared components like UPS/switches). As AV programs tie in at low
layers they could mimic HW issues so disabling Trend was probably a
good idea. Are you sure it was completely gone - sometimes they leave
low level drivers active.
Have you checked with Trend support?
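One way to test the shared-component idea is to line up the "unexpected shutdown" times from both servers and see whether any fall close together. A minimal sketch, assuming you have copied the timestamps out of each server's event log by hand; the function name, the 5-minute window, and the sample times below are all placeholders, not data from this thread.

```python
from datetime import datetime, timedelta

def coincident_events(times_a, times_b, window=timedelta(minutes=5)):
    """Return (a, b) pairs of timestamps that fall within `window` of each other."""
    pairs = []
    for a in times_a:
        for b in times_b:
            if abs(a - b) <= window:
                pairs.append((a, b))
    return pairs

# PLACEHOLDER timestamps for illustration only (not from the thread)
sbs_crashes = [
    datetime(2000, 1, 1, 9, 15),
    datetime(2000, 1, 1, 14, 2),
    datetime(2000, 1, 2, 11, 40),
]
sql_crashes = [
    datetime(2000, 1, 1, 14, 4),
    datetime(2000, 1, 2, 3, 30),
]

matches = coincident_events(sbs_crashes, sql_crashes)
for a, b in matches:
    print(f"SBS {a} ~ SQL {b}")
```

Crashes that cluster on both boxes would point toward power, UPS, or network gear; crashes that never overlap make a per-server cause more likely.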
--
Allan Williams
Scott Rymer wrote:
Further info... Windows SBS 2003 Standard w/ SP2 is dead in a
reboot cycle. What's freakin me out is there are no events leading
up to the reboot... so I have no idea how to fix it...
Is there a safe-mode I can get into and possibly some crash logs or
something?
-Scott
"Scott Rymer" <tsrymer/at/hotmail/dot/com> wrote in message
news:7A569C78-8273-49A4-BF10-933EEDE60CD4@xxxxxxxxxxxxxxxx
In the last 2 days, my SBS box (4 times) and my SQL box (2 times)
have just dropped off the face of the earth... or at least the
local network. When I try to log in via the console, they're all
locked up and I'm forced to do a hard shutdown and reboot.
Nothing in the event logs other than "The previous shutdown at
<Date><Time> was unexpected." Nothing around or before that in
any of the logs.
The only thing I can think of that has changed is upgrading to
Trend Micro WFBS 6.0 a couple of months ago. On the servers, I'm
using the new Smart Scan engine for real-time scanning. I've got
the standard excluded directories such as Exchange and SQL. I've
reluctantly disabled Trend on both servers this morning to see if
the lockup issue persists and plan on changing back to the
Conventional scan engine if the servers respond well. Is anyone
else having issues with the new Smart Scan?
Nope! As I speak... it just happened again. HELP!!! How can I
rectify this issue?
-Scott