Re: 2008 SBS no longer boots
- From: Freaky <wontsay@xxxxxxxxxx>
- Date: Tue, 07 Apr 2009 22:33:12 +0200
I don't really know what to tell you. I've rewritten this 5 times now, so
I'll try to keep it short.
We're very much aware of how things should be. Sometimes customers
don't have the money, other times we just got the customer with an
already crappy situation and have to make do. We can discuss how it
should have been all day long, but it's like telling me I should look
out for the bus after it has run me over (or well, actually my
customer). Then again, no backup is going to help with this issue. We can
very easily recover/reinstall the server and be done with it, but the
issue has already returned 3 times. (And btw, I'm not a fan of having just
an online backup either... I could, however, recover every crash with it
so far, including this one.)
Some of the information I got from my colleagues (this isn't my customer,
I just have to solve the mess) has been corrected. Last Friday the
updates were installed, and the server restarted fine then. Somewhere
during the weekend it decided to 'restart' on its own (read: it probably
crashed; I wasn't there, nor was anyone else :)), and after that it didn't
come up anymore.
Anyways, I just virtualized the thing with a bootable CD, so the customer
can work on some temporary hardware of ours. Reinstalling it just isn't an
option for me because the chances are high the issue will just return, and
we have other customers with the same hardware/setup (one of which has
already been hit by the same issue too). I consider it my obligation to
make sure this:
a) never happens again
b) I can recover really really fast from it when it happens (as in 5
minutes by fixing driver/registry/whatever) if I can't get a.
If you're willing/able to help with the issue at hand I can post what
I've done so far. If you only want to discuss how the world should be,
it is better for me (or well, my customer, I actually like discussion)
to go elsewhere. Please don't get me wrong, your gesture to help
*really* is valued, but in its current form we're both just wasting time.
I very much like that when I have some spare time, but currently I do
not :). I can't be the judge of your time.
Hope this doesn't sound harsh, it's not intended that way. I'm just low
on time, so I have to be careful with what I spend it on and I do find
it rude to not reply at all. After all, you did spend time on me (as
well as several others here). I'm not a native English speaker either (you
probably did notice :D), so I'm having issues getting it across the way
I'd like to (then again, I have that quite frequently even in my native
language, but it doesn't help either).
Anyways, thanks again :)
Btw, it is remarkable how many forums mirror these newsgroups, and even
more so how fast google indexes them. Searching the issue often returns
the posts here on various forums :D
Cliff Galiher wrote:
Inline:.
-Cliff
"Freaky" <wontsay@xxxxxxxxxx> wrote in message
news:ujttdOutJHA.1304@xxxxxxxxxxxxxxxxxxxxxxx
Yea we tried LKGC, didn't help. This really isn't a hardware issue, if
it were, the machine wouldn't run after reinstall, it does. Also, same
issue on two brand new machines at two completely different customers is
really odd. Also, if it were hardware, and it is this consistent through
every boot we try, it would do the same after reinstall (if you would
even be able to do so).
Hardware and hardware *related* are two different things. As I
indicated, I think it is a driver issue, which is hardware related.
Different H/W == different drivers. But I digress.
These 2 servers weren't set up by me, I always immediately install the
essential updates, but never use the driver update, especially on
servers. But, with 2008 you don't go to update.microsoft.com any more,
it pulls all the updates in itself. Apparently you can disable the
driver updates tho'.
Same with Vista. But the control panel *still* has the option to view
updates and install individually. Whether you set AU to download and
install or whether you are doing it, if you aren't reviewing the KB
articles associated with each update then you really can't complain that
you don't know which update broke your machine. Regardless of whether
you use update.microsoft.com (XP/SBS2003) or the update control panel
(Vista/Win2k8) or an update management package (WSUS/Shavlik/System
Center products), it is certainly best practice to review and document
*every* update being installed on a machine. Just seeing "13 updates"
in the control panel and clicking OK is not wise on a server. Might as
well enable auto-install if you are going to do the manual equivalent.
I realize that comes across harsh, and that is not my intent. But I am
trying to point out *why* people recommend not auto-installing updates
on servers. And if you are just installing updates by clicking "next"
then you are doing the same thing by proxy. You can't really blame that
on the change from the website to the control panel in win2k8....it is a
sign of a more systemic problem with how updates are being reviewed and
managed on the server and you should adjust accordingly. It is entirely
possible to not review updates displayed on the updates site in SBS2k3
as well, and that is equally as bad.
Restoring a backup is not a nice option. First off, we have online
backup; it would at least require a bare reinstallation, reattaching
the stores, etc.
Having *only* an online backup is....well...let me be blunt. Insane.
Online backups for off-site is certainly admirable. But it should only
be *one* aspect of your DR plan. Onsite backups are easier to work with,
far less error prone and (as you are discovering) less limiting.
The internet has freed us of many things, but most people would not
*only* keep backups in a lockbox in a bank....restoring becomes a
serious pain. Online backups should be viewed similarly. Do yourself a
favor, go buy a portable USB drive, and use the SBS wizards to set up an
image backup.
My experiences with restoring system state (AD etc)
aren't too great, and the last time my colleagues tried it when this
problem occurred, the updates had been automatically installed like 2
weeks before the reboot and hence the system state reinstated the problem
(which does reinforce the idea it's not hardware).
Two unrelated problems. First, Win2k8 backups are *vastly* different
from win2k3 (and by proxy SBS.) 2k8 is image based. Resolves a lot of
the issues that surrounded system-state being out of sync with the rest
of the OS. There was a reason you *had* to be at the same SP level (and
sometimes even the same hotfix level) with SBS2k3 before restoring
system-state. Full image backups resolve these. Use them. You'll
appreciate that nicety down the road.
As far as the system-state having the bad drivers/files/etc backed
up....my recommendation is generally to plan on rebooting on patch
Tuesday. It is a monthly reboot, you can do it off-hours, and you
verify your system-state is bootable after every change and every
month. If your backup rotation is retaining backups so they are
expiring before a problem is discovered then you have discovered a
serious flaw in your DR plan, not Windows.
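To make the retention point concrete, here is a minimal Python sketch (not from the thread) that checks whether any retained backup predates a bad change. The function name, the 7-backup retention and the dates are hypothetical; the 14-day gap mirrors the "2 weeks before the reboot" case mentioned above.

```python
from datetime import date, timedelta


def newest_clean_backup(backup_dates, bad_change_date):
    """Return the most recent backup taken before the bad change, or None."""
    clean = [d for d in backup_dates if d < bad_change_date]
    return max(clean) if clean else None


# Hypothetical numbers: daily backups with 7 kept, problem discovered on
# 7 April 2009, bad update installed 14 days before discovery.
discovered = date(2009, 4, 7)
bad_change = discovered - timedelta(days=14)
retained = [discovered - timedelta(days=n) for n in range(1, 8)]

# Every retained backup is newer than the bad change, so nothing clean is left.
print(newest_clean_backup(retained, bad_change))
```

With a retention window shorter than the time it takes to notice a problem, every backup still on hand already contains the problem.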
Only very seldom have issues
with the essential updates, and when we did it was always solved in 2
minutes.
There have been plenty of documented issues where an update seriously
broke a system component. Exchange, sharepoint, even AD. I've seen it
all. Just because something *hasn't* happened to you doesn't mean it
can't. The whole point of "Disaster Recovery" is the *disaster* part of
the equation. Nobody *wants* a disaster, but you have to plan for
it....and not everything is a 2-minute fix. If you are properly
planning to recover from a worst case scenario, then recovering from
something less than worst case is infinitely easier. But if you only
plan on recovering from "small bumps" and not "disasters" then when
disaster strikes you'll find your DR was woefully inadequate. Take the
problems you are seeing now as *SERIOUS* warning signs. Time to adjust.
Also, rolling back hotfixes from bootable CD or whatever is
well documented and pretty easy on 2003.
Well documented? Yes. Easy? Absolutely not. If you don't have
backups, or if your backup retention has pushed an old system-state out
of rotation, and you find a hotfix broke...say...exchange's IMF
filtering, but you didn't realize it right away because of the nature of
the problem....you are just as screwed. You can't roll back easily, you
can't re-install without downtime.....SBS2k3 is no different than 2k8 in
this regard. Backups are backups. Retention is retention, and patches
can break.
Since the SBS servers we
maintain (nearly all of our customers have em) all have several ports
open to the internet, not installing essential updates or delaying it
too long is often not an option.
I certainly don't recommend delaying. I recommend reviewing and
documenting. Those are two *VERY* different tasks. ANYBODY can click
"next", "next", "next" and install updates...whether it is AU or a
website driven update system. IT gets paid to review, read the KB
articles, and make informed decisions about what is "essential" and what
isn't. There is no other way to spin it.
Anyways, thanks for the re' :) If you have some insights on rolling back
hotfixes on SBS 2008 plz don't hesitate to share :D.
Hotfixes? Same as 2k3. Some hotfixes can be rolled back. Others
cannot. Depends on the hotfix. But this is a driver update, most
likely. The MS Update site had a drivers section in Win2k3 as well. I
regularly saw (and ignored) Intel NIC drivers on the site. The WHQL
drivers that MS kept wanting to push on my SBS 2k3 boxes absolutely
killed performance, and the drivers from Intel's website always worked
far better...but AU didn't see it that way. I just had to decline the
update. Win2k8 moved from a website to a control panel and it is easier
to catch yourself just approving updates...but the methodology hasn't
really changed. Backup->review->document
change->install->reboot->test->backup. You get in that habit and you'll
be far happier.
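The backup->review->document change->install->reboot->test->backup cycle can be sketched as an ordered checklist that stops at the first failing step. This is only an illustration in Python; `run_cycle` and the stand-in actions are hypothetical and do not call any real backup or update tool.

```python
STEPS = ["backup", "review", "document change", "install",
         "reboot", "test", "backup"]


def run_cycle(actions):
    """Run each step in order; stop at the first step that reports failure.

    `actions` maps a step name to a callable returning True on success.
    Returns the list of steps that completed.
    """
    done = []
    for step in STEPS:
        if not actions[step]():
            break
        done.append(step)
    return done


# Hypothetical stand-ins: every step succeeds except the post-reboot test.
actions = {step: (lambda: True) for step in STEPS}
actions["test"] = lambda: False

completed = run_cycle(actions)
print(completed)
```

Stopping before the final backup matters: if the post-reboot test fails, the last known-good backup is not followed by a backup of the broken state.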
-Cliff
Regards
Cliff Galiher wrote:
First, take a deep breath. A BSOD during .sys loading is almost
*always* a hardware related issue. It is most commonly caused by a HAL
mismatch, but kernel-level drivers can do it too. So that is the first
thing to keep in mind.
With that, I'd say MS was on the right track. It sounds like you have a
driver problem, since the issue re-occurs even after a re-install. Have you
tried booting to "last known good configuration?" That will at least
allow you to roll back the driver. If the driver is a required driver
for boot...well...sometimes LKGC doesn't keep the old versions, so you
may not be able to do this. That will likely be a re-install (yes, a
third time.)
Finally though, and this is important, I strongly recommend using some
sort of patch management software for managing clients *AND* servers.
Do *not* use automatic updates!!!! WSUS is a good candidate since it
ships with SBS. And don't let WSUS auto-approve. This gives you the
ability to see what updates you are installing. I tend to stagger
updates if many post en-masse for testing. And when you find a problem
child, you simply restore from backup (you keep backups, right?) instead
of having to re-install, and you decide if the update is required or
not. If it is a security patch, for example, then you contact MS (for
free since it is an update) and find out why it kills your hardware.
And if it's a driver....you contact the vendor. No muss, no fuss.
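Staggering updates can be as simple as splitting the pending list into small batches and testing between them, so a bad update is narrowed down to one batch. A minimal Python sketch, assuming placeholder update names and a hypothetical `stagger` helper (this is not a WSUS API):

```python
def stagger(updates, batch_size):
    """Split a list of updates into consecutive batches of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [updates[i:i + batch_size]
            for i in range(0, len(updates), batch_size)]


# Placeholder names for 13 pending updates; real ones would be KB numbers.
pending = [f"update-{n:02d}" for n in range(1, 14)]

for number, batch in enumerate(stagger(pending, 5), start=1):
    # Install one batch, reboot and test before moving on to the next.
    print(number, batch)
```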
But not making backups so you have to re-install, and not actually
tracking which updates you are installing on a server....those are
recipes for frustration. What you've already experienced is just the
beginning.
-Cliff
"Freaky" <wontsay@xxxxxxxxxx> wrote in message
news:ewovqjttJHA.5172@xxxxxxxxxxxxxxxxxxxxxxx
Hi there,
this is the third time we're experiencing this issue, and we are about
done with it. The previous 2 times the servers were reinstalled. This is
the 2nd time on this box. The other time was on another box, same type
though, HP ProLiant ML350G5.
The server continuously reboots during the boot process. If we try to
boot into safe mode it reboots too, right after crcdisk.sys. If we
disable automatic reboot we get a BSOD stating "A problem has been
detected and Windows has been shut down to prevent damage to your
computer." Bla bla bla.
STOP: 0x0000007b (0xfffffa60005af9d0, 0xffffffffc0000034,
0x0000000000000000, 0x0000000000000000)
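For anyone decoding the STOP line above, here is a small Python sketch (not part of the original post) that splits it into the bug check code and its four parameters. The lookup tables hold only the two values relevant here: 0x7B is INACCESSIBLE_BOOT_DEVICE, and 0xC0000034 is the NTSTATUS value STATUS_OBJECT_NAME_NOT_FOUND. Reading the second parameter as an NTSTATUS is an assumption, and the function names are hypothetical.

```python
import re

# Only the values that appear in this thread; not a complete table.
BUGCHECK_NAMES = {0x7B: "INACCESSIBLE_BOOT_DEVICE"}
NTSTATUS_NAMES = {0xC0000034: "STATUS_OBJECT_NAME_NOT_FOUND"}


def parse_stop_line(text):
    """Return (bugcheck_code, [params]) from a 'STOP: 0x.. (0x.., ...)' line."""
    values = [int(v, 16) for v in re.findall(r"0x[0-9a-fA-F]+", text)]
    if not values:
        raise ValueError("no hexadecimal values found")
    return values[0], values[1:]


def describe(text):
    """Return human-readable lines for the bug check code and parameter 2."""
    code, params = parse_stop_line(text)
    name = BUGCHECK_NAMES.get(code, "unknown bug check")
    lines = [f"STOP 0x{code:08X}: {name}"]
    if len(params) >= 2:
        # Assumption: parameter 2 carries an NTSTATUS in its low 32 bits.
        status = params[1] & 0xFFFFFFFF
        label = NTSTATUS_NAMES.get(status, "not in table")
        lines.append(f"param 2 as NTSTATUS 0x{status:08X}: {label}")
    return lines


stop = ("STOP: 0x0000007b (0xfffffa60005af9d0, 0xffffffffc0000034, "
        "0x0000000000000000, 0x0000000000000000)")
for line in describe(stop):
    print(line)
```

Read this way, the stop code says Windows could not reach its boot device, which fits a storage driver problem at least as well as failed hardware.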
According to HP this problem is unknown, we should contact Microsoft.
Called Microsoft for a support case. The engineer said he encountered
this more often, always with the same HP hardware. At first he tried to
have me use recovery console (well command prompt, there no longer is a
recovery console) with tools like listsvc, fdisk /mbr, etc. All these
tools no longer exist...
Then he had me look through the BIOS to change the SATA mode of the RAID
controller to AHCI. This is a SmartArray E200i hardware RAID controller
with SAS disks. I don't really think it has AHCI... nor was I able to
find it (not in its BIOS nor in ACU on SmartStart). Oddly enough I can't
find it for the single SATA port on it either. Inside the HP case
there's a description of connectors; it's not called SATA port, it's
explicitly called 'SATA Optical Connector'. It has the CD drive attached
to it. I wasn't able to find any options to completely disable the IDE
and SATA controllers either. I'd rather miss the CD than have an entire
company down.
Next we tried ERD Commander, which sees our 2008 SBS SP1 install but
doesn't recognize it (perhaps because it's Dutch; it should work on 2008
SBS according to the MS engineer), and thus doesn't want to run its
tools against it (like regedit, fixing the MBR (which should help
according to the helpdesk, which I seriously doubt), hotfix rollback,
etc.).
He did mention it has to do with a driver update HP released to
MicrosoftUpdate. Apparently 2008 (and Vista) install all driver updates
also (which is way beyond me, who decided on that and where can I find
him/her? :D). HP support is currently closed, so haven't got their
feedback yet.
In all 3 cases the server was running for quite a while before the issue
occurred. After reinstall we also installed all updates and restarted
several times. Yet it comes back every time. We don't do anything
special, we even got the latest drivers from HP site for every
reinstall, no weird driver from driverguide.com or anything like that.
No weird software on it, etc.
Definitely don't want to reinstall again, if only because the issue
apparently can come back at any time. Really losing all trust
whatsoever in both companies. Unfortunately, ditching MS will be hard,
HP on the other hand...
Hope anyone has some useful information. According to some post I found
you can disable services manually now (thus exporting the registry,
reading that piece of !#%$@!#$, and making your own reg hacks to
disable. Yes, I can see how this is a step forward, I take it windows 7
will require us to use a hexeditor on the registry file, which will
really make me wonder how they're gonna improve it after that :D).
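For reference, the "reg hack" route comes down to changing a service's Start value (0 boot, 1 system, 2 automatic, 3 manual, 4 disabled). The Python sketch below only generates the text of a .reg file and does not touch any registry. It assumes the offline SYSTEM hive has been loaded under a key named OfflineSystem and that ControlSet001 is the control set in use; both names, and the service name, are placeholders. Disabling a driver the system needs in order to boot can itself cause a 0x7B stop, so treat this as an illustration only.

```python
START_TYPES = {"boot": 0, "system": 1, "automatic": 2,
               "manual": 3, "disabled": 4}


def reg_fragment(service, start_type,
                 hive_root="HKEY_LOCAL_MACHINE\\OfflineSystem",
                 control_set="ControlSet001"):
    """Build .reg file text that sets a service's Start value."""
    value = START_TYPES[start_type]
    key = "\\".join([hive_root, control_set, "Services", service])
    return ("Windows Registry Editor Version 5.00\n\n"
            f"[{key}]\n"
            f'"Start"=dword:{value:08x}\n')


# "ExampleDriver" is a placeholder, not a driver named in this thread.
print(reg_fragment("ExampleDriver", "disabled"))
```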
TIA