Re: SP2 Installation Problem

From: cquirke (MVP Win9x) (cquirkenews_at_nospam.mvps.org)
Date: 08/22/04


Date: Sun, 22 Aug 2004 13:24:14 +0200

On Sat, 21 Aug 2004 23:15:34 -0400, Barry Watzman

>Chkdsk/r locates bad sectors -- does a scan of the hard drive.
>Problem is, that while with classic drives this might have done
>something, with modern IDE drives, Windows will never see a bad sector
>until the drive is failing catastrophically.

Hm. The truth of that depends on how early you define "failing
catastrophically". I agree that just about everyone, from HD vendor
to MS, is trying to sweep problems under the rug - if you don't make
a concerted effort to spot and dodge problems, no-one else will.

Sounds paranoid? Read on...

>With modern IDE drives, what windows sees, until the drive is
>catastrophically dying, is a "logical" or "virtual" drive that is always
>perfect. If there are problems with a sector, the drive handles them
>internally through "defect reallocation" -- the "bad" sector is so
>marked bad (usually when it's still readable with retrys) and never used
>again, and the data is moved to a spare good sector, in a different
>place on the drive, all transparent to Windows.

This is true, and a process like this can happen at three layers (1 to
3 below; items 4 to 6 are related ways to see or trigger it):

1) HD's firmware defect management

This is what you refer to, and indeed you won't see any trace of these
"repairs" in Windows. The physical sector readdressing is done within
the HD, so nothing changes in the file system at all - which is why
this management is compatible with other OSs.
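
A toy model may make the "invisible to the OS" point concrete. This is
only a sketch: the class name, the spare pool and the "weak" set are
made up for illustration, and real firmware is far more involved.

```python
class ToyDrive:
    """Toy model of firmware defect reallocation (not real firmware logic)."""

    def __init__(self, sectors, spares):
        # Physical sectors 0..sectors-1 hold user data; the rest are spares.
        self.data = {p: b"" for p in range(sectors + spares)}
        self.remap = {}                      # logical sector -> spare sector
        self.spare_pool = list(range(sectors, sectors + spares))
        self.weak = set()                    # physical sectors needing retries

    def _physical(self, logical):
        return self.remap.get(logical, logical)

    def write(self, logical, payload):
        self.data[self._physical(logical)] = payload

    def read(self, logical):
        phys = self._physical(logical)
        payload = self.data[phys]
        if phys in self.weak and self.spare_pool:
            # Still readable (with retries), so copy the data to a spare
            # and stop using the weak sector. The caller never sees this.
            spare = self.spare_pool.pop(0)
            self.data[spare] = payload
            self.remap[logical] = spare
        return payload


drive = ToyDrive(sectors=8, spares=2)
drive.write(3, b"hello")
drive.weak.add(3)             # pretend sector 3 has started to fail
print(drive.read(3))          # same data as far as the OS can tell
print(drive.remap)            # the remap lives only inside the "drive"
```

The OS-side view (logical sector 3 returns the same bytes) is unchanged;
only the drive's private remap table records that anything happened.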

2) NTFS on the fly repair

FATxx is free of this, but NTFS will basically do the same thing "on
the fly" that the HD's firmware is trying to do. Is this useful, or
likely to screw up in a "too many cooks" scenario?

These changes may or may not be visible; knowing how autocratic ChkDsk
and AutoChk are (no user control, poor reporting buried in the swamps
of Event Viewer under "Winlogon", etc.), my hopes are low.

3)  Explicit surface tests

ChkDsk /R, Win9x GUI Scandisk "thorough" and Win9x DOS mode Scandisk
"surface scan" all do the same thing - they test-read every sector in
the volume and, if it cannot be read within a certain number of
retries, they attempt to relocate the data at the cluster level.

Unlike (1), this is (at least for FATxx) visible, because you would
see the number of "bad clusters" reported and you'd see the B(ad)
blocks in the map if you did Scandisk surface from DOS mode.
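
The test-read-with-retries loop can be sketched as below. This is an
illustration only, not how ChkDsk or Scandisk are actually written:
read_cluster, free_clusters and max_attempts are hypothetical names,
and the "relocation" here just picks a replacement cluster number.

```python
def surface_scan(read_cluster, cluster_count, free_clusters, max_attempts=3):
    """Test-read every cluster; flag and relocate those that never read.

    read_cluster(n) is a hypothetical callable that returns bytes or
    raises OSError. Returns (bad, relocated).
    """
    bad = []
    relocated = {}
    free = list(free_clusters)
    for n in range(cluster_count):
        for _ in range(max_attempts):
            try:
                read_cluster(n)
                break                 # readable, move on to the next cluster
            except OSError:
                continue              # retry the same cluster
        else:
            # Every attempt failed: mark the cluster bad and, if a free
            # cluster is left, assign it as the replacement.
            bad.append(n)
            if free:
                relocated[n] = free.pop(0)
    return bad, relocated


def make_reader(always_bad, flaky):
    """Fake disk: some clusters never read, some fail a few times first."""
    failures = dict(flaky)            # cluster -> failures before success
    def read_cluster(n):
        if n in always_bad:
            raise OSError("unreadable")
        if failures.get(n, 0) > 0:
            failures[n] -= 1
            raise OSError("needs retry")
        return b"\x00"
    return read_cluster


reader = make_reader(always_bad={5}, flaky={2: 2})
bad, relocated = surface_scan(reader, 8, free_clusters=[100, 101])
print(bad, relocated)
```

Note that cluster 2, which reads on the third attempt, is not flagged
at all - a marginal sector can pass this kind of test.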

4) S.M.A.R.T.

This is not an automatic defect management, but a "window" into what
the HD firmware's self-monitoring and defect management have been up
to. BIOS can usually report S.M.A.R.T. status on startup, but the
usual default is NOT to do so.

In addition, you can add 3rd-partyware to report S.M.A.R.T. status
(Windows *still* has no native ability to do this), and several HD
vendor diagnostics merely query S.M.A.R.T. status without checking the
HD disk surfaces at all when the "standard" or "quick" test is done.
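
As a sketch of what a more useful S.M.A.R.T. view might show than a
bare "OK": the function below turns raw attribute counts into warnings.
The attribute IDs (5, 197, 198) are the ones commonly reported for
reallocated, pending and uncorrectable sectors, but attributes are
vendor-defined, so treat the table and the "anything above zero" rule
as assumptions. How the raw values are read from the HD is not covered.

```python
# Assumed attribute IDs; vendors may differ.
WATCHED = {
    5: "reallocated sectors",
    197: "pending sectors",
    198: "offline uncorrectable sectors",
}


def smart_detail(raw_values):
    """Return warnings for watched attributes with a non-zero raw count.

    raw_values maps attribute ID -> raw count, as read by some
    third-party tool (hypothetical input format).
    """
    return [
        f"{name}: {raw_values[attr_id]}"
        for attr_id, name in WATCHED.items()
        if raw_values.get(attr_id, 0) > 0
    ]


print(smart_detail({5: 12, 197: 0, 198: 0}))
```

An HD can have a non-zero reallocated count here and still be reported
as "OK" overall, which is the gap between the status and the stats.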

5) 3rd-party diagnostics

These include HD vendors' downloadable tools (usually free), parts of
some test suites, and stand-alone utilities such as SpinRite. Some
may be able to disable the HD firmware's defect management, and thus
get a straight answer without this vendor's proxy trying to cover up.

The advantage of these is that they should be OS-agnostic, and should
not try to "fix" anything. Making the HD look "OK" until the
warranty's over is the HD vendor's interest; yours is to detect
failure early and get your data off safely.

6) Windows bad disk flag

Windows maintains two state-of-filesystem flags, and checks these on
startup. If either flag is set, an automatic check of the file system
is done, with surface check if indicated.

The first flag is set when file updates are in progress, and then
cleared. If this is (left) set at boot time, a check of the file
system logic is done. This is the usual "bad exit" situation.

The second flag is set (and left set) if an attempt to access physical
disk fails. In response to this, a surface check of all volumes on
the affected HD is done. This may happen irrespective of whether the
last Windows session was properly shut down or not.
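
The two-flag logic, as I read it from the description above, boils
down to the small decision function below. The names are hypothetical
and this is not the actual Windows startup code.

```python
def startup_checks(dirty_flag, disk_error_flag):
    """Decide which automatic checks run at boot.

    dirty_flag:      left set if file updates were still in progress
    disk_error_flag: set (and left set) after a failed physical access
    """
    checks = []
    if dirty_flag or disk_error_flag:
        checks.append("file system logic")   # either flag triggers this
    if disk_error_flag:
        checks.append("surface")             # only the second flag adds this
    return checks


print(startup_checks(dirty_flag=True, disk_error_flag=False))   # bad exit
print(startup_checks(dirty_flag=False, disk_error_flag=True))   # clean exit, bad disk
```

The second call is the case that surprises people: a surface check
after what looked like a perfectly normal shutdown.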

>Windows will virtually never see any problems at all until the
>drive is close to catastrophic failure, and usually LONG after
>S.M.A.R.T. has been reporting errors and impending drive failure.

False, in my experience - only in a few cases, even with BIOS
S.M.A.R.T. enabled (SOP on PCs I build), has S.M.A.R.T. reported a HD
as bad, before testing has shown latency or frank errors.

As mentioned, when HD firmware "fixes" a bad sector by remapping it,
no trace of this (outside possible details from S.M.A.R.T. - but most
views just say "OK" with no stats) is visible. But the process of
attempting this fix can be very visible indeed, due to latency - it
can take several seconds to "successfully" read a failing sector, due
to nested retries and attempts to "fix" this on the fly.

Suspect this if the PC slows down radically, with the HD activity LED
hard on, and either no HD noise (same-track retries) or cyclical
clacking (seek retries). This is how a failing HD usually presents.

The best way to "see" this latency is to do a DOS mode Scandisk
surface scan (which NTFS victims can't do, of course). This process
maintains a fine-grained cluster counter, and runs no other processes
underfoot that could add latency.

So you can *see* pauses in that counter, and know it's the HD (or
other hardware delays e.g. CPU thermal protection or interrupt floods)
and not other software that is the cause. You can't do this with
Win9x GUI's "thorough" test or ChkDsk /R.
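
The idea of watching for pauses can be sketched as a timing loop. This
is not what Scandisk does internally; read_cluster, threshold_s and the
injectable clock are assumptions for illustration. Run under a
multitasking OS, other software can add latency too, so the numbers
only mean much on an otherwise idle system.

```python
import time


def find_stalls(read_cluster, cluster_count, threshold_s=0.5,
                clock=time.perf_counter):
    """Time each cluster read and report the ones that stall.

    read_cluster(n) is a hypothetical callable; clock can be swapped
    for a fake one when testing. Returns [(cluster, seconds), ...].
    """
    stalls = []
    for n in range(cluster_count):
        start = clock()
        try:
            read_cluster(n)
        except OSError:
            pass          # hard failures are a separate matter; keep timing
        elapsed = clock() - start
        if elapsed >= threshold_s:
            stalls.append((n, elapsed))
    return stalls


# Demo with a fake clock: cluster 1 "takes" 2.5 seconds to read.
ticks = iter([0.0, 0.01, 1.0, 3.5, 4.0, 4.02])
print(find_stalls(lambda n: None, 3, clock=lambda: next(ticks)))
```

A read that "succeeds" after several seconds shows up in the stall
list even though no error was ever returned.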

>-------------------- ----- ---- --- -- - - - -
    Trsut me, I won't make a mistake!
>-------------------- ----- ---- --- -- - - - -


