Re: Bad Clusters



Good, insightful posting - explained very well.

"cquirke (MVP Windows shell/user)" <cquirkenews@xxxxxxxxxxxxxxx> wrote in
message news:3jduf1t7fbsggllk1tstrp6a875scbie55@xxxxxxxxxx
> On Fri, 12 Aug 2005 04:58:03 -0700, "Robert Reader"
>
>>I recently ran a chkdsk with the repair option on an external hard drive,
>>and it reported the following on one of the files:
>
>>Windows replaced bad clusters in file 99407
>>of name \Archive\SOFTWA~1\TopoMaps\CONSOL~1\TopoMaps.zip.
>
> That means the hard drive is dying. Evacuate data and replace it.
>
>>I have several questions:
>
>>1) Does this mean the file it refers to is now corrupt? The file was
>>still present on the disk after chkdsk finished. If bad clusters were
>>found, I assume the data in the clusters is now trash.
>
> Yep, most likely it will be.
>
> It's important not to confuse bad clusters with lost clusters.
>
> Lost clusters are chained out of the free space, but have no directory
> entry that defines them as a file or subdirectory. This is a file
> system logic error that follows bad exits in particular.
>
> Bad clusters contain physically bad sectors, and indicate the hard
> drive is failing at the hardware level. Listen up!
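
To make the lost-versus-bad distinction concrete, here is a minimal sketch using a toy FAT-style allocation table. The table layout, the marker values and the function name are all made up for illustration; no real file system stores things exactly this way.

```python
# Toy allocation table: each entry holds the next cluster in a chain,
# or one of these markers. All names and values here are illustrative.
FREE, EOC, BAD = "free", "eoc", "bad"


def classify_clusters(fat, directory_starts):
    """Return (lost, bad) sets of cluster numbers for a toy table.

    fat: dict mapping cluster number -> next cluster number, FREE, EOC or BAD
    directory_starts: first cluster of every file or subdirectory that
                      still has a directory entry
    """
    referenced = set()
    for start in directory_starts:
        cluster = start
        # Follow the chain until a marker is reached (or a loop is detected).
        while isinstance(cluster, int) and cluster not in referenced:
            referenced.add(cluster)
            cluster = fat.get(cluster, EOC)

    # Bad clusters: explicitly marked unusable because of physical defects.
    bad = {c for c, nxt in fat.items() if nxt == BAD}
    # Lost clusters: allocated, but no directory entry leads to them.
    in_use = {c for c, nxt in fat.items() if nxt not in (FREE, BAD)}
    lost = in_use - referenced
    return lost, bad


# Clusters 2-3 belong to a file; 4-5 form an orphaned chain; 6 is marked bad.
fat = {2: 3, 3: EOC, 4: 5, 5: EOC, 6: BAD, 7: FREE}
lost, bad = classify_clusters(fat, directory_starts=[2])
print("lost:", sorted(lost), "bad:", sorted(bad))
```

The lost chain is purely a bookkeeping problem and the space can be reclaimed; the bad mark points at hardware that can no longer be trusted.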
>
>
> For a cluster to show up as bad, it has to escape the hard drive's
> built-in defect "management". When the hard drive's firmware detects
> excessive retries are required to read a sector without error, it will
> try to copy the contents to another "spare" sector. If it succeeds,
> then the "spare" is assigned the address of the sick sector, and the
> sick sector is never used again. All of this happens below the
> awareness of the rest of the PC; the OS has no clue, and nothing shows
> up in ChkDsk logs, etc. It's successfully swept under the carpet.
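
The remapping described above can be sketched as a toy model. This is not how any particular drive's firmware is written; the class, the retry threshold and the counter name are assumptions chosen only to show why the OS never notices the swap.

```python
class ToyDrive:
    """Toy model of firmware-level sector remapping (not real firmware)."""

    def __init__(self, sectors, spares, retry_limit=3):
        self.data = {}              # physical sector -> contents
        self.remap = {}             # logical address -> spare physical sector
        # Spare sectors live beyond the addressable range.
        self.spares = list(range(sectors, sectors + spares))
        self.retries_needed = {}    # physical sector -> retries for a clean read
        self.retry_limit = retry_limit   # hypothetical threshold
        self.reallocated = 0        # internal record; only S.M.A.R.T. could expose it

    def write(self, lba, value):
        self.data[self.remap.get(lba, lba)] = value

    def read(self, lba):
        phys = self.remap.get(lba, lba)
        value = self.data.get(phys)
        # A sick sector needs too many retries: move its contents to a spare.
        if self.retries_needed.get(phys, 0) > self.retry_limit and self.spares:
            spare = self.spares.pop(0)
            self.data[spare] = value    # copy the contents across
            self.remap[lba] = spare     # the spare takes over the old address
            self.reallocated += 1
        return value                    # the caller sees nothing unusual


drive = ToyDrive(sectors=100, spares=4)
drive.write(42, b"payload")
drive.retries_needed[42] = 5            # sector 42 starts to struggle
print(drive.read(42), drive.remap, drive.reallocated)
```

From the caller's side both reads of address 42 look identical; only the internal `reallocated` counter records that anything happened.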
>
> There's only one window into this process, and that is S.M.A.R.T.,
> which can read the hard drive's internal record-keeping, and thus
> hopefully see to what extent the HD has been covering up defects.
>
> Windows has zero built-in S.M.A.R.T. awareness, even though this
> facility has been around since 2G drives of the Win9x era.
>
> BIOS can report S.M.A.R.T. status as part of the startup process, but
> by duhfault, this reporting is disabled.
>
> Hard drive vendors' diagnostics typically just give you an "OK" or
> "fail" summary, with no detail at all. An alarming amount of
> impending carnage is rubber-stamped as "OK".
>
> When you finally find a 3rd-party tool that reports raw S.M.A.R.T.
> detail, you find it's pretty hard to understand; the values just don't
> make sense, unless you know how to interpret them. Then you find that
> even though the summary says "OK", there have been x sectors that had
> to be "fixed" on the fly, etc. Not nearly so "OK" after all.
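
As a rough illustration of reading past the "OK" summary, the sketch below picks out the raw counters that usually relate to remapped or unreadable sectors. The attribute IDs are the commonly documented ones, but vendors are free to differ, so treat the table as an assumption. The sample readings are invented, and the code does not talk to a real drive.

```python
# Commonly documented attribute IDs; vendor implementations may differ.
WATCHED = {
    5: "Reallocated Sectors Count",
    196: "Reallocation Event Count",
    197: "Current Pending Sector Count",
    198: "Offline Uncorrectable Sector Count",
}


def hidden_defects(raw_values):
    """Return {attribute name: raw count} for watched attributes above zero.

    raw_values: dict mapping attribute ID -> raw value, as copied from
                whatever tool displays the raw S.M.A.R.T. table
    """
    return {
        name: raw_values[attr_id]
        for attr_id, name in WATCHED.items()
        if raw_values.get(attr_id, 0) > 0
    }


# Invented readings for a drive whose summary status still says "OK".
sample = {5: 30, 9: 12000, 197: 2, 198: 0}
for name, count in hidden_defects(sample).items():
    print(f"{name}: {count}")
```

Any non-zero count here is the "x sectors fixed on the fly" that the summary glosses over.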
>
>
> So by the time a cluster shows as "bad", it's had to be so bad that
> the hard drive's attempts to paper over it and deny there's a problem,
> have failed. Even at this stage, the problem can be covered up - this
> time by the OS code that operates the "better" NTFS file system.
>
> This code will do exactly the same thing that the hard drive firmware
> tried to do; read the data out of the failing allocation unit (cluster
> rather than sector, this time) and copy it somewhere else, marking the
> original cluster as "bad". This time there would be visible signs
> within the OS's record-keeping of clusters - if you could ever get a
> clear view of that information, that is.
>
> Once a cluster's marked out of use as "bad", further ChkDsk /R or
> AutoChk tests will not test it again. So an elective test may report
> "no (new) bad clusters" even when there have been 20 bad clusters
> successfully "fixed" by NTFS's on-the-fly fiddling, and another 30 bad
> sectors successfully "fixed" by the HD's internal defect management.
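
A small sketch of why a later scan can come back clean, using the 20 and 30 from the example above. The function and variable names are hypothetical, and the model ignores everything about a real surface scan except the "skip what is already marked" behaviour.

```python
def surface_scan(unreadable, marked_bad):
    """Toy surface scan: report only clusters not already marked bad.

    unreadable: set of clusters that would fail a read test
    marked_bad: set of clusters the file system has already retired;
                updated in place with anything newly found
    """
    new_bad = {c for c in unreadable if c not in marked_bad}
    marked_bad |= new_bad
    return new_bad


marked_bad = set(range(1000, 1020))   # 20 clusters already retired on the fly
unreadable = set(marked_bad)          # the same clusters are still physically bad
firmware_reallocated = 30             # sectors the drive remapped internally

new_bad = surface_scan(unreadable, marked_bad)
print("new bad clusters reported:", len(new_bad))
print("defects actually accumulated:", len(marked_bad) + firmware_reallocated)
```

The scan reports zero new bad clusters while the drive has in fact accumulated 50 defects across the two layers.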
>
>
> I think you can see what all this means - that despite any claims to
> the contrary, the game is rigged to hide impending HD failure from
> you, hopefully until the HD's warranty period has expired. The large
> print may claim your vendors really care about your data, but the
> small print confirms they just want to duck support calls.
>
>>2) Are bad clusters the same as bad sectors, ie, if a bad cluster exists
>>it means it contains one or more bad sectors?
>
> A sector is a hardware-level unit of storage, typically containing 512
> bytes. A cluster is a file system level unit of data storage,
> containing a power of 2 sectors - typically 8, for 4k clusters.
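
The sector/cluster arithmetic can be worked through in a few lines. This is a simplified sketch: `first_data_sector` is a hypothetical offset, and real file systems add their own reserved areas and cluster numbering rules on top.

```python
SECTOR_BYTES = 512


def sectors_per_cluster(cluster_bytes, sector_bytes=SECTOR_BYTES):
    """Return how many sectors make up one cluster (must be a power of 2)."""
    n, rem = divmod(cluster_bytes, sector_bytes)
    if rem or n < 1 or n & (n - 1):
        raise ValueError("cluster size must be a power-of-2 multiple of the sector size")
    return n


def cluster_sectors(cluster, cluster_bytes=4096, first_data_sector=0):
    """Return the range of sector numbers covered by one cluster."""
    n = sectors_per_cluster(cluster_bytes)
    start = first_data_sector + cluster * n
    return range(start, start + n)


print(sectors_per_cluster(4096))      # 4k cluster / 512-byte sectors
print(list(cluster_sectors(3)))       # sectors that make up cluster 3
```

One unreadable sector anywhere in that range is enough for the whole 4k cluster to be marked bad.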
>
> Yes, a newly-discovered bad cluster means one or more bad sectors,
> unless something has faked the marking process for some reason.
> Viruses used to do that long ago; I don't think today's OSs lend
> themselves to that particular way of hiding malicious code anymore.
>
> There's one circumstance in which existing bad clusters do not mean a
> failing hard drive, and that is where the contents of a failing hard
> drive are imaged (copied exactly) to a good replacement hard drive.
> The raw imaging process will preserve the existing bad cluster marks,
> even though they no longer refer to actual bad clusters.
>
>>3) Further down in the same chkdsk report, it reported "0 KB in bad
>>sectors". Why does it report this when it had just found some bad
>>clusters? Does it report this because it had "fixed" them, ie, replaced
>>them with spares, and bad sectors no longer exist?
>
> Good question. As one who does data recovery, and who has seen too
> many "too late" dead drives that ate data which might have been saved
> if an earlier alarm had been raised, I'm not inclined to trust vendor
> reporting, especially ChkDsk. "Everything's fine" may be a lie, but
> "hey, something might possibly be wrong" is certain to mean trouble.
>
>
>-- Risk Management is the clue that asks:
> "Why do I keep open buckets of petrol next to all the
> ashtrays in the lounge, when I don't even have a car?"
>----------------------- ------ ---- --- -- - - - -

