Re: Bad Clusters
- From: "Robert Reader" <RobertReader@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Sun, 14 Aug 2005 10:09:01 -0700
Thanks for the detailed answer. I have two external Maxtor OneTouch hard
drives I use for backup. I bought them 17 months ago and I ran chkdsk
just to make sure they were holding out OK. BOTH drives issued a "bad
clusters" notice on one (not the same) file. I hope this doesn't mean they
have both decided to self-destruct simultaneously.
I sure wish someone would create a utility that would give you a clear
picture of what's going on with the drive - whether it's starting to cascade
into failure or whether a bad sector was just a solitary, lone occurrence.
Like you said, I don't know if dozens of other sectors have recently been
marked out as bad, which would suggest the catastrophic scenario.
Thanks again,
Robert Reader
"cquirke (MVP Windows shell/user)" wrote:
> On Fri, 12 Aug 2005 04:58:03 -0700, "Robert Reader"
>
> >I recently ran a chkdsk with the repair option on an external hard drive, and
> >it reported the following on one of the files:
>
> >Windows replaced bad clusters in file 99407
> >of name \Archive\SOFTWA~1\TopoMaps\CONSOL~1\TopoMaps.zip.
>
> That means the hard drive is dying. Evacuate data and replace it.
>
> >I have several questions:
>
> >1) Does this mean the file it refers to is now corrupt? The file was still
> >present on the disk after chkdsk finished. If bad clusters were found, I
> >assume the data in the clusters is now trash.
>
> Yep, most likely it will be.
>
> It's important not to confuse bad clusters with lost clusters.
>
> Lost clusters are chained out of the free space, but have no directory
> entry that defines them as a file or subdirectory. This is a file
> system logic error that follows bad exits in particular.
>
> Bad clusters contain physically bad sectors, and indicate the hard
> drive is failing at the hardware level. Listen up!
>
>
> For a cluster to show up as bad, it has to escape the hard drive's
> built-in defect "management". When the hard drive's firmware detects
> excessive retries are required to read a sector without error, it will
> try to copy the contents to another "spare" sector. If it succeeds,
> then the "spare" is assigned the address of the sick sector, and the
> sick sector is never used again. All of this happens below the
> awareness of the rest of the PC; the OS has no clue, and nothing shows
> up in ChkDsk logs, etc. It's successfully swept under the carpet.
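
The remapping described above can be sketched as a toy model. This is only an illustration, not how any real firmware is written: the class ToyDrive, the retry_limit threshold and the retries_needed argument are all made up for the example. It shows why the host sees a clean read while the drive quietly uses up a spare.

```python
class ToyDrive:
    """Toy model of a drive's internal defect management (hypothetical)."""

    def __init__(self, spares, retry_limit=3):
        self.spares = list(spares)      # unused spare physical sectors
        self.retry_limit = retry_limit  # assumed threshold for "excessive retries"
        self.remap = {}                 # logical address -> spare sector now serving it
        self.reallocated = 0            # internal count; the host never sees this directly

    def read(self, lba, retries_needed):
        """Return True if the host gets its data back.

        retries_needed simulates sector health: a small number is a healthy
        sector, a large number is a sick one, None is an unreadable one.
        """
        if lba in self.remap:
            return True                 # already served from a spare
        if retries_needed is None:
            return False                # unrecoverable: the error reaches the OS
        if retries_needed > self.retry_limit and self.spares:
            # Data was recovered, but only just: move it to a spare and
            # retire the sick sector. The host still sees a normal read.
            self.remap[lba] = self.spares.pop(0)
            self.reallocated += 1
        return True


drive = ToyDrive(spares=[1000, 1001])
print(drive.read(42, retries_needed=1), drive.reallocated)  # healthy read
print(drive.read(42, retries_needed=5), drive.reallocated)  # sick sector, silently remapped
print(drive.read(7, retries_needed=None))                   # unreadable, shows up as an error
```

Only the last case would ever surface as a bad cluster in ChkDsk; the second one is visible only in the drive's own counters.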
>
> There's only one window into this process, and that is S.M.A.R.T.,
> which can read the hard drive's internal record-keeping, and thus
> hopefully see to what extent the HD has been covering up defects.
>
> Windows has zero built-in S.M.A.R.T. awareness, even though this
> facility has been around since 2G drives of the Win9x era.
>
> BIOS can report S.M.A.R.T. status as part of the startup process, but
> by duhfault, this reporting is disabled.
>
> Hard drive vendors' diagnostics typically just give you an "OK" or
> "fail" summary, with no detail at all. An alarming amount of
> impending carnage is rubber-stamped as "OK".
>
> When you finally find a 3rd-party tool that reports raw S.M.A.R.T.
> detail, you find it's pretty hard to understand; the values just don't
> make sense, unless you know how to interpret them. Then you find that
> even though the summary says "OK", there have been x sectors that had
> to be "fixed" on the fly, etc. Not nearly so "OK" after all.
>
>
> So by the time a cluster shows as "bad", it's had to be so bad that
> the hard drive's attempts to paper over it and deny there's a problem
> have failed. Even at this stage, the problem can be covered up - this
> time by the OS code that operates the "better" NTFS file system.
>
> This code will do exactly the same thing that the hard drive firmware
> tried to do; read the data out of the failing allocation unit (cluster
> rather than sector, this time) and copy it somewhere else, marking the
> original cluster as "bad". This time there would be visible signs
> within the OS's record-keeping of clusters - if you could ever get a
> clear view of that information, that is.
>
> Once a cluster's marked out of use as "bad", further ChkDsk /R or
> AutoChk tests will not test it again. So an elective test may report
> "no (new) bad clusters" even when there have been 20 bad clusters
> successfully "fixed" by NTFS's on-the-fly fiddling, and another 30 bad
> sectors successfully "fixed" by the HD's internal defect management.
>
>
> I think you can see what all this means - that despite any claims to
> the contrary, the game is rigged to hide impending HD failure from
> you, hopefully until the HD's warranty period has expired. The large
> print may claim your vendors really care about your data, but the
> small print confirms they just want to duck support calls.
>
> >2) Are bad clusters the same as bad sectors, i.e., if a bad cluster exists it
> >means it contains one or more bad sectors?
>
> A sector is a hardware-level unit of storage, typically containing 512
> bytes. A cluster is a file system level unit of data storage,
> containing a power of 2 sectors - typically 8, for 4k clusters.
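
To make the sector/cluster arithmetic concrete, here is a minimal sketch using the typical figures quoted above (512-byte sectors, 8 sectors per cluster). The function names are made up for the example, and it assumes cluster 0 starts at sector 0 of the volume, which is a simplification.

```python
SECTOR_BYTES = 512        # typical sector size quoted above
SECTORS_PER_CLUSTER = 8   # typical value quoted above, giving 4k clusters


def cluster_bytes(sector_bytes=SECTOR_BYTES, sectors_per_cluster=SECTORS_PER_CLUSTER):
    """Size of one cluster in bytes."""
    return sector_bytes * sectors_per_cluster


def cluster_of_sector(sector_index, sectors_per_cluster=SECTORS_PER_CLUSTER):
    """Cluster that holds a given sector (assumes cluster 0 starts at sector 0)."""
    return sector_index // sectors_per_cluster


def bad_clusters(bad_sectors, sectors_per_cluster=SECTORS_PER_CLUSTER):
    """Clusters that must be marked bad because they hold at least one bad sector."""
    return sorted({cluster_of_sector(s, sectors_per_cluster) for s in bad_sectors})


print(cluster_bytes())                 # 4096
print(bad_clusters([16, 17, 23, 24]))  # [2, 3]
```

Three bad sectors that fall in the same cluster cost one cluster; a single bad sector is enough to take all 4k of its cluster out of use.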
>
> Yes, a newly-discovered bad cluster means one or more bad sectors,
> unless something has faked the marking process for some reason.
> Viruses used to do that long ago; I don't think today's OSs lend
> themselves to that particular way of hiding malicious code anymore.
>
> There's one circumstance in which existing bad clusters do not mean a
> failing hard drive, and that is where the contents of a failing hard
> drive are imaged (copied exactly) to a good replacement hard drive.
> The raw imaging process will preserve the existing bad cluster marks,
> even though they no longer refer to actual bad clusters.
>
> >3) Further down in the same chkdsk report, it reported "0 KB in bad
> >sectors". Why does it report this when it had just found some bad clusters?
> >Does it report this because it had "fixed" them, i.e., replaced them with
> >spares, and bad sectors no longer exist?
>
> Good question. As one who does data recovery, and who has seen too
> many "too late" dead drives that ate data which might have been saved
> if an earlier alarm had been raised, I'm not inclined to trust vendor
> reporting, especially ChkDsk. "Everything's fine" may be a lie, but
> "hey, something might possibly be wrong" is certain to mean trouble.
>
>
> >-- Risk Management is the clue that asks:
> "Why do I keep open buckets of petrol next to all the
> ashtrays in the lounge, when I don't even have a car?"
> >----------------------- ------ ---- --- -- - - - -
>