Re: NTFS drive capacity/utilization percentage?

From: Al Dykes (adykes_at_panix.com)
Date: 06/18/04


Date: 18 Jun 2004 08:50:46 -0400

In article <69BFD014-2B81-4BF0-81D0-1115E5366098@microsoft.com>,
Beanbag <Beanbag@discussions.microsoft.com> wrote:
>"usenetacct@lycos.com" wrote:
>
>> sure, we need to keep files created by jobs online for several months
>> & that may be going to a year. This is currently 120,000 new files
>> per day & could go up dramatically very soon with a new client. Size
>> of them ranges from 4MB to 1KB, mostly on the smaller end (<50KB). We
>> constantly delete files to make room for new ones, so each day we're
>> deleting anywhere from 120,000 to 500,000 of these files while adding
>> those I mention above. They're stored in a directory structure based
>> on the year-month-day-hour-minute-job#. When I say millions of files,
>> I am being literal. Tens of millions, to be accurate.
>>
>> We also have a single directory that contains the original files the
>> work is based on, & we have a week's worth of data in that, which is
>> ~150,000 files in that single directory. We delete old data from here
>> every day to make room for the new also. This system runs 24/7 and we
>> cannot take it down to defragment, run chkdsk, or anything else. If
>> we have disk corruption, we have to format & restore from tape,
>> because the sheer number of files means running chkdsk takes many days
>> to run.
>>
>> At the moment, we can store 3 months of data on an 880GB array only
>> if we literally let the system run constantly at extremely high
>> utilization rates. We may need to expand it to allow a year's worth.
>> At current volumes, that would mean a single 4 terabyte volume. If we
>> continue growing at current rates (very fast), 4 terabytes is only the
>> beginning.
>>
>> thanks
>>
>

>Is there a special reason why you store all data on a single volume?
>Otherwise I would advise to divide your data up and store it on
>several smaller volumes. This probably minimizes the effect of data
>corruption on your work and should give you the ability to do a fsck
>rather than recover the data from tape. It will also increase your
>overall file system performance and conforms to the Microsoft
>performance tuning guidelines.

>There are defragmentation tools available that do a much more
>sophisticated job than the built-in defrag. You can do online
>defragmentation or defragment with less than 15% free space on your
>volume, schedule jobs and much more. Try OO Defrag
>(www.oo-software.com) for example.

>When you fill your volumes up to 99% this will result in MFT
>fragmentation in addition to file fragmentation which slows down the
>file system, particularly with such a large amount of files. See MS
>KB article 174619 for details.

>I would not recommend to use compression. This will probably not
>result in a faster file system, rather in a slower one.

>When you use Windows Server 2003 you can tune the file system via
>several registry parameters, you can disable the update of the Last
>Access Time attribute or the generation of short names in the 8.3
>naming convention. I don't know if these parameters are available
>under Windows 2000. Check the Windows Server 2003 Performance Tuning
>Guidelines for details.
>
>Cheers Frank

Since your post a week or so ago, a couple of things have occurred to
me (this is based on my reading of your problem description; I may
have things wrong):

- You are running much too close to the hairy edge and making lots of
  pain for yourself. Multi-TB-scale storage arrays are getting amazingly
  cheap. If your boss doesn't approve basic expenses like disk space
  you've got other problems. You (or he) are putting your business at risk.

- I believe your CPU and I/O are being sucked up by the number of files
  in your folders. Your performance may greatly improve if you modify
  your application to use the first character of your file name as a
  subfolder name (i.e., file abcdef.txt gets stored in
  ./a/bcdef.txt), which gives 36 subfolders for letters and digits. If
  your application is going to scale up, you might use the first 2
  characters (36 x 36 = 1296 folders).

  You can easily test this hypothesis on a PC with a big disk: write a
  script that creates folders and 100,000 files with your naming
  convention, but one byte in size. Do a DIR command, try defrag, etc.
  You may be surprised.

- It's possible that a well-designed Oracle or Sybase database could
  handle your data much better than NTFS files and folders can, but
  that kind of advice doesn't come for free.

- Contact Dell/EMC. Talk to a salesman about a configuration and quote
  for a NAS/SAN storage box. If they decide you are serious you will
  be able to ask their engineers about how well their file systems
  will behave with your data. If you can get them to tell you how much
  better their product is than NTFS you will learn lots about the
  shortcomings of NTFS, if any. There are other NAS/SAN vendors, and
  you can see lots of parts pricing at http://www.aberdeeninc.com/
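
For the subfolder idea and the test script in the second point above,
here is a rough Python sketch. It is only an illustration under my own
assumptions: the function names (shard_path, populate, count_files) and
the generated file names are made up, since I don't know your real
naming convention, and it assumes names start with letters or digits
and are treated case-insensitively. Setting prefix_len=0 gives the flat
single-folder layout, so you can time both layouts on the same disk.

```python
import os
import tempfile
import time

ALPHABET = "abcdefghijklmnopqrstuvwxyz0123456789"  # 36 possible leading characters


def shard_path(root, filename, prefix_len=1):
    """Map abcdef.txt to root/a/bcdef.txt (prefix_len=1) or root/ab/cdef.txt (2)."""
    if len(filename) <= prefix_len:
        raise ValueError("file name too short to shard: %r" % filename)
    # Lower-case the prefix so 'A' and 'a' land in the same subfolder.
    prefix = filename[:prefix_len].lower()
    return os.path.join(root, prefix, filename[prefix_len:])


def populate(root, count, prefix_len=1):
    """Create `count` one-byte files under root, sharded by name prefix."""
    made = set()  # subfolders already created, to avoid repeated mkdir calls
    for i in range(count):
        # Hypothetical naming convention: two varying leading characters
        # followed by a unique counter.
        name = ALPHABET[i % 36] + ALPHABET[(i // 36) % 36] + "%08d.dat" % i
        path = shard_path(root, name, prefix_len)
        folder = os.path.dirname(path)
        if folder not in made:
            os.makedirs(folder, exist_ok=True)
            made.add(folder)
        with open(path, "wb") as f:
            f.write(b"x")  # one byte of payload


def count_files(root):
    """Walk the tree and count files (a stand-in for a recursive DIR)."""
    return sum(len(files) for _, _, files in os.walk(root))


if __name__ == "__main__":
    for prefix_len in (0, 1, 2):  # 0 = everything in one folder
        with tempfile.TemporaryDirectory() as root:
            start = time.time()
            populate(root, 100000, prefix_len)
            created = time.time() - start
            start = time.time()
            total = count_files(root)
            listed = time.time() - start
            print("prefix_len=%d: %d files, create %.1fs, list %.1fs"
                  % (prefix_len, total, created, listed))
```

Run it against a scratch directory on the volume type you care about
rather than the default temp location if that sits on a different disk;
the timings only mean something relative to each other.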

I still think that using NTFS compression for your file systems will
be a big win, but you've got to solve the fundamental problems first.
IMHO the main one is the number of files in your folders.

I'd like to hear how it works out.

-- 
Al Dykes
-----------
adykes  at p a n i x . c o m

