Re: Splitting Files into Multiple Folders for Read Performance

On May 8, 9:50 am, "Pegasus (MVP)" <I....@xxxxxxxxxx> wrote:
"BlackStarPro" <jr...@xxxxxxxxxxxxxxxxxxx> wrote in message

news:5122c9c8-82d9-4bb6-b8ce-0d940cab5eb0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

Hi,

I am a .NET developer for a company that has been producing upward of
100,000 PDF reports per month for several years, which has now turned
into a huge problem. We are releasing a new reporting system that
allows for dynamic reports and custom dashboards. However, they want
ALL their legacy PDF reports to still be available through the new
reporting system.

We decided the best approach would be to build a simple app that iterates
through the millions of files and folders and reorganizes them onto a
different (better-performing) disk array. Right now the files are in a
very elaborate folder structure; the folder names are essentially
parameters for organizing the files. Our idea is to have the application
pull each file out, break its original path up into parameters, and
write those parameters into a database. It would then rename the PDF
file to something unique, store it in a simpler folder structure, and
record its new location.
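
For illustration, here is a minimal Python sketch of the path-splitting step described above. The root D:\Reports and the sample folder names are hypothetical (the real structure is not given in this post), and uuid4 is just one possible way to get a unique file name.

```python
from pathlib import PureWindowsPath
import uuid


def split_legacy_path(legacy_path, root):
    """Split a legacy file path into its folder parameters and file name.

    Every folder between `root` and the file counts as one parameter,
    in order. Both arguments are Windows-style paths.
    """
    relative = PureWindowsPath(legacy_path).relative_to(PureWindowsPath(root))
    *parameters, file_name = relative.parts
    return parameters, file_name


def new_unique_name():
    """Return a unique PDF file name (assumption: a random UUID is acceptable)."""
    return uuid.uuid4().hex + ".pdf"


# Hypothetical example path; the real folder names are not known.
params, name = split_legacy_path(
    r"D:\Reports\ClientA\2008\05\summary.pdf", r"D:\Reports"
)
print(params, name)
```

The parameters list and the new name would then go into one database row per file, along with the new location.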

I had heard that for read performance reasons a folder shouldn't have
more than 2000 files in it. Is there anyone that can verify this?

Going on that figure (which may or may not be true), we could have
main folders incrementally numbered (1, 2, 3, 4, etc.), each containing
2000 uniquely named files.

example:
c:\1\ (would hold files 1-2000.pdf)
c:\2\ (would hold files 2001-4000.pdf) etc.

Our app would just keep track of the old folder parameters (structure)
and a reference to its new location.
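
The numbering scheme above comes down to one integer division. A rough Python sketch, assuming files are numbered from 1 and named <number>.pdf; the function names are made up for illustration, and the 2000 limit is the unverified figure from this post.

```python
from pathlib import PureWindowsPath

FILES_PER_FOLDER = 2000  # the figure under discussion, not a verified limit


def folder_number(file_number, per_folder=FILES_PER_FOLDER):
    """Return the 1-based folder that holds the given 1-based file number."""
    if file_number < 1:
        raise ValueError("file numbers start at 1")
    return (file_number - 1) // per_folder + 1


def new_location(file_number, root="c:\\", per_folder=FILES_PER_FOLDER):
    """Build the new path, e.g. c:\\2\\2001.pdf for file 2001."""
    folder = str(folder_number(file_number, per_folder))
    return PureWindowsPath(root) / folder / f"{file_number}.pdf"


print(new_location(2000))
print(new_location(2001))
```

Because the folder follows from the file number, the database only needs to store the number. Changing the files-per-folder value later would mean moving files, though.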

Currently we are seeing major performance issues when we try to deal
with the millions of files and folders that are in there right now.
Can anyone recommend a better solution or concur with the above
solution? Any suggestions would help.

My own experience says that performance under NTFS is fine up
to 5000 files per folder and that the number should not exceed 10,000.

Seeing that this is a major project, I would run some tests beforehand.
Here is what I would do:
1. Create the test folder c:\Speed1.
2. Populate it with x files, using this command:
    for /L %a in (1,1,x) do echo. > c:\Speed1\Test%a.txt
3. Log off, then log on in order to clear the disk cache.
4. Create the batch file c:\SpeedTest.bat:
    @echo off
    if exist c:\Speed2 rd /s /q c:\Speed2
    md c:\Speed2
    for /L %%i in (1,1,100) do call :Test
    goto :eof
    :Test
    rem Replace x below with the number of files created in step 2.
    set /a r=%random% * x / 32768 + 1
    copy /y c:\Speed1\Test%r%.txt c:\Speed2 > nul
5. Invoke c:\SpeedTest.bat like so:
    timethis c:\SpeedTest.bat

This test will copy 100 randomly selected files from
c:\Speed1 to c:\Speed2. It will then tell you how long
the process took. Select different values for x to see
how many files you can safely store in each folder.
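
For anyone who would rather script the same test outside of cmd, here is a rough Python equivalent of the steps above. It is only a sketch: the function names are made up, it does nothing to clear the disk cache (so that still has to be handled between runs), and copying tiny files measures directory lookup more than read throughput.

```python
import random
import shutil
import tempfile
import time
from pathlib import Path


def populate(folder, count):
    """Create `count` small files named Test1.txt .. Test<count>.txt."""
    folder.mkdir(parents=True, exist_ok=True)
    for i in range(1, count + 1):
        (folder / f"Test{i}.txt").write_text("\n")


def time_random_copies(source, target, file_count, copies=100):
    """Copy `copies` randomly chosen files and return the elapsed seconds."""
    target.mkdir(parents=True, exist_ok=True)
    start = time.perf_counter()
    for _ in range(copies):
        n = random.randint(1, file_count)  # inclusive on both ends
        shutil.copy(source / f"Test{n}.txt", target)
    return time.perf_counter() - start
```

To try it, call populate(Path(r"c:\Speed1"), x) once for each value of x, then time_random_copies(Path(r"c:\Speed1"), Path(r"c:\Speed2"), x) and compare the timings.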

Nice! Thank you very much for the response. I will do these tests and
will post the results. Thanks again.