Re: Splitting Files into Multiple Folders for Read Performance




"BlackStarPro" <jreid@xxxxxxxxxxxxxxxxxxx> wrote in message
news:5122c9c8-82d9-4bb6-b8ce-0d940cab5eb0@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Hi,

I am a .NET developer for a company that has been producing upward of
100,000 PDF reports per month for several years, which has now turned
into a huge problem. We are releasing a new reporting system that
allows for dynamic reports and custom dashboards; however, they want
ALL their legacy PDF reports to still be available through our new
reporting system.

We decided the best approach would be to build a simple app that iterates
through the millions of files and folders and reorganizes them onto a
different (better-performing) disk array. Right now the files are in a
very elaborate folder structure; the folder names are essentially
parameters for organizing the files. Our idea is for the application to
pull each file out, break its original path up into parameters, and
write those parameters to a database. It would then rename the PDF file
to something unique, store it in a simpler folder structure, and record
its new location.
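
As a rough illustration of the idea just described, the sketch below splits a legacy path into its folder names and pairs them with a freshly generated unique file name. It is written in Python purely for brevity. The helper name `describe_legacy_file`, the sample path, and the use of a random UUID for the new name are all assumptions made for the example; the post does not say how the real parameters or names are defined.

```python
import uuid
from pathlib import PureWindowsPath

def describe_legacy_file(legacy_path, root):
    """Return the folder names under `root` plus a new unique file name.

    Hypothetical helper: the real parameter names and database schema
    are not given in the post.
    """
    relative = PureWindowsPath(legacy_path).relative_to(PureWindowsPath(root))
    return {
        # One entry per legacy folder level, in order from the root down.
        "parameters": list(relative.parts[:-1]),
        "original_name": relative.name,
        # Assumed naming scheme: random UUID plus the original extension.
        "new_name": uuid.uuid4().hex + relative.suffix.lower(),
    }

# Made-up sample path, for illustration only.
record = describe_legacy_file(r"D:\Reports\Acme\2007\Q3\Summary.pdf", r"D:\Reports")
print(record["parameters"])
```

Each returned record could then be written as one database row, together with the folder the file ends up in.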

I had heard that for read performance reasons a folder shouldn't have
more than 2000 files in it. Is there anyone that can verify this?

Assuming that figure is right (I don't know whether it is), we could have
main folders numbered incrementally (1, 2, 3, 4, etc.), each containing
2000 uniquely named files.

example:
c:\1\ (would hold files 1-2000.pdf)
c:\2\ (would hold files 2001-4000.pdf) etc.
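
The numbering in this example comes down to one integer division. The following Python sketch shows the mapping, assuming files are numbered from 1 and that the 2000-per-folder figure is kept; `folder_for` and `path_for` are hypothetical names used only here.

```python
FILES_PER_FOLDER = 2000  # the figure from the post; adjust after testing

def folder_for(file_number, files_per_folder=FILES_PER_FOLDER):
    """Map a 1-based file number to its 1-based folder number."""
    if file_number < 1:
        raise ValueError("file numbers start at 1")
    return (file_number - 1) // files_per_folder + 1

def path_for(file_number, root="c:\\"):
    """Build the full path for a numbered PDF under `root`."""
    return f"{root}{folder_for(file_number)}\\{file_number}.pdf"

print(path_for(2000))  # c:\1\2000.pdf
print(path_for(2001))  # c:\2\2001.pdf
```

Because the folder follows directly from the file number, the database only needs to store the number; the location can be recomputed at any time.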

Our app would just keep track of the old folder parameters (structure)
and a reference to its new location.

Currently we are seeing major performance issues when we try to deal
with the millions of files and folders that are in there right now.
Can anyone recommend a better solution or concur with the above
solution? Any suggestions would help.

My own experience says that performance under NTFS is fine up
to 5000 files per folder and that the number should not exceed 10,000.

Seeing that this is a major project, I would run some tests beforehand.
Here is what I would do:
1. Create the test folder c:\Speed1.
2. Populate it with x files, using this command:
for /L %a in (1,1,x) do echo. > c:\Speed1\Test%a.txt
3. Reboot in order to clear the disk cache (logging off and on
   alone does not reliably flush it).
4. Create the batch file c:\SpeedTest.bat:
@echo off
if exist c:\Speed2 rd /s /q c:\Speed2
md c:\Speed2
for /L %%i in (1,1,100) do call :Test
goto :eof
:Test
set /a r=%random% * x / 32768 + 1
copy /y c:\Speed1\Test%r%.txt c:\Speed2 > nul
5. Invoke c:\SpeedTest.bat like so:
timethis c:\SpeedTest.bat

This test will copy 100 randomly selected files from
c:\Speed1 to c:\Speed2. It will then tell you how long
the process took. Select different values for x to see
how many files you can safely store in each folder.
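
For anyone who would rather script the experiment, here is a minimal Python sketch of the same test. It is an assumption-laden stand-in for the batch procedure, not a replacement: it works in a temporary directory instead of c:\Speed1 and c:\Speed2, and it times the copies right after creating the files, so they are probably still cached. Treat the numbers as relative only, and reboot between populating and timing if you want cold-cache results. The function name `time_random_copies` is made up for this example.

```python
import random
import shutil
import tempfile
import time
from pathlib import Path

def time_random_copies(file_count, copies=100):
    """Create `file_count` small files, copy `copies` random ones, return seconds."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "Speed1"
        target = Path(tmp) / "Speed2"
        source.mkdir()
        target.mkdir()
        # Populate the source folder with Test1.txt .. Test<file_count>.txt.
        for i in range(1, file_count + 1):
            (source / f"Test{i}.txt").write_text("\n")
        start = time.perf_counter()
        for _ in range(copies):
            n = random.randint(1, file_count)  # inclusive on both ends
            shutil.copy(source / f"Test{n}.txt", target)
        return time.perf_counter() - start

if __name__ == "__main__":
    for count in (1000, 5000):
        print(count, "files:", round(time_random_copies(count), 3), "seconds")
```

Raising the values in the final loop plays the same role as choosing different values for x in the batch version.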

