Performance with reading large numbers of files...
- From: Mark Rochotte <muftiteg@xxxxxxxxxxxxxxxxxxxxx>
- Date: Sun, 22 Jan 2006 18:24:19 -0800
Hi All.
I have a small test application (condensed from a much larger application that shows the same issue) that recurses through a directory tree and adds every file name to a string collection. The app then iterates over these names and, for each one, opens the file, reads the entire file (or only the first 4 MB if it is larger than that), and closes it.
The problem is that after 10,000 or so files, the application slows down significantly. I've tried various GC and Thread.Sleep() calls inside the loop, but the slowdown still occurs.
There are about 29,000 files in a couple hundred directories, and each file is 100-200 KB in size.
Can someone explain what causes this and how to fix it? I'm including the source for the app and some sample output. This particular code targets .NET 1.1, but the same issue exists with 2.0 RTM.
Thanks,
m
-----------------------------
using System;
using System.Collections.Specialized;
using System.IO;

namespace FileReadTest
{
    internal class Class1
    {
        const string vDir = "<SOME DIR HERE>";

        [STAThread]
        private static void Main(string[] args)
        {
            DateTime sStartTime = DateTime.Now;
            StringCollection sDiskFiles = new StringCollection();
            RecurseDirs(new DirectoryInfo(vDir), sDiskFiles);
            byte[] sBuf = new byte[1024*1024*4];
            int x = 0;
            foreach (string sFileName in sDiskFiles)
            {
                if (x++%256 == 1)
                    Console.WriteLine("Processing {0,6:#,##0} of {1,6:#,##0} Elapsed: {2}", x, sDiskFiles.Count, (DateTime.Now-sStartTime));
                using (FileStream sStream = File.OpenRead(sFileName))
                {
                    sStream.Read(sBuf, 0, sBuf.Length);
                    sStream.Close();
                }
            }
        }

        private static void RecurseDirs(DirectoryInfo aDir, StringCollection aList)
        {
            DirectoryInfo[] sDirArr = aDir.GetDirectories();
            for (int x = 0; x < sDirArr.Length; x++)
            {
                RecurseDirs(sDirArr[x], aList);
            }
            FileInfo[] sFileArr = aDir.GetFiles();
            for (int x = 0; x < sFileArr.Length; x++)
            {
                FileInfo sInfo = sFileArr[x];
                aList.Add(sInfo.FullName);
            }
        }
    }
}
-----------------------------
Processing 1 of 29,041 Elapsed: 00:00:17.3281250
Processing 257 of 29,041 Elapsed: 00:00:18.5312500 << about 1.2s per 250 files.
Processing 513 of 29,041 Elapsed: 00:00:19.9531250
Processing 769 of 29,041 Elapsed: 00:00:21.1562500
Processing 1,025 of 29,041 Elapsed: 00:00:22.5937500
Processing 1,281 of 29,041 Elapsed: 00:00:23.6718750
Processing 1,537 of 29,041 Elapsed: 00:00:24.8750000
Processing 1,793 of 29,041 Elapsed: 00:00:25.9843750
Processing 2,049 of 29,041 Elapsed: 00:00:27.3906250
Processing 2,305 of 29,041 Elapsed: 00:00:28.6718750
<snip>
Processing 12,545 of 29,041 Elapsed: 00:01:55.1875000
Processing 12,801 of 29,041 Elapsed: 00:01:58.9218750
Processing 13,057 of 29,041 Elapsed: 00:02:02.9687500
Processing 13,313 of 29,041 Elapsed: 00:02:06.7968750
Processing 13,569 of 29,041 Elapsed: 00:02:10.7812500
Processing 13,825 of 29,041 Elapsed: 00:02:14.7343750
Processing 14,081 of 29,041 Elapsed: 00:02:18.6406250
Processing 14,337 of 29,041 Elapsed: 00:02:22.2968750
Processing 14,593 of 29,041 Elapsed: 00:02:25.9687500 << about 3.7s per 250 files
<snip>
Processing 20,225 of 29,041 Elapsed: 00:04:19.5156250
Processing 20,481 of 29,041 Elapsed: 00:04:26.4531250
Processing 20,737 of 29,041 Elapsed: 00:04:33.4375000
Processing 20,993 of 29,041 Elapsed: 00:04:39.3593750
Processing 21,249 of 29,041 Elapsed: 00:04:47.6562500
Processing 21,505 of 29,041 Elapsed: 00:04:56.9843750
Processing 21,761 of 29,041 Elapsed: 00:05:07.4062500
Processing 22,017 of 29,041 Elapsed: 00:05:15.7812500 << about 8.3s per 250 files
<snip>
Processing 27,649 of 29,041 Elapsed: 00:07:20.1250000
Processing 27,905 of 29,041 Elapsed: 00:07:24.6093750
Processing 28,161 of 29,041 Elapsed: 00:07:32.8281250
Processing 28,417 of 29,041 Elapsed: 00:07:41.6406250 << about 8.8s per 250 files.
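
The "about N s per 250 files" notes above are the differences between consecutive Elapsed values; each printed line is 256 files after the previous one. A minimal sketch of that arithmetic follows. The BatchTiming class and its method names are hypothetical helpers introduced only for illustration and are not part of the test app.

```csharp
using System;

internal class BatchTiming
{
    // Seconds between two "Elapsed:" values copied from the output above.
    public static double SecondsBetween(string earlier, string later)
    {
        TimeSpan delta = TimeSpan.Parse(later) - TimeSpan.Parse(earlier);
        return delta.TotalSeconds;
    }

    // Average seconds per file, assuming 256 files between printed lines
    // (the loop prints once every 256 iterations).
    public static double SecondsPerFile(string earlier, string later)
    {
        return SecondsBetween(earlier, later) / 256.0;
    }
}
```

For example, BatchTiming.SecondsBetween("00:00:17.3281250", "00:00:18.5312500") covers the first batch (about 1.2 s), and BatchTiming.SecondsBetween("00:07:32.8281250", "00:07:41.6406250") covers the last one shown (about 8.8 s). That works out to roughly 4.7 ms per file at the start versus roughly 34 ms per file near the end.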