Re: Multithreaded File Writes

From: Justin Rogers (Justin_at_games4dotnet.com)
Date: 07/10/04


Date: Sat, 10 Jul 2004 02:19:48 -0700

I'm with Shri here: you need to figure out what is happening during that
idle time, because it is extremely suspicious.

--- Some tips that I might point out if asked the question on the street
That said, multi-threading the application will help cover up bottlenecks in
your algorithm by saturating the processor with 8 or so copies of your
currently bottlenecked algorithm.

You mention performance as a primary concern, though, so you already know
covering up the issue won't help you much in the long term. However, let's
examine 300,000 files. If each StreamWriter (assuming you are using that)
is taking up buffer space in memory for allocating byte arrays then you are
talking about a large number of relatively large allocations.

Each StreamWriter carries a char[1024] (2K bytes), a byte array that can
hold that many encoded characters (so either 1K or 2K depending on the
encoding), and 4096 bytes for the underlying FileStream.
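
To make that tally concrete, here is a rough sketch of the arithmetic. The class and member names are made up for illustration, and it takes the 1K figure for the byte array, so treat the result as an estimate rather than a measurement:

```csharp
using System;

// Hypothetical helper: adds up the per-writer buffer sizes quoted above
// and multiplies by the number of files.
static class BufferEstimate
{
    public const long CharBufferBytes = 1024 * 2;    // char[1024], 2 bytes per char
    public const long ByteBufferBytes = 1024;        // encoder byte array (assumed 1K; may be 2K)
    public const long FileStreamBufferBytes = 4096;  // default FileStream buffer

    // Total bytes allocated for buffers if every file gets its own writer.
    public static long TotalBytes(long fileCount)
    {
        long perFile = CharBufferBytes + ByteBufferBytes + FileStreamBufferBytes; // 7168 bytes, i.e. 7K
        return fileCount * perFile;
    }
}
```

Calling BufferEstimate.TotalBytes(300000) returns 2,150,400,000 bytes. These are allocations over the whole run, not memory held at one moment, so the cost shows up mostly as garbage collector pressure.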

At roughly 7K per file, that puts you at nearly 2 gigs of allocations when
processing all of your files. Dropping down to a FileStream gets better
results, with only the 4K default buffer in use. Pass a smaller buffer size to
lower your memory requirements even more. When you call Write, pass a buffer
that is bigger than the allocated internal buffer; you will skip the internal
buffer entirely and avoid the extra memory copy.
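
A minimal sketch of that idea, assuming UTF-8 output and a made-up helper name (SmallFileWriter.WriteAll). It encodes the text once and hands the whole array to a single Write call on a FileStream opened with the smallest buffer size the constructor accepts. Whether the internal buffer is actually bypassed is an implementation detail of FileStream, so measure before relying on it:

```csharp
using System.IO;
using System.Text;

// Hypothetical helper, not part of the framework.
static class SmallFileWriter
{
    // Smallest positive buffer size; any non-empty payload is at least this large.
    private const int BufferSize = 1;

    public static void WriteAll(string path, string contents)
    {
        // Encode once into a single array (UTF-8 is an assumption; use your own encoding).
        byte[] payload = Encoding.UTF8.GetBytes(contents);

        using (FileStream stream = new FileStream(
            path, FileMode.Create, FileAccess.Write, FileShare.None, BufferSize))
        {
            // One Write call with the full payload, which is larger than the internal buffer.
            stream.Write(payload, 0, payload.Length);
        }
    }
}
```

This trades the StreamWriter convenience methods for one encode and one write per file, which suits small files whose full text is already in memory.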

---
Do your profiling first and figure out where the idle time is going, then work
on optimizing that file layer.
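
If a real profiler is not at hand, a crude first cut is to time the two halves of the loop separately. The sketch below is only that: WritePhaseTimer and the produce callback are invented names standing in for whatever builds each file's text, and Stopwatch requires .NET 2.0 or later:

```csharp
using System;
using System.Diagnostics;
using System.IO;

// Hypothetical harness: separates time spent building content from time spent writing it.
static class WritePhaseTimer
{
    public static void Run(string directory, int fileCount, Func<int, string> produce,
                           out TimeSpan producing, out TimeSpan writing)
    {
        Stopwatch produceClock = new Stopwatch();
        Stopwatch writeClock = new Stopwatch();

        for (int i = 0; i < fileCount; i++)
        {
            // Stopwatch accumulates elapsed time across Start/Stop pairs.
            produceClock.Start();
            string text = produce(i);
            produceClock.Stop();

            writeClock.Start();
            File.WriteAllText(Path.Combine(directory, i + ".txt"), text);
            writeClock.Stop();
        }

        producing = produceClock.Elapsed;
        writing = writeClock.Elapsed;
    }
}
```

If the two totals together fall well short of the wall-clock time for the whole stage, the idle time is being spent somewhere outside this loop, and that is where to look next.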
-- 
Justin Rogers
DigiTec Web Consultants, LLC.
Blog: http://weblogs.asp.net/justin_rogers

"Shri Borde [MS]" <sborde@online.microsoft.com> wrote in message
news:hy0b9vgZEHA.3316@cpmsftngxa06.phx.gbl...
> What is the CPU doing when it's idle and there is no disk activity? You
> should use a profiler to find out what is going on.
>
> I don't think that writing from multiple threads could speed up the disk
> writes.
>
> Shri Borde [MSFT]
>
> This posting is provided "AS IS" with no warranties, and confers no rights.
> --------------------
> > From: "Robin Day" <robin.day@advorto.com>
> > Subject: Multithreaded File Writes
> > Date: Fri, 2 Jul 2004 17:37:42 +0100
> > Newsgroups: microsoft.public.dotnet.framework.performance
> >
> > I have a system (written in C#) which writes out a large number of small
> > text files (300,000+).
> > Performance is an issue here: during the file-writing stage of the
> > application the processor runs at approx 20% and there is not
> > continuous disk activity. You can also continue to use the machine, hardly
> > noticing any performance change in other applications.
> > I have a theory that if I fire off the file writes in a multi-threaded way
> > then it may allow Windows to make more use of the disk and processor in
> > order to complete this part of the application faster.
> > Has anyone had a similar task to do, or can anyone point me in the
> > direction of some good multi-threaded file writing code in order to do this?
> >
> > Many Thanks in Advance
> > Robin