Re: High-performance IO



Hi mate

In my previous post I answered some of your questions, but somehow
overlooked the general picture. Therefore, let me tell you what I
think.

First of all, a "one file, one thread" strategy is not the best one in
your situation. Why?
Let's say thread X processes file A, and thread Y processes file B.
These files may be physically located in totally different parts of the
disk. Therefore, every time a context switch occurs, the disk has to
reposition its head, i.e. you end up seeking all the time. As a result,
you get performance degradation instead of improvement. I think the
best thing to do is to process the files sequentially, i.e. one by one.
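
To illustrate the "one by one" idea, here is a minimal sketch in portable
C rather than Win32, so it only shows the access pattern and not the
fastest possible IO path. The names process_chunk(),
process_file_in_place() and process_files_sequentially() are made up for
the example, and the XOR step is just a placeholder for your real
processing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical placeholder for the real per-block processing step. */
static void process_chunk(unsigned char *data, size_t len)
{
    for (size_t i = 0; i < len; i++)
        data[i] ^= 0xFF;
}

/* Read one range, update it, write it back, then proceed to the next range.
   Uses long offsets, so this sketch assumes files smaller than 2 GiB. */
static int process_file_in_place(const char *path, unsigned char *buf,
                                 size_t buf_size)
{
    FILE *f;
    long pos = 0;
    int rc = 0;

    assert(path != NULL && buf != NULL);
    f = fopen(path, "r+b");
    if (f == NULL)
        return -1;

    for (;;) {
        size_t n;

        /* A seek is required whenever an update stream switches
           between writing and reading. */
        if (fseek(f, pos, SEEK_SET) != 0) { rc = -1; break; }
        n = fread(buf, 1, buf_size, f);
        if (n == 0) {
            if (ferror(f))
                rc = -1;
            break;                      /* end of file or read error */
        }
        process_chunk(buf, n);
        if (fseek(f, pos, SEEK_SET) != 0 ||
            fwrite(buf, 1, n, f) != n ||
            fflush(f) != 0) { rc = -1; break; }
        pos += (long)n;
    }
    if (fclose(f) != 0)
        rc = -1;
    return rc;
}

/* One thread, one buffer: each file is finished before the next is opened. */
int process_files_sequentially(const char *const *paths, size_t count,
                               size_t buf_size)
{
    unsigned char *buf = malloc(buf_size);
    int rc = 0;

    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < count && rc == 0; i++)
        rc = process_file_in_place(paths[i], buf, buf_size);
    free(buf);
    return rc;
}
```

You would call process_files_sequentially() with the list of file names
and a block size of your choice. The point is only that a single buffer
is reused and the disk works on one file at a time.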


Second, as I have already mentioned in my previous message, you should
try to avoid page faults. Imagine what happens if the size of your
buffer exceeds the amount of memory that is available to your process -
you will be moving data from the disk to memory and back to the disk
all the time. If you have the SeLockMemoryPrivilege privilege, it
probably makes sense to allocate the memory with
AllocateUserPhysicalPages() and map it with MapUserPhysicalPages() - if
you do it this way, you will make sure that your buffer is always
resident in RAM. As a result, you will be able to read data, update it,
write it back to the disk and proceed to the next range without a SINGLE
page fault on the buffer. If you do it this way, you can do everything
synchronously, so there is no need for the FILE_FLAG_OVERLAPPED flag.
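
A rough sketch of how such a buffer could be set up through the AWE
functions is below. It is Windows-only (the #ifdef just keeps it out of
other builds) and not tuned. The AweBuffer type and the helper names
enable_lock_memory_privilege(), awe_alloc() and awe_free() are made up
for the example, and it assumes the account running the program has
already been granted the "Lock pages in memory" right.

```c
#ifdef _WIN32   /* Windows-only sketch; compiles to nothing elsewhere */
#include <windows.h>
#include <assert.h>
#include <stdlib.h>

typedef struct {            /* hypothetical helper type */
    void      *base;        /* virtual window the pages are mapped into */
    ULONG_PTR  pages;       /* number of physical pages obtained */
    ULONG_PTR *pfns;        /* page frame numbers returned by the system */
} AweBuffer;

/* Enable SeLockMemoryPrivilege in the process token (it must be held). */
static BOOL enable_lock_memory_privilege(void)
{
    HANDLE token;
    TOKEN_PRIVILEGES tp;
    BOOL ok;

    if (!OpenProcessToken(GetCurrentProcess(), TOKEN_ADJUST_PRIVILEGES, &token))
        return FALSE;
    tp.PrivilegeCount = 1;
    tp.Privileges[0].Attributes = SE_PRIVILEGE_ENABLED;
    /* AdjustTokenPrivileges can succeed without assigning anything,
       so the last error has to be checked as well. */
    ok = LookupPrivilegeValue(NULL, SE_LOCK_MEMORY_NAME, &tp.Privileges[0].Luid)
      && AdjustTokenPrivileges(token, FALSE, &tp, 0, NULL, NULL)
      && GetLastError() == ERROR_SUCCESS;
    CloseHandle(token);
    return ok;
}

/* Unmap and release everything awe_alloc() obtained. */
static void awe_free(AweBuffer *b)
{
    if (b->base != NULL) {
        MapUserPhysicalPages(b->base, b->pages, NULL);   /* unmap */
        VirtualFree(b->base, 0, MEM_RELEASE);
    }
    if (b->pfns != NULL) {
        if (b->pages != 0)
            FreeUserPhysicalPages(GetCurrentProcess(), &b->pages, b->pfns);
        free(b->pfns);
    }
    b->base = NULL;
    b->pfns = NULL;
    b->pages = 0;
}

/* Allocate physical pages for at least `bytes` bytes and map them. */
static BOOL awe_alloc(AweBuffer *b, SIZE_T bytes)
{
    SYSTEM_INFO si;
    ULONG_PTR wanted;

    assert(b != NULL);
    b->base = NULL;
    b->pfns = NULL;
    b->pages = 0;
    if (bytes == 0 || !enable_lock_memory_privilege())
        return FALSE;

    GetSystemInfo(&si);
    wanted = (bytes + si.dwPageSize - 1) / si.dwPageSize;
    b->pfns = (ULONG_PTR *)malloc(wanted * sizeof(ULONG_PTR));
    if (b->pfns == NULL)
        return FALSE;

    b->pages = wanted;
    if (!AllocateUserPhysicalPages(GetCurrentProcess(), &b->pages, b->pfns)) {
        b->pages = 0;                   /* nothing was allocated */
        awe_free(b);
        return FALSE;
    }
    if (b->pages == wanted) {
        b->base = VirtualAlloc(NULL, wanted * si.dwPageSize,
                               MEM_RESERVE | MEM_PHYSICAL, PAGE_READWRITE);
        if (b->base != NULL &&
            MapUserPhysicalPages(b->base, b->pages, b->pfns))
            return TRUE;
    }
    awe_free(b);                        /* partial allocation or mapping failure */
    return FALSE;
}
#endif
```

If awe_alloc() returns TRUE, b.base points at memory that is backed by
locked physical pages until awe_free() is called. If it fails (typically
because the privilege is missing), falling back to plain VirtualAlloc()
is the obvious alternative.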


To summarize, the fact that you are about to access files only once and
do so sequentially gives you a HUGE potential for optimization in terms
of minimizing the number of disk accesses and disk seeks.
Therefore, you should take advantage of it - in your situation it
offers much more than multithreading and asynchronous IO do.

Anton Bassov


Piotr Wyderski wrote:
I would like to obtain the highest possible throughput
of disk IO under the following conditions. The program
simultaneously reads and writes several large files,
~0.25+ GiB each. Each file must be processed in order,
so no explicit fine-grained parallelization provided by IOCP
is possible -- the file has its own processing thread. The
files are accessed sequentially and only once, so no caching
will help. Currently I use four independent memory buffers
per file and overlapped IO to separate the phases of
reading/writing and processing. The IO completion notifications
are issued through Win32 event objects.
I set the FILE_FLAG_NO_BUFFERING flag to bypass
the filesystem cache and (hopefully) cause the system to
perform explicit DMA transfers directly into the buffers.

But several questions remain open:

1. How big should be the data block to obtain the highest
performance? Currently it is hardcoded to 512KiB, but it
is an early implementation and is subject to change. If
it is OS/filesystem specific, then how can I get the best
value at run time using WinAPI?

2. Does FILE_FLAG_NO_BUFFERING and
FILE_FLAG_OVERLAPPED interfere somehow
with FILE_FLAG_SEQUENTIAL_SCAN?
Should I avoid the latter hint when the first two flags
are specified?

3. How should I allocate the memory buffer? A simple
VirtualAlloc will do, but it can optimize (cache coloring,
clustering etc.) the virtual->physical address mapping
for memory access, not for IO, especially for large DMA
transfers. Is there a way to allocate a contiguous block of
physical memory? Will it help much under NT/XP? I'm
asking, because I don't know the details of NT's low-level
disk IO architecture and implementation.

4. Can I directly read into/write from an AWE buffer?

5. Is there something I forgot to use in order to obtain
the highest throughput? Except IOCP, of course. :-)

Best regards
Piotr Wyderski
