Re: "Parallel.For GC problems" and a solution.

Robert,

That is a lot of words. Do you also have some code? You say you tried EVERYTHING; can you give us some idea of what you mean by EVERYTHING?

Cor


"Robert" <no@xxxxxxxx> wrote in message news:uzBGEa$YJHA.684@xxxxxxxxxxxxxxxxxxxxxxx
Quick summary:
1) running one class serially, all is well
2) running same thing in parallel, Gen 2 bytes go way up, and LOH usage goes way up.
3) the classes share ZERO state. The only thing they share is a callback to the GUI
reporting the number of records processed, and records fixed. The classes do a bunch
of tallying, verifying no dupes, etc. Should parallelize very well..
4) I am using the TPL library, and was using Parallel.For to spawn off an instance for each
file.
5) This causes the GC to get very confused:
a) We can not coalesce mem regions, since 4-8 threads are always in use.
b) as one thread dies, some collections occur, but the other threads keep allocating.
c) never "rests" to give the GC time to coalesce everything back to a clean point.
d) this just gives ever rising memory counters.
e) Running "Performance Explorer" inside VStudio shows a bunch of Ints, int[], etc
in Gen 2, and LOH. I think Dictionary( of dictionary(of small array of ints))
with the dictionaries holding arrays of keys, and values is the problem.
f) all these dictionaries are released in the normal way. Tried EVERYTHING to explicitly deallocate them..

Solution:
When calling Parallel.For, do not pass it a large array of things to process.
Currently I batch them into Processors * ThreadsPerProcessor chunks
Run Parallel.For on the chunks. Run GC. Repeat as necessary. This idles the cpu periodically,
giving a spiky looking CPU graph, but, it runs faster than serial, and no mem probs.
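
A minimal sketch of the batching described above, not Robert's actual code (none was posted). It assumes the .NET 4 System.Threading.Tasks.Parallel class, reads "Processors * ThreadsPerProcessor" as the size of each chunk, and uses a hypothetical ProcessFile routine and threadsPerProcessor parameter as stand-ins for the real per-file class and setting:

```vb
Imports System
Imports System.Threading.Tasks

Module BatchedParallel

    ' Hypothetical placeholder for the per-file work (tallying, dupe checks, GUI callback).
    Sub ProcessFile(ByVal path As String)
        ' ... real processing would go here ...
    End Sub

    ' Runs the files in chunks, forcing a collection between chunks.
    Sub ProcessAll(ByVal files As String(), ByVal threadsPerProcessor As Integer)
        ' Assumption: chunk size = processor count * threads per processor, at least 1.
        Dim batchSize As Integer = Math.Max(1, Environment.ProcessorCount * threadsPerProcessor)

        For start As Integer = 0 To files.Length - 1 Step batchSize
            ' Exclusive upper bound of this chunk.
            Dim last As Integer = Math.Min(start + batchSize, files.Length)

            ' Parallel.For returns only after every item in the chunk has finished.
            Parallel.For(start, last, Sub(i As Integer) ProcessFile(files(i)))

            ' The "rest break": no worker is allocating while this collection runs.
            GC.Collect()
            GC.WaitForPendingFinalizers()
        Next
    End Sub

End Module
```

Calling ProcessAll(fileNames, 2) on a 4-core machine would process 8 files at a time. Whether the explicit GC.Collect is needed, or whether limiting the number of files in flight is what actually helps, is not something the post settles.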

Summary:
With rest breaks the GC behaves normally. With no breaks, memory goes crazy.


This took about a day and a half to figure out.

I suspect this would also happen with the stock standard ThreadPool
as the GC is the same. Lighter threads, without so much alloc/dealloc
would probably not have this problem.
Each of my threads is using 10-60 megs. 8 of them would need half a gig.
This is about 1/3 of the max memory for a 32 bit process. When running the
bad way, recs/sec would drop off steadily until OOM.

Moral of the story:
When running in parallel, make sure you take a breather now and then..

