Re: Garbage Collection Issues in long-standing services

Larry,

There is something in this whole story that isn't clear to me:
If your buffers are that small (4K-16K), why does your working set grow to
over 800MB?
Is it because the buffers are much larger, or because there are a lot of
them, or are you allocating many more objects in the process, or are you
pegging the NIC buffers (NDIS or Winsock) with too many requests dispatched
by a lot of threads, while you aren't able to process the received buffers
in a timely fashion?

Willy.
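
The preallocation advice quoted further down in this thread (allocate the
socket buffers once, early, and reuse them for the whole run) could be
sketched roughly as below. This is only an illustration, not code from either
poster: the class name SocketBufferPool and its members are hypothetical, and
it assumes .NET 2.0 or later for generics and ArraySegment<byte>, whereas the
thread itself discusses Framework 1.1. One backing array is allocated at
startup and sliced into fixed-size segments, so whatever gets pinned during
async I/O is one long-lived object rather than many young ones.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper (not from the thread): hands out fixed-size slices of a
// single array that is allocated once, so socket I/O only ever pins that array.
public sealed class SocketBufferPool
{
    private readonly byte[] _block;                 // single backing array
    private readonly int _segmentSize;              // size of each slice in bytes
    private readonly Stack<int> _freeOffsets = new Stack<int>();
    private readonly object _sync = new object();

    public SocketBufferPool(int segmentCount, int segmentSize)
    {
        if (segmentCount <= 0) throw new ArgumentOutOfRangeException("segmentCount");
        if (segmentSize <= 0) throw new ArgumentOutOfRangeException("segmentSize");

        _segmentSize = segmentSize;
        _block = new byte[segmentCount * segmentSize];

        // Push offsets in reverse so the first Rent() returns offset 0.
        for (int i = segmentCount - 1; i >= 0; i--)
            _freeOffsets.Push(i * segmentSize);
    }

    // Number of segments currently free.
    public int Available
    {
        get { lock (_sync) { return _freeOffsets.Count; } }
    }

    // Take a segment for the lifetime of a connection (or of one send).
    public ArraySegment<byte> Rent()
    {
        lock (_sync)
        {
            if (_freeOffsets.Count == 0)
                throw new InvalidOperationException("Buffer pool exhausted.");
            return new ArraySegment<byte>(_block, _freeOffsets.Pop(), _segmentSize);
        }
    }

    // Give the segment back once no async operation is using it any more.
    public void Return(ArraySegment<byte> segment)
    {
        if (!ReferenceEquals(segment.Array, _block))
            throw new ArgumentException("Segment does not belong to this pool.");
        lock (_sync) { _freeOffsets.Push(segment.Offset); }
    }
}
```

A connection would call Rent() once when it is accepted, pass seg.Array,
seg.Offset and seg.Count to Socket.BeginReceive (or copy outgoing data into
the segment before Socket.BeginSend), and call Return() only after the last
outstanding operation has completed. If the backing array is larger than
about 85,000 bytes it is placed on the large object heap, which the GC does
not compact by default, so pinning it should not fragment the Gen0/Gen1
heaps. The pool size is a tuning decision this sketch does not address.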

"Larry Herbinaux" <LarryHerbinaux@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in
message news:064137C9-2E69-4CC4-930A-DCF788254225@xxxxxxxxxxxxxxxx
> Willy, thanks for pointing this out; it does make sense concerning the
> pinned array.
>
> In our case, the client should be fulfilling the requests in an immediate
> fashion. If this is the case, the pinned array should get unpinned or
> compacted since the async call completes pretty quickly, right?
>
> The receive case is handled quite well: I create a 4K buffer and reuse it
> for the life of the connection. In most cases, I don't expect an
> outstanding BeginReceive at the end of the connection, but it could happen.
> I do keep track of the IAsyncResults, and if one is outstanding, I grab the
> state object and remove the reference to my wrapper socket class, but I
> don't remove the reference to the buffer. I should probably do this, right?
> The OS still has a reference to the request object; can I just call
> EndReceive if it is outstanding (not completed)? The one issue I see with
> this is that if for some reason the remote (client) socket hasn't closed
> (e.g. the physical route is broken), then EndReceive would block, according
> to MSDN (Framework 1.1).
>
> The send case is a little more difficult and I would appreciate your
> advice on it. I also have to support SSL, and there is a maximum limit of
> 16K that can be transferred at a time. I try to make life easy on the
> application programmer by buffering this data if the amount is larger than
> 16K. I have a class that my wrapper socket class references that contains
> an ArrayList of byte arrays, so there could be more than one. Should I go
> to the extreme of having a permanent send buffer and then just copy the
> data to it prior to sending, so that the OS has only one reference to a
> byte array for the length of the connection?
>
> One other note (sorry about the long set of questions): I did use the CLR
> Profiler with a simple application that mimics the references of my
> TCPServer. My one worry was that my wrapper socket class has a reference
> to the TCPServer object (which is obviously long-standing), and I was
> worried about the GC not collecting because of this. The TCPServer also has
> a hashtable that references the wrapper socket object. I did verify that
> when I removed the wrapper socket from the hashtable, the GC wasn't
> bothered by the reference back to the TCPServer; I could explicitly set
> this to null, but I feel that it is cleaner not having to do this
> throughout the code. What's your opinion?
>
>
> "Willy Denoyette [MVP]" wrote:
>
>> Larry,
>>
>> The problem with sockets is the unmanaged memory buffers used by the
>> underlying Winsock library: whenever you need to transfer a block of data
>> (probably a byte array) to or from the unmanaged socket send/receive
>> functions, these arrays must be pinned. Now, when using asynchronous
>> sockets, these arrays may stay pinned for a relatively long period of
>> time, right?
>> The problem with pinned objects (the arrays) is that they prevent the GC
>> from compacting the heap (pinned objects can't move!).
>> The result is that, depending on the number of buffers (your State
>> objects?), you might end up with a highly fragmented GC heap that keeps
>> growing, not because the GC cannot collect but because it cannot compact.
>> So what you need to do is review your design and try to find out how many
>> pinned objects you have in the youngest generations (Gen0 and Gen1; these
>> have the most impact). Try to prevent pinning of young objects by
>> (pre)allocating your buffers very early in the process and using the same
>> buffers for the whole run of the process; that way they will end up in
>> Gen2 after a few collector runs and stay there, where they don't hurt the
>> GC that much.
>>
>> Hope this helps.
>> Willy.
>>
>>
>>
>> "Larry Herbinaux" <LarryHerbinaux@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in
>> message news:7EFFDE15-6BB8-43DF-8A96-04200AB4A468@xxxxxxxxxxxxxxxx
>> > Thanks for the reply.
>> >
>> > I would agree that I must be holding on to some references; I will try
>> > to isolate them. If the GC was really being aggressive, then why, after
>> > everything came back to steady state, did the memory in use drop by
>> > half (800MB to 350MB)?
>> >
>> > I did run some tests on the classes that would give me the most
>> > problems, and it appeared that the GC was smart enough to solve the
>> > issue. Here is a brief synopsis:
>> >
>> > TCPServer - Has a hashtable holding a class that encapsulates the
>> > client socket (CS), including a reference to a class that handles the
>> > application processing (AP). The first thing AP does when it handles an
>> > OnConnect is store a reference to CS so that it can use it to send data
>> > back to the client.
>> >
>> > State Object (SO) - This holds a reference to CS and a pointer to a 4K
>> > byte array. It is passed on the async send / receive calls.
>> >
>> > Connection Termination - After the connection is closed (abnormally or
>> > normally), the CS is removed from the hashtable. I've done tests to see
>> > that once this is performed, the GC is smart enough to deal with the
>> > circular reference between CS and AP, so I don't explicitly set each
>> > reference to null.
>> >
>> > Outstanding Async Send / Receive - Depending on how the connection is
>> > terminated, there is a chance that there can be an outstanding async
>> > send and receive. I store a reference to the most current IAsyncResult
>> > for both the send and the receive in the CS object. When closing, I
>> > check to see if these are still outstanding, and if so, I retrieve the
>> > SO object and set the CS reference to null. I'm not setting the buffer
>> > to null, and I probably should. I don't expect this condition to be
>> > normal, but I do my best at handling it. You can't just stop an
>> > asynchronous request, which is unfortunate because I'm sure there is a
>> > structure being stored in the OS to encapsulate each one (e.g. the same
>> > structure that holds the state object).
>> >
>> > I will continue to try and isolate. Any ideas based on the above?
>> >
>> > "Willy Denoyette [MVP]" wrote:
>> >
>> >> Did you ever check the GC performance counters (using perfmon) to see
>> >> whether this is true? The GC is more aggressive than you imagine.
>> >> Just check the Gen0, Gen1 and Gen2 performance counters and you will
>> >> see that the collector runs; your problem is that you are holding
>> >> references to objects (probably large objects) which you never
>> >> release, so there is little or nothing to collect.
>> >> By starting another process that allocates memory, your service's
>> >> working set gets trimmed by the OS; that's all that happens.
>> >>
>> >> Willy.
>> >>
>> >> "Larry Herbinaux" <LarryHerbinaux@xxxxxxxxxxxxxxxxxxxxxxxxx> wrote in
>> >> message news:72FFED84-F740-456E-A986-0BB25C18ED39@xxxxxxxxxxxxxxxx
>> >> > I'm having issues with garbage collection in my long-standing
>> >> > service process. If you could review and point me in the right
>> >> > direction, it would be of great help. If there are any helpful
>> >> > documents that you could point me to that would help me control the
>> >> > GC, that would be great also.
>> >> >
>> >> > The .NET GC does not clean up the memory of our service process
>> >> > unless it is forced to by another process that hogs memory.
>> >> > · GC Algorithm - This is an issue because if the GC is not forced
>> >> > into doing this, it does not aggressively clean up until the amount
>> >> > of physical memory available is very small. I understand why it
>> >> > doesn't want to force cleanup, for processor efficiency, but it
>> >> > forces applications into conditions that are not acceptable. It
>> >> > would be nice to be able to hint an upper limit for an application
>> >> > that helps the GC be more aggressive when required.
>> >> > · Race Condition - The GC algorithm causes race conditions because
>> >> > the GC is not coordinated with our application, and our application
>> >> > throws OutOfMemoryExceptions. We have very good exception handling
>> >> > that guards against unhandled exceptions in the main thread and
>> >> > thread pool threads. The problem is that we use memory to log these
>> >> > issues, so the handlers are probably throwing another
>> >> > OutOfMemoryException. We can handle this, but the point is that the
>> >> > OutOfMemoryException causes a transaction to fail.
>> >> > · Force GC To Collect - I wrote a Memory Hogger application which,
>> >> > when run, will reduce the amount of memory used by the service
>> >> > application from 800MB to 3MB, so this proves that the GC will clean
>> >> > up the memory when it truly needs it. I noticed that the Memory
>> >> > Usage displayed in the Task Manager Processes tab did not add up to
>> >> > the total amount of memory in use, so this means that the inactive
>> >> > applications probably moved their heap to swap. One concern here was
>> >> > that when the Memory Hogger application was terminated, our service
>> >> > application reclaimed half of its memory while we were not
>> >> > processing transactions. Maybe the GC just moved it to swap.
>> >> > · GC.Collect - Using this is not recommended within the application.
>> >> > Even when it is used, it doesn't make the GC any more aggressive, so
>> >> > I agree, there is no reason to use it. It would be nice to have the
>> >> > ability to make the GC more aggressive.
>> >> > · CLR Profiler - I've used the CLR Profiler to determine what memory
>> >> > is not being collected: mostly strings and byte arrays. Our service
>> >> > handles TCP connections asynchronously, and we reuse the same byte
>> >> > array when receiving data for the next asynchronous read. We store a
>> >> > reference to the current IAsyncResult for both the asynchronous send
>> >> > and receive requests. The state object holds the byte array, which
>> >> > we are currently not setting to null. There isn't a way of canceling
>> >> > the async requests, so we might have to explicitly set this to null.
>> >> > I will try this to see if it makes a difference. As for the strings,
>> >> > I'm not sure where the problem is here.
>> >> >
>> >>
>> >>
>> >>
>>
>>
>>

