RE: WCF- Can't LoadBalance netTCP , ( LeaseTimeout + ConnectionPool )



Did you ever find a resolution to this issue? I am having the same trouble
with server affinity regardless of the lease or idle timeout settings. I am
not sure what load balancer you are using, but the one we have here is an A10
Networks model. I have watched the network interface performance counters
and could not see any new connections being established. Hopefully, you were
able to find a solution.

Thanks,
Heath

"eAndy" wrote:

Synopsis: A single WCF service client making a large number of successive
calls always reaches the same machine in a load-balanced farm. As a result,
the load is not balanced, that one machine runs short of resources, and the
app does not scale properly.

*** Note: the application is a batch application not an interactive
application or process.


System Endpoints
****************
- C# WCF service using netTCP lives on a 3-machine farm that is load balanced
by a Radware device
- a single C# WCF client bulk processes a set of data and calls the load
balanced farm

Background
****************
- client should be considered a "batch process" as it makes 400-1200 calls
in <3 minutes (not interactive, process set of data and quit, no callbacks)
- the service side processing of all of the calls could take 1-4 hours of
dedicated processing (if balanced properly 1-2 hours).
- Client and Services are run on Windows 2003 R2
- .NET 3.5 sp1
- the calls between client and service are
1. not involved in a transaction
2. not one way

Behavior
****************
1. All calls made during a "run" are received and processed by a single
machine. For example, once started, all calls go to a single machine. If we
wait 10 minutes and start the client again, all the calls are again processed
by a single machine, BUT it may be a different machine than in the first run.
From day to day over a month, the machine that gets all the requests differs
and appears random.

2. Using Network Monitor, it appears that all traffic is moving between the
client IP and the load balancer IP (the server IP addresses are never seen in
the capture).

What's been done
****************
Load Balancer Changes

The load balancer supports a number of different modes.


* Regular: Each client that connects to the farm represents one entry in the
Client Table, regardless of the number of sessions that client has.

*** Note: we can't use this as some of the services on the farm do a special
type of callback and for that we require natting to be enabled and have been
advised that we cannot use regular mode when natting.

* Entry Per Session: Each session a client opens is recorded in the client
table. This provides more accurate minimum-user load balancing.

* Server Per Session: Different sessions opened by a client's application
will be served by different servers, according to the load balancing
algorithms. This option enhances load balancing performance but may hinder
some applications that depend on being served by the same server. It also may
overload AppDirector's internal tables.

* Remove on Session End - EPS: After a TCP client session ends, the next
time that the device scans the Client Table (between 5 - 60 seconds) the
client's entry is removed from the Client Table. This option automatically
enables Entry Per Session.

* Remove on Session End - SPS: After a TCP client session ends, the next
time that the device scans the Client Table (between 5 - 60 seconds) the
client's entry is immediately removed from the Client Table. This option
automatically enables Server Per Session.

WCF Client Configuration
****************
Changed the leaseTimeout to 00:00:15 (15 seconds) in conjunction with testing
each of the load balancer modes above.

No change to the behavior.

Here are the endpoint and the custom binding I'm using to get at the
connection pooling settings.

<endpoint address="net.tcp://LBFarm:20100/mdm/loadtest/"
          behaviorConfiguration="default"
          binding="customBinding"
          bindingConfiguration="LoadBalancedOptimizedNetTCP"
          contract="LoadTestService.ILoadTest" name="LoadTestServiceEP">
  <identity>
    <userPrincipalName value="host/localhost" />
  </identity>
</endpoint>



<customBinding>
  <binding name="LoadBalancedOptimizedNetTCP"
           receiveTimeout="00:20:00" sendTimeout="00:20:00">
    <binaryMessageEncoding>
      <readerQuotas maxDepth="32" maxStringContentLength="10000"
                    maxArrayLength="100000" maxBytesPerRead="4096"
                    maxNameTableCharCount="16384" />
    </binaryMessageEncoding>
    <transactionFlow />
    <windowsStreamSecurity protectionLevel="EncryptAndSign" />
    <tcpTransport maxReceivedMessageSize="1000000000"
                  listenBacklog="1000" portSharingEnabled="true">
      <connectionPoolSettings leaseTimeout="00:00:15" />
    </tcpTransport>
  </binding>
</customBinding>
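
For comparison, here is a sketch of the same tcpTransport element with the
other connectionPoolSettings attributes filled in. It is not a verified fix.
idleTimeout and maxOutboundConnectionsPerEndpoint are existing attributes of
this element, but the values below are guesses, and the idea that a
maxOutboundConnectionsPerEndpoint of 0 keeps the client from reusing
connections is an assumption that has not been tested against this farm.

```xml
<tcpTransport maxReceivedMessageSize="1000000000"
              listenBacklog="1000" portSharingEnabled="true">
  <!-- leaseTimeout: maximum lifetime of a pooled connection -->
  <!-- idleTimeout: how long an unused connection may stay in the pool -->
  <!-- maxOutboundConnectionsPerEndpoint: number of connections cached per
       endpoint; 0 is an ASSUMPTION here, intended to prevent reuse -->
  <connectionPoolSettings leaseTimeout="00:00:15"
                          idleTimeout="00:00:05"
                          maxOutboundConnectionsPerEndpoint="0" />
</tcpTransport>
```

As far as I understand, these settings only apply to connections that have
been returned to the pool, which happens when a channel is closed. If the
client keeps one proxy open for the whole run, that proxy would hold its
connection regardless of the pool settings.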






Conclusion
****************
My theory is that it's a combination of WCF connection pooling and the
load balancer's session-based server affinity.

For example, if .NET is keeping the connection alive and the load balancer
continues to see successive traffic before the session times out, it keeps
pushing the traffic to the same server in the farm.

Questions:
***********
1. Did I miss something in the configuration? I'm still seeing that all of
the traffic goes to a single server regardless of the changes on the load
balancer.

2. How can I get .NET to drop the TCP connection and create a new one every
5-10 seconds?

3. How can I verify when net.tcp connections are being created/destroyed? It
doesn't appear to be instrumented, but there may be additional steps needed
to get the perf counters "turned on".

4. What are other people's experiences with this specific situation (a
batch-style netTCP WCF client calling a service through a load balancer)?
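
On question 3, here is a minimal sketch of the standard config switches for
WCF performance counters and tracing. The log path and listener name are
placeholders. Note the limits: the WCF counters report on services,
endpoints and operations rather than on TCP connections, and I have not
confirmed which connection pool events show up in the trace at this level.
To count sockets directly, the operating system's TCP counters or netstat on
port 20100 are probably the more direct check.

```xml
<configuration>
  <system.serviceModel>
    <!-- WCF performance counters are off by default. -->
    <diagnostics performanceCounters="All" />
  </system.serviceModel>
  <system.diagnostics>
    <sources>
      <!-- Standard WCF trace source; the .svclog path is a placeholder. -->
      <source name="System.ServiceModel"
              switchValue="Information, ActivityTracing">
        <listeners>
          <add name="xml"
               type="System.Diagnostics.XmlWriterTraceListener"
               initializeData="c:\logs\LoadTestClient.svclog" />
        </listeners>
      </source>
    </sources>
  </system.diagnostics>
</configuration>
```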



****
NOTE
****
In this specific case, it is completely acceptable to incur the overhead of
killing and re-creating connections every 5-10 seconds, because the
processing time on the server is far greater and the need to balance the load
far outweighs any connection pool efficiency.

I do realize that this is not the norm. If I were creating an interactive
application that called the farm, connection pooling would make complete
sense.

Looking for other people's experiences.


Repro
*******
I have a repro that I can forward if desired.

