Re: SQL Resets
- From: CQL Users <CQLUsers@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Thu, 3 Sep 2009 18:39:01 -0700
Well, the environments are completely different in production and in
development, but we can make it happen in both. Right now, the only thing we
can trigger it with is the Xenu crawler. With the default values set (5000
ports) and connection pooling disabled, we haven't had Xenu throw the errors.
When we turn on connection pooling, the load ramps up pretty fast and the
errors appear. The NICs and switches are different in both environments.
It's really strange. I think it might be hitting some form of infinite
loop and ramping up really quickly in the pool. The RSTs that I am seeing are
actually on the SQL server, so I know the traffic is hitting the server. I
am not seeing errors in either the SQL logs or the web logs, just the
exceptions that we are capturing saying the connection was unexpectedly
reset.
I tend to believe it has something to do with the connection pooling. Is
there more detailed logging I can set on the SQL environment?
Any other suggestions/things to try?
"Ruben Garrigos" wrote:
Hi CQL,
If you disabled connection pooling and the app ran slower, maybe you aren't
hitting any limit (TCP/IP or other). Have you had a look at the SQL Server
Error Log? Maybe you can find a clue there about the reset (if it is caused
by SQL Server). Anything in the Windows Error Log? The maximum number of
connections that SQL Server 2000 can handle is 32,767, but I think that you
are far from that when using connection pooling. Which error does your
application receive when the connection is reset?
More ideas... maybe a buggy network card driver is the problem. Can you
check whether disabling all kinds of TCP offload (TOE) on the network card
helps? Are there different switches/routers at each site?
Regards,
Rubén Garrigós
Solid Quality Mentors
Blog: http://blogs.solidq.com/es/elrincondeldba
One more testing update:
We disabled connection pooling on the app and left the default 5,000
user ports. The site ran slower, but never threw an error. Which
makes us believe that the timing is off due to the 5,000 user ports?
We have the connection pools set to their min/max settings of 0 and
100. We've tested with the max set to 10 and it errored at about 7000
packets as usual.
Suggestions?
"Ruben Garrigos" wrote:
Hi CQL,
Have you tested the development environment with MaxUserPort=65534,
TcpTimeWaitDelay=30 and SynAttackProtect=0 on the SQL Server side? I
don't remember whether a machine restart is required. Better to restart
the machine to be sure that the TCP/IP stack is reading the new
parameters. If it still happens with these parameters, please send
the netstat -n results captured during a hammer session on the SQL
Server side.
Regards,
Rubén Garrigós
Solid Quality Mentors
Blog: http://blogs.solidq.com/es/elrincondeldba
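For reference, the three values suggested above live under the
Tcpip\Parameters registry key on Windows Server 2003. The following is a
minimal sketch, not a tested procedure: the helper name build_reg_commands
is hypothetical, it only prints reg.exe commands rather than changing
anything, and it uses 65534, the documented maximum for MaxUserPort.

```python
# Hypothetical helper: builds the reg.exe commands for the three TCP/IP
# values named in the message above. It does not touch the registry itself.
TCPIP_PARAMS_KEY = r"HKLM\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters"

# Values taken from the suggestion above (65534 is the MaxUserPort maximum).
SUGGESTED_VALUES = {
    "MaxUserPort": 65534,
    "TcpTimeWaitDelay": 30,
    "SynAttackProtect": 0,
}


def build_reg_commands(values=SUGGESTED_VALUES, key=TCPIP_PARAMS_KEY):
    """Return one 'reg add' command line per DWORD value."""
    return [
        f'reg add "{key}" /v {name} /t REG_DWORD /d {data} /f'
        for name, data in values.items()
    ]


if __name__ == "__main__":
    for command in build_reg_commands():
        print(command)
```

The printed commands would be run from an administrative prompt on the
machine being tuned, followed by the restart recommended above.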
Thanks for the suggestions. I currently have 2 WEB fronts but am
putting the 3rd into production shortly. I assume you are asking me
to raise it on the SQL side.
When do you want the netstat -n command? During a hammer session or
anytime? I'll be giving you everything from my dev environment as
we can reproduce it on that with a single Web Front and a single SQL
back end.
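A netstat -n capture taken during a hammer session is easier to compare
between runs if it is reduced to a count per TCP state. The sketch below
assumes the four-column Windows layout (Proto, Local Address, Foreign
Address, State) and the default SQL Server port 1433; the function name
and the sample addresses are made up for illustration and are not data
from this thread.

```python
from collections import Counter


def summarize_netstat(output, port=1433):
    """Count TCP states for connections whose local or remote port matches."""
    wanted = str(port)
    states = Counter()
    for line in output.splitlines():
        fields = line.split()
        # Data rows have exactly four fields: Proto, Local, Foreign, State.
        # Header and blank lines do not, so they are skipped here.
        if len(fields) != 4 or fields[0].upper() != "TCP":
            continue
        local, foreign, state = fields[1], fields[2], fields[3]
        if wanted in (local.rsplit(":", 1)[-1], foreign.rsplit(":", 1)[-1]):
            states[state] += 1
    return states


# Illustrative sample only (invented addresses).
SAMPLE = """\
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:1433          10.0.0.9:3001          ESTABLISHED
  TCP    10.0.0.5:1433          10.0.0.9:3002          TIME_WAIT
  TCP    10.0.0.5:1433          10.0.0.9:3003          TIME_WAIT
  TCP    10.0.0.5:445           10.0.0.7:2950          ESTABLISHED
"""

if __name__ == "__main__":
    for state, count in summarize_netstat(SAMPLE).most_common():
        print(state, count)
```

A large and growing TIME_WAIT count during the test would point toward
port churn rather than a fixed pool of reused connections.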
"Ruben Garrigos" wrote:
Hi CQL,
Maybe it is not a SynAttack problem, but using connection pooling
does not always protect you from the MaxUserPort issue. The default
value for MaxUserPort is 5000 and that's pretty low if you have a
good number of connection pools connecting to the same server. Try
increasing MaxUserPort to its maximum value (65534).
How big is your connection pool (min/max connections) per WEB
front? How many WEB fronts do you have? If your connection pools
are shrinking/growing frequently or if you "overcommit" the number
of free ports, you can still have problems.
Also try decreasing the TcpTimeWaitDelay value to 30 seconds to
return the ports faster and check whether it makes any difference.
Can you paste here the result of a "netstat -n" command on your
database server?
Regards,
Rubén Garrigós
Solid Quality Mentors
Blog: http://blogs.solidq.com/es/elrincondeldba
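To see why these two settings matter together, here is a rough
back-of-the-envelope sketch. It assumes that ephemeral ports start at 1025
and that the default TcpTimeWaitDelay on Windows Server 2003 is 240
seconds; neither figure is stated in this thread. It estimates how many
new outbound connections per second one machine can sustain before every
port is stuck in TIME_WAIT. Connections that a pool reuses do not count
against this budget; only newly opened ones do.

```python
def sustainable_connection_rate(max_user_port, time_wait_delay_s, first_port=1025):
    """Rough upper bound on new outbound connections per second.

    Assumes each closed connection holds its local port in TIME_WAIT for
    time_wait_delay_s seconds and that ports first_port..max_user_port
    (inclusive) are available.
    """
    usable_ports = max_user_port - first_port + 1
    return usable_ports / time_wait_delay_s


# Defaults discussed above (MaxUserPort=5000) with the assumed 240 s delay.
default_rate = sustainable_connection_rate(5000, 240)

# Values suggested above: MaxUserPort=65534, TcpTimeWaitDelay=30.
tuned_rate = sustainable_connection_rate(65534, 30)

if __name__ == "__main__":
    print(f"default: about {default_rate:.1f} new connections/s")
    print(f"tuned:   about {tuned_rate:.0f} new connections/s")
```

Under these assumptions the default budget is on the order of tens of new
connections per second, which a crawler-driven load test could plausibly
exceed if the pool keeps opening fresh connections.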
We have a web environment that has multiple WEB fronts hitting a
SQL 2000 Server with SP4. (Windows 2003 Server).
We are using connection pooling, so it's not the MaxUserPort
setting in the registry.
From time to time, under heavy loads, we are getting resets at the
TCP level on SQL connections from the Web server.
I'll get the RST, ACK flags in the TCP packets and the web
application will throw an exception, but only under heavy loads.
My question: what would cause the RST, ACK in the TCP packet and
what might I look for to troubleshoot this further?