Re: SQL Resets
- From: CQL Users <CQLUsers@xxxxxxxxxxxxxxxxxxxxxxxxx>
- Date: Fri, 4 Sep 2009 02:54:01 -0700
I will give this a test today.
Two other thoughts. First, we are not seeing any real load on the SQL
environment: CPU is below 20%, memory is holding steady with low page file
use, and network traffic is below 4%.
Second, if Server 1 starts to throw errors, I would expect to see errors on
Server 2 as well, but I don't. The errors seem to stick to one server at a
time. If SQL were falling short, I would expect both servers to fail at the
same time. That's not the case; it's one server or the other
(unless the Xenu crawler is hitting both servers at the same time).
"Ruben Garrigos" wrote:
Hi CQL,
Let's try another network configuration. This time we will try to increase
the backlog of the Winsock library that SQL Server uses to listen for connections.
The default backlog for SQL Server is 5. That means that at most five
connection requests can be pending at once. Pending connections shouldn't be
"on hold" for long, but maybe the stress test pushes the server's kernel
CPU pretty high and forces the situation.
Try increasing, in small steps (5 at a time), the value of the WinsockListenBacklog
parameter at HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\MSSQLServer\MSSQLServer\SuperSocketNetLib
or
HKEY_LOCAL_MACHINE\SOFTWARE\Microsoft\Microsoft SQL Server\<Instance Name>\MSSQLServer\SuperSocketNetLib.
You need to restart the SQL Server instance after changing this parameter.
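
The stepwise increase described above could be scripted roughly as follows. This is only a sketch: it builds `reg add` command lines without running them, the helper names `backlog_steps` and `reg_add_command` are made up for illustration, and it assumes WinsockListenBacklog is a REG_DWORD value, which should be confirmed before applying anything.

```python
# Sketch: build (but do not run) `reg add` commands that raise
# WinsockListenBacklog in steps of 5.
# Assumption: the value is a REG_DWORD; confirm this before applying.

# Default-instance key path as given in the post.
KEY = (r"HKLM\SOFTWARE\Microsoft\MSSQLServer\MSSQLServer"
       r"\SuperSocketNetLib")


def backlog_steps(start=5, step=5, count=4):
    """Return the backlog values to try, one per test run."""
    return [start + step * i for i in range(1, count + 1)]


def reg_add_command(value, key=KEY):
    """Return the argument list for a `reg add` call that sets the backlog."""
    return ["reg", "add", key, "/v", "WinsockListenBacklog",
            "/t", "REG_DWORD", "/d", str(value), "/f"]


if __name__ == "__main__":
    for value in backlog_steps():
        print(" ".join(reg_add_command(value)))
```

Each printed command corresponds to one test run, and the SQL Server instance still has to be restarted after every change. For a named instance the key path contains spaces, so it would need quoting on a real command line.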
Anyway, it is possible that the load you are pushing is too high for your
architecture and you are effectively running a DoS against your SQL Server.
The difference between Site 1 and the others may be that Site 1's code
hits the database more slowly or that its queries are lighter. It is fine to
tune the architecture as much as you can, but you will always find a load
point that is "too much" for it. I would recommend adding a cache layer,
if you don't have one, to avoid hitting the SQL Server too hard. It is a very
effective way to improve the performance of a data access layer.
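
As a rough illustration of the cache-layer idea, here is a minimal read-through cache with a time-to-live. It is a sketch under assumptions, not the poster's design: the class name `TTLCache`, the 30-second TTL, and the `fake_query` stand-in for a real database call are all hypothetical.

```python
import time


class TTLCache:
    """Tiny read-through cache: call the loader only on a miss or after expiry."""

    def __init__(self, loader, ttl_seconds=30.0, clock=time.monotonic):
        self._loader = loader      # function that really hits the database
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (expires_at, value)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]        # fresh hit: no database round trip
        value = self._loader(key)  # miss or stale entry: one database call
        self._entries[key] = (now + self._ttl, value)
        return value


if __name__ == "__main__":
    calls = []

    def fake_query(product_id):    # hypothetical stand-in for a SQL query
        calls.append(product_id)
        return {"id": product_id}

    cache = TTLCache(fake_query, ttl_seconds=30.0)
    for _ in range(1000):
        cache.get(42)
    print(len(calls))              # number of times the loader actually ran
```

The sketch is not thread-safe and never evicts entries, so a web application would need locking and a size bound on top of it.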
Regards,
Rubén Garrigós
Solid Quality Mentors
Blog: http://blogs.solidq.com/es/elrincondeldba
Part of the exception error:
System.InvalidOperationException: Internal connection fatal error.
   at System.Data.SqlClient.TdsParser.Run(RunBehavior runBehavior, SqlCommand cmdHandler, SqlDataReader dataStream, BulkCopySimpleResultSet bulkCopyHandler, TdsParserStateObject stateObj)
"Ruben Garrigos" wrote:
Hi CQL,
If the app ran slower with connection pooling disabled, maybe you
aren't hitting any limit (TCP/IP or other). Have you had a look at
the SQL Server error log? Maybe you can find a clue there about the
reset (if it is caused by SQL Server). Anything in the Windows event
log? The maximum number of connections that SQL Server 2000 can
handle is 32,767, but I think you are far from that when using
connection pooling. Which error does your application receive when the
connection is reset?
More ideas... maybe a buggy network card driver is the problem.
Can you check whether disabling all kinds of TCP offload (TOE) on the
network card helps? Are there different switches/routers at each site?
Regards,
Rubén Garrigós
Solid Quality Mentors
Blog: http://blogs.solidq.com/es/elrincondeldba
One more testing update:
We disabled connection pooling on the app and left the default 5,000
user ports. The site ran slower, but never threw an error. That
makes us wonder whether the timing is off because of the 5,000 user ports?
We have the connection pools set to their min/max settings of 0 and
100. We've tested with the max set to 10 and it errored at about
7000 packets as usual.
Suggestions?
"Ruben Garrigos" wrote:
Hi CQL,
Have you tested the development environment with MaxUserPort=65534,
TcpTimedWaitDelay=30 and SynAttackProtect=0 on the SQL Server side?
I don't remember whether a machine restart is required; better to
restart the machine to be sure that the TCP/IP stack reads the new
parameters. If it still happens with these parameters, please send
the netstat -n results captured during a hammer session on the SQL
Server side.
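
One way to summarise a `netstat -n` capture is to count connections per TCP state on the SQL Server port. The sketch below assumes the usual four-column Windows layout (Proto, Local Address, Foreign Address, State) and the default SQL Server port 1433; the addresses in `SAMPLE` are invented purely for illustration, and `count_tcp_states` is a hypothetical helper name.

```python
from collections import Counter


def count_tcp_states(netstat_output, port=1433):
    """Count TCP connection states for sockets whose local port matches."""
    states = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected columns: Proto, Local Address, Foreign Address, State
        if len(fields) != 4 or fields[0] != "TCP":
            continue
        local_port = fields[1].rsplit(":", 1)[-1]
        if local_port == str(port):
            states[fields[3]] += 1
    return states


# Invented sample data, not taken from the thread.
SAMPLE = """\
Active Connections

  Proto  Local Address          Foreign Address        State
  TCP    10.0.0.5:1433          10.0.0.10:3101         ESTABLISHED
  TCP    10.0.0.5:1433          10.0.0.10:3102         TIME_WAIT
  TCP    10.0.0.5:1433          10.0.0.10:3103         TIME_WAIT
  TCP    10.0.0.5:445           10.0.0.11:2200         ESTABLISHED
"""

if __name__ == "__main__":
    for state, n in count_tcp_states(SAMPLE).most_common():
        print(state, n)
```

A large and growing TIME_WAIT count during a hammer session would point toward port churn rather than a SQL Server limit.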
Regards,
Rubén Garrigós
Solid Quality Mentors
Blog: http://blogs.solidq.com/es/elrincondeldba
Thx for the suggestions. I currently have 2 WEB fronts but am
putting a 3rd into production shortly. I assume you are asking
me to up it on the SQL side.
When do you want the netstat -n command? During a hammer session
or anytime? I'll be giving you everything from my dev environment
as we can reproduce it on that with a single Web Front and a
single SQL back end.
"Ruben Garrigos" wrote:
Hi CQL,
Maybe it is not a SynAttack problem, but using connection pooling
does not always protect you from the MaxUserPort issue. The
default value for MaxUserPort is 5000, and that's pretty low if
you have a good number of connection pools connecting to the same
server. Try increasing MaxUserPort to its maximum value (65534).
How big is your connection pool (min/max connections) per WEB
front? How many WEB fronts do you have? If your connection pools
shrink and grow frequently, or if you "overcommit" the number of
free ports, you can still have problems. Also try decreasing the
TcpTimedWaitDelay value to 30 seconds to return the ports faster,
and check whether it makes any difference.
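
To see why these two settings matter, here is a back-of-the-envelope estimate of how many new connections per second a machine can open before it runs out of ephemeral ports. It is a rough sketch with loud assumptions: every closed connection is taken to hold its port for the full TIME_WAIT period (the registry value is spelled `TcpTimedWaitDelay`), ephemeral ports are assumed to start at 1025, and the 240-second "before" value is an assumed default. The function name is hypothetical.

```python
def max_new_connections_per_second(max_user_port, time_wait_seconds,
                                   first_port=1025):
    """Rough ceiling on new outbound connections per second.

    Assumes every closed connection keeps its local port busy for the
    whole TIME_WAIT period, so the port pool is recycled once per period.
    """
    usable_ports = max_user_port - first_port + 1
    return usable_ports / time_wait_seconds


if __name__ == "__main__":
    # 240 s is an assumed TIME_WAIT value for the "before" case.
    before = max_new_connections_per_second(5000, 240)
    after = max_new_connections_per_second(65534, 30)
    print(f"before: {before:.1f}/s, after: {after:.1f}/s")
```

With pooling enabled, connections are reused and the real rate of new connections is far lower, so this bound mainly matters when pooling is disabled or the pools shrink and grow frequently.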
Can you paste here the result of a "netstat -n" command on your
database server?
Regards,
Rubén Garrigós
Solid Quality Mentors
Blog: http://blogs.solidq.com/es/elrincondeldba
We have a web environment with multiple WEB fronts hitting a
SQL 2000 Server with SP4 (Windows 2003 Server).
We are using connection pooling, so it's not the maxuserport
setting in the registry.
From time to time, under heavy loads, we are getting resets at
the TCP level on SQL connections from the Web server.
I'll get the RST, ACK flag in the TCP packets and the web
application will throw an exception, but only under heavy loads.
My question: what would cause the RST, ACK in the TCP packet, and
what might I look for to troubleshoot this further?