Re: How to terminate a socket in CLOSE_WAIT state
From: Francois Piette (francois.piette_at_overbyte.be)
Date: 05/21/04
- Next message: Viviana Vc: "Using a Global Mutex"
- Previous message: hector: "Re: How to terminate a socket in CLOSE_WAIT state"
- In reply to: hector: "Re: How to terminate a socket in CLOSE_WAIT state"
- Messages sorted by: [ date ] [ thread ]
Date: Fri, 21 May 2004 11:27:01 +0200
Thank you for this long and interesting reply.
--
francois.piette@overbyte.be
Author of ICS (Internet Component Suite, freeware)
Author of MidWare (Multi-tier framework, freeware)
http://www.overbyte.be

"hector" <nospam@nospam.com> wrote in message
news:efNlmiwPEHA.832@TK2MSFTNGP09.phx.gbl...
> Francois, long time no talk.
>
> In my experience, when this happens it is simply a PE ("programming
> error") somewhere in the code. CLOSE_WAIT should always be a controlled
> state, with the socket released within a few seconds of a proper closure
> on both ends.
>
> Maybe this info will help:
>
> Specifically with CLOSE_WAIT, in one instance we found it happening with
> our FTP server and WS_FTP clients. IPSwitch provided a developer version
> to help solve the problem. Here are my product update notes on it:
>
> o Wildcat! FTP Server fixed for certain FTP clients that use both passive
> and non-passive commands in their logic. Certain FTP clients, like WS_FTP,
> will use the PASV command to start a download. If the download fails for
> some good reason, WS_FTP will try the non-passive PORT command to repeat
> the process. This was causing a PASSIVE opened socket to be left in a
> CLOSE_WAIT state and, on occasion, left the FTP user hung in Wildcat.
> This is now fixed.
>
> In another instance, we found IE updates within the last year or so
> causing RSTs (resets) and many CLOSE_WAIT sockets. Search the net for the
> various forms of this behavior, reported by thousands of users as things
> like "Page can't be displayed" and "Browser automatically refreshes...".
>
> After extensive research to hit this problem over the head, and without
> MS confirmation, it is my belief that Microsoft changed the socket
> behavior to solve TCP half-close issues, and in doing so broke many
> existing servers that were not ready for it. The problem showed up with
> IE, and I could only reproduce it over a DSL line.
>
> The thing is, MS actually does it right.
> If you follow MSDN, it correctly describes what needs to be done, but it
> uses event-driven operations for its examples. In synchronous mode,
> however, you may not catch it. Anyway, here are my Jan/03 summary notes,
> shared with the customers and testers who helped test the fixed version:
>
> (The following was sent to Microsoft)
>
> After all my analysis, I believe I have finally solved the "IE puzzle."
> I had to read up and get a better understanding of the TCP/IP
> specifications.
>
> In my previous messages, I indicated that adding a sleep (thus delaying
> the close) helped solve the problem we were experiencing with IE. I was
> not comfortable with this sleep solution and continued investigating.
>
> Again, the MSDN documentation indicated specific steps for a graceful
> shutdown to send the remaining data. I took this to mean it applied to a
> "receiver application" (like a browser), since it is the server that is
> doing the sending, not the receiving, in this particular state. In
> addition, the purpose of the SO_LINGER option is to block a close while
> there is data in the send queue. So I didn't think these steps applied to
> the server attempting to close the socket.
>
> After analyzing TCP/IP packets and reading about "TCP Half Close"
> operations in the book TCP/IP Illustrated, Volume 1, I realized that the
> server calling recv() after shutdown() should not expect data, but should
> instead expect the "half close" from the receiver.
>
> So instead of using a sleep like so:
>
>     Send HTTP response
>     ...
>     ...
>
>     Sleep(300);
>     closesocket(sock);
>
> I changed the logic to follow what the MSDN documentation says to do:
>
>     Send HTTP response
>     ...
>     ...
>
>     // notify receiver we are about to close, no more data will be sent.
>     // note: this does not close the socket
>
>     shutdown(sock, SD_SEND);
>
>     // new - wait until receiver closes socket.
>     // note: sanity check removed from loop
>
>     char buf[8*1024];
>     while (recv(sock, buf, sizeof(buf), 0) > 0);
>
>     // finally close the socket
>
>     closesocket(sock);
>
> This fixed the problem 100%, with no sleeps and no send speed throttles.
>
> This is the proper way to do it. I understand the TCP/IP specification
> better now.
>
> 1) The shutdown() call tells the receiver the server is done sending
> data. No more data is going to be sent. More importantly, it doesn't
> close the socket. At the socket layer, this sends a TCP FIN packet to the
> receiver.
>
> So when you send data, PSH (push data) packets are sent. The receiver
> sends an ACK packet for each PSH.
>
>     PSH 1 -->
>               <-- ACK 1
>     PSH 2 -->
>               <-- ACK 2
>     PSH 3 -->
>               <-- ACK 3
>     PSH 4 -->
>               <-- ACK 4
>
> etc. There is NO specific order to this. It can be:
>
>     PSH 1 -->
>     PSH 2 -->
>               <-- ACK 1
>     PSH 3 -->
>               <-- ACK 2
>     PSH 4 -->
>               <-- ACK 3
>               <-- ACK 4
>
> This depends heavily on the network transmission; e.g., ADSL connections
> have a faster receive (download) and a slower send (upload).
>
> When shutdown() is called, it will send a FIN signal. The FIN can be in
> its own packet or part of the last PSH packet.
>
> 2) At this point, the server has to wait until the receiver has
> acknowledged the FIN and closed its own side. This is done by calling
> recv() in a loop until it returns 0 or less. Once recv() returns 0 (or
> less), half of the socket is closed.
>
> 3) Then you can close the second half of the socket by calling
> closesocket().
>
> According to TCP/IP Illustrated, Volume 1, using shutdown() is not common
> in applications (page 238, chapter 18.5): most applications terminate
> the socket in both directions using closesocket().
>
> Summary:
>
> If hosting applications do not support TCP Half Close, they might begin
> to see problems with specific versions of Microsoft IE and/or
> combinations of IE and the operating system.
>
> Supporting TCP Half Close fixes and enhances our web server and solves
> this problem for our customers who have IE users. So from our standpoint,
> the problem is now solved.
>
> Since our web server has been in existence since 1996 and has been well
> engineered and put through the test of time, the recent avalanche of IE
> issues tells me that Microsoft has recently changed something, either in
> IE or in the Winsock layer, with regard to SO_LINGER and closesocket().
> The net effect is that many users are now experiencing this "Page Cannot
> be Displayed" error, probably (though this is unverified) on web sites
> not using the Microsoft IIS web server. We are going to do a final test
> of this presumption.
>
> According to MSDN, SO_LINGER is supposed to block the closing of the
> socket until the data in the send queue has been exhausted (sent). That's
> the purpose of the SO_LINGER option, and it had always worked for us,
> since we never saw this IE issue before.
>
> In other words, if there is still data in the send queue when
> closesocket() is called, the SO_LINGER socket setting (see setsockopt()
> in MSDN) is supposed to BLOCK the sending of the FIN packet to the
> receiver until all the data has been sent.
>
> However, this now brings up network transmission and packet sequence
> issues.
>
> Although the socket send queue is now empty, depending on the reliability
> and speed of the user's connection, it is my suspicion that not all the
> PSH (push data) packets were acknowledged by the receiver by the time it
> received the FIN packet. Hence, when a PSH packet arrived after a FIN, it
> was for a closed socket. And according to RFC 793, if an out-of-sequence
> packet arrives, a RST (reset) is sent back to the server.
>
> This explains why we were seeing multiple resends of the URL request by
> the IE browser.
>
> Some people got the PAGE CANNOT BE DISPLAYED error. I personally never
> saw this, only the resends.
> I should note that one of our testers indicated this might depend on the
> IE setting "Show Friendly URL errors". When on, he got the page error.
> When off, he got the resends.
>
> However, I am not sure that is consistent. I believe it all depends on
> the network transmission, i.e., a timing issue related to the user's
> internet connectivity.
>
> In any case, the problem is solved for us.
>
> I hope the above provides some useful info to Microsoft and other host
> developers who might run into the problem.
>
> Thanks to all who tested this IE issue at our web site.
>
> ---
> Hector Santos
> WINSERVER "Wildcat! Interactive Net Server"
> support: http://www.winserver.com
> sales: http://www.santronics.com
>
> "Francois PIETTE" <francois.piette@overbyte.be> wrote in message
> news:40ac7920$0$8414$a0ced6e1@news.skynet.be...
> > In a server application, sometimes there are a lot of sockets in
> > CLOSE_WAIT state (shown by the netstat utility). They could even
> > consume all available sockets.
> > The questions are:
> > - How to close such sockets so that they can be reused?
> > - How to reduce the time the system allows a socket to be in
> >   CLOSE_WAIT?
> > - How to completely avoid this state and still gracefully close TCP
> >   sessions?
> >
> > --
> > francois.piette@overbyte.be
> > The author of the freeware multi-tier middleware MidWare
> > The author of the freeware Internet Component Suite (ICS)
> > http://www.overbyte.be
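
The shutdown / recv / closesocket sequence described in the quoted message can be sketched in C roughly as follows. This is only a minimal sketch under stated assumptions, not the actual Wildcat! server code: it uses the POSIX calls shutdown(sock, SHUT_WR) and close() in place of the Winsock shutdown(sock, SD_SEND) and closesocket() shown in the post, the helper name graceful_close and its timeout_sec parameter are hypothetical, and the SO_RCVTIMEO receive timeout is one possible stand-in for the "sanity check" that the post says was removed from the loop.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/time.h>
#include <unistd.h>

/* Hypothetical helper (name and parameters are not from the original post).
 * Half-closes the socket, drains it until the peer closes, then releases it.
 * timeout_sec bounds each recv() call; 0 means wait indefinitely.
 * Returns 0 if the peer closed cleanly, -1 on error or timeout. */
int graceful_close(int sock, int timeout_sec)
{
    char buf[8 * 1024];
    struct timeval tv;
    ssize_t n;
    int rc = 0;

    assert(sock >= 0);

    /* Assumed sanity check: do not let a silent peer block us forever. */
    tv.tv_sec = timeout_sec;
    tv.tv_usec = 0;
    (void)setsockopt(sock, SOL_SOCKET, SO_RCVTIMEO, &tv, sizeof(tv));

    /* Step 1: tell the peer no more data will be sent (FIN on TCP).
     * The socket itself stays open for reading. */
    if (shutdown(sock, SHUT_WR) != 0) {
        close(sock);
        return -1;
    }

    /* Step 2: discard anything the peer still sends until recv()
     * returns 0 (peer closed its side) or a negative value (error). */
    while ((n = recv(sock, buf, sizeof(buf), 0)) > 0)
        ;
    if (n < 0)
        rc = -1;

    /* Step 3: release the socket. */
    if (close(sock) != 0)
        rc = -1;

    return rc;
}
```

A server would call something like graceful_close(sock, 5) after sending its last response byte, in place of the Sleep(300) followed by closesocket(sock) from the earlier approach. The timeout value is arbitrary here and would need tuning for a real server.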