Re: Async socket & active connections
- From: "atlaste" <atlaste@xxxxxxxxx>
- Date: 28 Mar 2007 04:58:30 -0700
Well, actually I have tried the async methods of WebRequest, libcurl,
libwww and some other solutions. I am well aware of the capabilities
of those implementations.
Perhaps I should have mentioned that I'm not concentrating only on
HTTP/web, but also on other protocols (which aren't so well
supported). I chose the example of a webcrawler because it is, in my
opinion, a good example of the amount of distribution and the number
of connections that I'm looking for. Furthermore, I'd like to compare
the different methods and simply pick the best one for my purposes.
Whether or not I'm reinventing the wheel isn't really what I would
like to debate. The fact remains that all network traffic shuts down
with my TCP wrapper class, which shouldn't happen in any case.
Thanks,
Stefan.
On Mar 28, 3:13 am, Peter Bromberg [C# MVP]
<pbromb...@xxxxxxxxxxxxxxxxxxxxxxx> wrote:
Stefan,
I don't understand why the effort to (in some ways) "reinvent the wheel",
but I've written what I believe are very efficient webcrawlers using the
built-in asynchronous methods without any of the nasty side effects you
describe. Timeouts can be added if necessary.
Peter
--
Site: http://www.eggheadcafe.com
UnBlog: http://petesbloggerama.blogspot.com
Short urls & more: http://ittyurl.net
"atlaste" wrote:
Hi,
In an attempt to create a full-blown webcrawler I've found myself
writing a wrapper around the Socket class to make it completely
async, supporting timeouts and some scheduling mechanisms.
I use a non-blocking approach for this, using calls to 'poll' to
support the async mechanism rather than the 'begin' and 'end'
functions. I already found that connecting doesn't set the
"isconnected" variable correctly (a SocketException is thrown:
non-blocking has this effect...), but that doesn't appear to be a
problem because poll, read and write work fine.
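
A minimal sketch of the non-blocking connect pattern described above,
written in Python rather than .NET because the underlying calls are the
same BSD-style ones. The function name connect_nonblocking and its
parameters are made up for illustration. On a non-blocking socket the
"in progress" / "would block" result of connect is expected (this is
probably what surfaces as the SocketException mentioned above);
completion is detected by polling for writability and then reading
SO_ERROR.

```python
import errno
import select
import socket


def connect_nonblocking(host, port, timeout=5.0):
    """Start a non-blocking connect and wait for it with select().

    Hypothetical helper for illustration; returns the connected socket
    or raises OSError / TimeoutError.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setblocking(False)

    # connect_ex returns an error code instead of raising. "In progress"
    # (or "would block") is the normal answer for a non-blocking socket.
    rc = sock.connect_ex((host, port))
    if rc not in (0, errno.EINPROGRESS, errno.EWOULDBLOCK):
        sock.close()
        raise OSError(rc, "connect failed immediately")

    # The socket becomes writable once the connect attempt has finished.
    # It is also watched for exceptional conditions, since some platforms
    # report a failed connect that way.
    _, writable, failed = select.select([], [sock], [sock], timeout)
    if not writable and not failed:
        sock.close()
        raise TimeoutError("connect timed out")

    # SO_ERROR tells whether the finished attempt actually succeeded.
    err = sock.getsockopt(socket.SOL_SOCKET, socket.SO_ERROR)
    if err != 0:
        sock.close()
        raise OSError(err, "connect failed")
    return sock
```

Checking SO_ERROR after the poll, instead of relying on a "connected"
flag, is the design choice here: the flag may not reflect a connect that
completed in the background.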
To measure the performance of the crawler, I started "perfmon.msc"
and added the "active connections" item from object "TCP". After a
while I found this performance counter had climbed to over 300K
connections (!), enough to start worrying...
My crawler is designed to support around 200 simultaneous connections.
"netstat -an" doesn't support this finding, but does show hundreds of
connections that are in either "CLOSE_WAIT", "FIN_WAIT_2" or another
closing state.
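
To compare the two measurements, the "netstat -an" output can be
tallied per state. The sketch below is a small Python helper (the name
count_tcp_states is hypothetical) that assumes the state is the last
column of each TCP line, as in plain "netstat -an" output. As far as I
know, the TCP "Connections Active" counter is a cumulative count of
outgoing connection attempts rather than the number of connections
currently open, which would explain a figure far above what netstat
shows; treat that as an assumption to verify.

```python
from collections import Counter


def count_tcp_states(netstat_output):
    """Count TCP connections per state in `netstat -an` style text.

    Assumes each TCP line starts with "TCP"/"tcp" and ends with the
    connection state; other lines (headers, UDP) are skipped.
    """
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 4 and fields[0].upper().startswith("TCP"):
            counts[fields[-1]] += 1
    return counts
```

Usage would be along the lines of feeding it the captured output of
"netstat -an" and looking at counts["CLOSE_WAIT"] and
counts["FIN_WAIT_2"] over time.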
After a host has completed, I try to disconnect the TCP/IP connection.
I've attempted combinations of "shutdown(both)", (async) "disconnect"
and "close(0)" - where no combination appears to have the desired
effect. When the application is shut down, all connections (including
the CLOSE_WAIT connections) are removed. The FIN_WAIT_2 connections
linger forever...
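
For reference, a sketch of one common teardown order, again in Python
with made-up helper names (graceful_close, abort_close). In the usual
reading of the TCP state machine, CLOSE_WAIT means the remote side has
closed but the local application has not yet closed its socket, and
FIN_WAIT_2 means the local FIN was acknowledged and the remote FIN is
still awaited. The sketch sends the local FIN, drains until the peer's
FIN arrives or a deadline passes, and then always closes the handle.
The abortive variant uses SO_LINGER with a zero timeout, which resets
the connection instead of waiting; the struct layout shown is the
Linux one and is an assumption elsewhere.

```python
import select
import socket
import struct
import time


def graceful_close(sock, timeout=2.0):
    """Half-close, wait up to `timeout` seconds for the peer's FIN, close."""
    try:
        sock.shutdown(socket.SHUT_WR)  # send our FIN, keep reading
    except OSError:
        pass  # already disconnected or reset
    deadline = time.monotonic() + timeout
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break  # peer never closed; give up waiting
            readable, _, _ = select.select([sock], [], [], remaining)
            if not readable:
                break  # timed out
            if not sock.recv(4096):
                break  # empty read: the peer's FIN arrived
    except OSError:
        pass  # reset while draining
    finally:
        sock.close()  # always release the handle


def abort_close(sock):
    """Abortive close: linger on, zero seconds, so close() resets."""
    # Two C ints (l_onoff, l_linger); this layout is assumed (Linux).
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_LINGER,
                    struct.pack("ii", 1, 0))
    sock.close()
```

Whichever variant is used, the point is that every finished host ends
in a close of the local handle; a socket that is only shut down but
never closed can stay in a closing state until the process exits.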
Perhaps someone knows a solution to this problem?
Greetings,
Stefan de Bruijn.