Re: Why WebHttpRequest.GetResponse() stuck?
- From: "Morgan Cheng" <morgan.chengmo@xxxxxxxxx>
- Date: 16 Oct 2006 18:59:19 -0700
Jesse Houwing wrote:
Morgan Cheng wrote:
I happened to come across
http://www.codeproject.com/cs/internet/Crawler.asp, which claims that
WebRequest.GetResponse() will block other threads calling this function
until WebResponse.Close() is called.
I did some experimentation.
public static void Main(string[] args)
{
    for (int idx = 0; idx < 10; ++idx)
    {
        // Queue ten work items; each one issues a request to the same host.
        ThreadPool.QueueUserWorkItem(new WaitCallback(testWeb), idx);
    }
    // Keep the process alive so the pool threads get a chance to run.
    Console.ReadLine();
}
private static void testWeb(object idx)
{
    string uri = "http://www.gmail.com";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    Console.WriteLine("in thread " + idx);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    Console.WriteLine(response.ContentType + "; idx = " + (int)idx);
    // response.Close();
}
The code runs with output like the following:
in thread 0
in thread 1
text/html; charset=UTF-8; idx=0
text/html; charset=UTF-8; idx=1
in thread 2
in thread 3
in thread 4
in thread 5
in thread 6
in thread 7
in thread 8
in thread 9
The "idx" values may differ, but only 2 threads ever get through
GetResponse(). It seems the other 8 threads are stuck at
HttpWebRequest.GetResponse().
After I un-comment the line "response.Close()", it prints the expected 20
lines. There must be something held by HttpWebResponse until it is
closed.
Does an HttpWebResponse instance hold some resource of which only 2
instances are available? If this is the case, it is really an issue for
an application that needs many WebResponse instances, e.g. a web crawler.
The response stream is left open for you to examine the data returned by
the webresponse, but it's actually only ever downloaded should you need
it (to prevent unneeded data transfers and to make sure you get the
response within a reasonable amount of time).
Do you mean that HttpWebRequest.GetResponse() doesn't download the URI
resource to the local machine? I tried to fetch some big resource.
GetResponse() takes time, while response.GetResponseStream() returns
immediately. I believe that the downloading happens at GetResponse().
I'm not sure why the number is two, but it is good practise to keep the
number of concurrent connections you open to one site to a minimum, so
that you don't overload the site in question. The WebClient
automatically makes sure you don't open too many connections.
I checked the HTTP/1.1 protocol. In RFC 2616 section 8.1.4, it reads,
Clients that use persistent connections SHOULD limit the number of
simultaneous connections that they maintain to a given server. A
single-user client SHOULD NOT maintain more than 2 connections with
any server or proxy. A proxy SHOULD use up to 2*N connections to
another server or proxy, where N is the number of simultaneously
active users. These guidelines are intended to improve HTTP response
times and avoid congestion.
I believe that is why the .NET Framework limits connections to one host
to no more than 2.
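If the limit of 2 is too tight for something like a crawler, the classic .NET Framework networking stack exposes it through ServicePointManager. What follows is only a minimal sketch, assuming the HttpWebRequest stack used in this thread; the class name ConnectionLimitDemo, both method names, and whatever limit value you pass in are made up for illustration, and a crawler should still stay polite towards any single server.

```csharp
using System;
using System.Net;

// Hypothetical helper class; only ServicePointManager and ServicePoint
// are real framework types here.
public static class ConnectionLimitDemo
{
    // Sets the default per-host connection limit used for service points
    // created after this call, and returns the value now in effect.
    public static int RaiseDefaultLimit(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    // Sets the limit for one host only, leaving the default untouched.
    public static int RaiseLimitForHost(string uri, int limit)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(new Uri(uri));
        servicePoint.ConnectionLimit = limit;
        return servicePoint.ConnectionLimit;
    }
}
```

Calling ConnectionLimitDemo.RaiseDefaultLimit(10) at the start of Main, before the first request is created, should let more of the queued requests proceed at once. Closing each response is still needed either way.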
By the way, you shouldn't just call Close after you've gotten the
webresponse. If anything happens in between, the connection is likely to
remain open for some time, which is not what you would want. To make
sure it is closed in time, add a using statement:
using System.Threading;
using System.IO;
using System.Net;
using System;
public class TestConsoleApp
{
    public static void Main(string[] args)
    {
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWeb), idx);
        }
        Console.ReadLine();
    }
    private static void testWeb(object idx)
    {
        string uri = "http://www.gmail.com";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("in thread " + idx);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.ContentType + "; idx = " + (int)idx);
        }
    }
}
The WebResponse is then automatically closed once it goes out of scope.
That's cool. Thanks.
But how does the CLR know that response.Close() should be called
when the response goes out of scope? Does the CLR always call **.Close()
for the using keyword?
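As far as I know, the work is done by the C# compiler rather than the CLR: a using statement is rewritten into try/finally, and the finally block calls IDisposable.Dispose() on the object. For WebResponse, Dispose() in turn closes the response. A minimal sketch of that rewrite follows; FakeResponse and UsingDemo are hypothetical stand-ins invented for this example, not framework types.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a response object; it only records what happens to it.
public sealed class FakeResponse : IDisposable
{
    private readonly List<string> _log;

    public FakeResponse(List<string> log)
    {
        _log = log;
        _log.Add("open");
    }

    public void Dispose()
    {
        _log.Add("dispose");
    }
}

public static class UsingDemo
{
    // Written with the using statement.
    public static List<string> WithUsing()
    {
        List<string> log = new List<string>();
        using (FakeResponse response = new FakeResponse(log))
        {
            log.Add("body");
        }
        return log;
    }

    // Roughly what the compiler generates for the method above.
    public static List<string> Expanded()
    {
        List<string> log = new List<string>();
        FakeResponse response = new FakeResponse(log);
        try
        {
            log.Add("body");
        }
        finally
        {
            // Runs even if the body throws, which is the point of using.
            if (response != null)
            {
                ((IDisposable)response).Dispose();
            }
        }
        return log;
    }
}
```

Both methods record the same sequence: open, body, dispose. So using does not look for a Close() method; it works on any type that implements IDisposable.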
You can see that this only happens if you try to open many connections
to the same website. I've altered your test to show this:
using System.Threading;
using System.IO;
using System.Net;
using System;
public class TestConsoleApp
{
    // Ten different hosts.
    private static string[] _urls = new string[]
    {
        "http://www.gmail.com",
        "http://www.google.com",
        "http://www.google.co.uk",
        "http://www.google.nl",
        "http://www.google.ie",
        "http://www.google.de",
        "http://www.amazon.com",
        "http://www.microsoft.com",
        "http://www.tweakers.net",
        "http://www.cnn.com"
    };
    // Only three hosts, each requested more than twice.
    private static string[] _urlsSame = new string[]
    {
        "http://www.gmail.com",
        "http://www.gmail.com",
        "http://www.gmail.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.google.com",
        "http://www.google.com",
        "http://www.google.com"
    };
    public static void Main(string[] args)
    {
        Console.WriteLine("Test A");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWebWorking), _urls[idx]);
        }
        Console.ReadLine();
        Console.WriteLine("Test B");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWebFaulty), _urls[idx]);
        }
        Console.ReadLine();
        // Second Test B: responses are left open and the hosts repeat,
        // so requests beyond two per host are expected to block.
        Console.WriteLine("Test B");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWebFaulty), _urlsSame[idx]);
        }
        Console.ReadLine();
    }
    // Closes the response through a using statement.
    private static void testWebWorking(object url)
    {
        string uri = (string)url;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("opening: " + uri);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.ContentType + "; uri = " + uri);
        }
    }
    // Never closes the response, so the connection stays occupied.
    private static void testWebFaulty(object url)
    {
        string uri = (string)url;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("opening: " + uri);
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        Console.WriteLine(response.ContentType + "; uri = " + uri);
    }
}
Test A works regardless of which URI you feed it.
Test B only works if there are not too many connections to the same
server (the first Test B will succeed, the second will fail).
Jesse Houwing