Re: Why WebHttpRequest.GetResponse() stuck?

Morgan Cheng wrote:
I happened to come across
http://www.codeproject.com/cs/internet/Crawler.asp, which claims that
WebRequest.GetResponse() will block other threads calling this function
until WebResponse.Close() is called.

I did some experimentation.

public static void Main(string[] args)
{
    for (int idx = 0; idx < 10; ++idx)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(testWeb), idx);
    }
    // Pool threads are background threads; wait so they can finish.
    Console.ReadLine();
}

private static void testWeb(object idx)
{
    string uri = "http://www.gmail.com";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    Console.WriteLine("in thread " + idx);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    Console.WriteLine(response.ContentType + "; idx = " + (int)idx);
    // response.Close();
}

The code produces output like this:
in thread 0
in thread 1
text/html; charset=UTF-8; idx=0
text/html; charset=UTF-8; idx=1
in thread 2
in thread 3
in thread 4
in thread 5
in thread 6
in thread 7
in thread 8
in thread 9


The "idx" values may differ between runs, but only 2 threads ever get
through GetResponse(). It seems the other 8 threads are stuck at
HttpWebRequest.GetResponse().

After I uncomment the line "response.Close()", it prints the expected 20
lines. There must be something held by the HttpWebResponse until it is
closed.

Does an HttpWebResponse instance hold some resource of which only 2
instances are available? If so, it is a real issue for applications
that need many WebResponse instances, e.g. a web crawler.



The response stream is left open so that you can examine the data returned by the WebResponse, but the body is only actually downloaded should you need it (to prevent unneeded data transfers and to make sure you get the response within a reasonable amount of time).

I'm not sure why the number is two, but it is good practice to keep the number of concurrent connections you open to one site to a minimum, so that you don't overload the site in question. The WebClient automatically makes sure you don't open too many connections.
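
A likely source of the number two, offered as an assumption rather than something confirmed in this thread: the .NET Framework documents a per-host connection limit, ServicePointManager.DefaultConnectionLimit, whose default is 2 for applications that are not hosted in ASP.NET. The sketch below only reads and raises that limit; it makes no network calls. The class name ConnectionLimitSketch and the helper RaiseDefaultLimit are made up for illustration, and on newer .NET versions ServicePointManager is marked obsolete.

```csharp
using System;
using System.Net;

public static class ConnectionLimitSketch
{
    // Hypothetical helper: raises the process-wide default and returns the
    // value now in effect. Per the documentation, only ServicePoint objects
    // created after the change pick up the new default.
    public static int RaiseDefaultLimit(int limit)
    {
        ServicePointManager.DefaultConnectionLimit = limit;
        return ServicePointManager.DefaultConnectionLimit;
    }

    public static void Main()
    {
        Console.WriteLine("default limit: " + ServicePointManager.DefaultConnectionLimit);
        Console.WriteLine("new default:   " + RaiseDefaultLimit(10));

        // Per-host view: each host gets a ServicePoint carrying its own limit.
        // FindServicePoint does not open a connection.
        ServicePoint sp = ServicePointManager.FindServicePoint(new Uri("http://www.gmail.com"));
        Console.WriteLine("limit for " + sp.Address.Host + ": " + sp.ConnectionLimit);
    }
}
```

Raising the limit only hides the symptom. Responses that are never closed still hold their connections, so closing them remains the real fix.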

By the way, you shouldn't just call Close after you've gotten the WebResponse. If an exception is thrown in between, Close is never reached and the connection is likely to remain open for some time, which is not what you want. To make sure it is closed in time, add a using statement:

using System.Threading;
using System.IO;
using System.Net;
using System;

public class TestConsoleApp
{
    public static void Main(string[] args)
    {
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWeb), idx);
        }
        Console.ReadLine();
    }

    private static void testWeb(object idx)
    {
        string uri = "http://www.gmail.com";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("in thread " + idx);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.ContentType + "; idx = " + (int)idx);
        }
    }
}

The WebResponse is then automatically closed as soon as execution leaves the using block, even if an exception is thrown inside it.
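
To illustrate why the using statement is safer than a bare Close call, here is a minimal sketch that needs no network. FakeResponse is a hypothetical stand-in for HttpWebResponse (it is not a framework type) and simply records whether Dispose was called. The exception simulates something going wrong between GetResponse() and Close().

```csharp
using System;

// Hypothetical stand-in for a response object; not a framework type.
public sealed class FakeResponse : IDisposable
{
    public bool Closed { get; private set; }

    public void Dispose()
    {
        Closed = true;
    }
}

public static class UsingSketch
{
    // Returns true if the response was disposed even though the code
    // inside the using block threw before reaching its end.
    public static bool ClosedAfterFailure()
    {
        FakeResponse response = new FakeResponse();
        try
        {
            using (response)
            {
                throw new InvalidOperationException("simulated failure while reading");
            }
        }
        catch (InvalidOperationException)
        {
            // Swallowed for the demo; Dispose has already run at this point.
        }
        return response.Closed;
    }

    public static void Main()
    {
        Console.WriteLine("closed after failure: " + ClosedAfterFailure());
    }
}
```

The compiler turns the using block into a try/finally that calls Dispose, which is why the fake response ends up closed here.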

You can see that this only happens if you try to open many connections to the same website. I've altered your test to show this:

using System.Threading;
using System.IO;
using System.Net;
using System;

public class TestConsoleApp
{
    // Ten different hosts.
    private static string[] _urls = new string[]
    {
        "http://www.gmail.com",
        "http://www.google.com",
        "http://www.google.co.uk",
        "http://www.google.nl",
        "http://www.google.ie",
        "http://www.google.de",
        "http://www.amazon.com",
        "http://www.microsoft.com",
        "http://www.tweakers.net",
        "http://www.cnn.com"
    };

    // Only three hosts, each requested several times.
    private static string[] _urlsSame = new string[]
    {
        "http://www.gmail.com",
        "http://www.gmail.com",
        "http://www.gmail.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.google.com",
        "http://www.google.com",
        "http://www.google.com"
    };

    public static void Main(string[] args)
    {
        Console.WriteLine("Test A");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWebWorking), _urls[idx]);
        }

        Console.ReadLine();

        Console.WriteLine("Test B (different hosts)");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWebFaulty), _urls[idx]);
        }

        Console.ReadLine();

        // The posted code called testWebWorking here; testWebFaulty matches
        // the description of the second Test B run, which is expected to stall.
        Console.WriteLine("Test B (same hosts)");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWebFaulty), _urlsSame[idx]);
        }

        Console.ReadLine();
    }

    // Closes the response via using, so the connection is released.
    private static void testWebWorking(object url)
    {
        string uri = (string)url;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("opening: " + uri);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.ContentType + "; uri = " + uri);
        }
    }

    // Never closes the response, so the connection stays occupied.
    private static void testWebFaulty(object url)
    {
        string uri = (string)url;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("opening: " + uri);
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        Console.WriteLine(response.ContentType + "; uri = " + uri);
    }
}

Test A works regardless of which URI you feed it.
Test B only works if there are not too many connections to the same server (the first Test B run will succeed, the second will fail).
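
As a rough way to see which hosts in a URL list are at risk, the sketch below counts requests per host. It assumes the limit is 2 per host, as observed earlier in the thread, and approximates "same server" by host name only. HostCountSketch and CountByHost are hypothetical names used for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class HostCountSketch
{
    // Counts how many URLs in the list point at each host name.
    public static Dictionary<string, int> CountByHost(IEnumerable<string> urls)
    {
        Dictionary<string, int> counts = new Dictionary<string, int>();
        foreach (string url in urls)
        {
            string host = new Uri(url).Host;
            int seen;
            counts.TryGetValue(host, out seen); // seen stays 0 for a new host
            counts[host] = seen + 1;
        }
        return counts;
    }

    public static void Main()
    {
        string[] urlsSame =
        {
            "http://www.gmail.com", "http://www.gmail.com", "http://www.gmail.com",
            "http://www.cnn.com", "http://www.cnn.com", "http://www.cnn.com",
            "http://www.cnn.com",
            "http://www.google.com", "http://www.google.com", "http://www.google.com"
        };

        foreach (KeyValuePair<string, int> entry in CountByHost(urlsSame))
        {
            // Assumed limit of 2 concurrent connections per host.
            string note = entry.Value > 2 ? " (more than 2, unclosed responses will block the rest)" : "";
            Console.WriteLine(entry.Key + ": " + entry.Value + note);
        }
    }
}
```

Every host in the second list is requested more than twice, which is consistent with the faulty variant stalling on it.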

Jesse Houwing