Why is HttpWebRequest.GetResponse() stuck?

I happened to surf to
http://www.codeproject.com/cs/internet/Crawler.asp, which claims that
WebRequest.GetResponse() will block other threads calling this function
until WebResponse.Close() is called.

I did some experimentation.

public static void Main(string[] args)
{
    for (int idx = 0; idx < 10; ++idx)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(testWeb), idx);
    }
    // Thread-pool threads are background threads, so keep the process alive.
    Console.ReadLine();
}

private static void testWeb(object idx)
{
    string uri = "http://www.gmail.com";
    HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
    Console.WriteLine("in thread " + idx);
    HttpWebResponse response = (HttpWebResponse)request.GetResponse();
    Console.WriteLine(response.ContentType + "; idx = " + (int)idx);
    // response.Close();
}

The code runs with output like below:
in thread 0
in thread 1
text/html; charset=UTF-8; idx=0
text/html; charset=UTF-8; idx=1
in thread 2
in thread 3
in thread 4
in thread 5
in thread 6
in thread 7
in thread 8
in thread 9
"idx" may take other values, but only 2 threads ever get through
GetResponse(). It seems the other 8 threads are stuck at
HttpWebRequest.GetResponse().

After I uncomment the line "response.Close()", it prints the expected 20
lines. Something must be held by the HttpWebResponse until it is
closed.

Does an HttpWebResponse instance occupy some resource of which only 2
instances are available? If so, it is a real issue for applications
that need many WebResponse instances, e.g. a web crawler.

Oct 16 '06 #1
Morgan Cheng wrote:
> Does an HttpWebResponse instance occupy some resource of which only 2
> instances are available? If so, it is a real issue for applications
> that need many WebResponse instances, e.g. a web crawler.

The response stream is left open for you to examine the data returned by
the WebResponse, but the data is only actually downloaded should you need
it (to prevent unneeded data transfers and to make sure you get the
response within a reasonable amount of time).

I'm not sure why the number is two, but it is good practice to keep the
number of concurrent connections you open to one site to a minimum, so
that you don't overload the site in question. The WebClient
automatically makes sure you don't open too many connections.

By the way, you shouldn't just call Close after you've gotten the
WebResponse. If anything goes wrong in between (an exception, for
example), the connection is likely to remain open for some time, which
is not what you want. To make sure it is closed in time, add a using
statement:

using System;
using System.IO;
using System.Net;
using System.Threading;

public class TestConsoleApp
{
    public static void Main(string[] args)
    {
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new WaitCallback(testWeb), idx);
        }
        Console.ReadLine();
    }

    private static void testWeb(object idx)
    {
        string uri = "http://www.gmail.com";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("in thread " + idx);
        // The using statement disposes (and so closes) the response on exit.
        using (HttpWebResponse response =
            (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.ContentType + "; idx = " +
                (int)idx);
        }
    }
}

The WebResponse is then automatically closed once it goes out of scope.

You can see that this only happens if you try to open many connections
to the same website. I've altered your test to show this:

using System;
using System.IO;
using System.Net;
using System.Threading;

public class TestConsoleApp
{
    // Ten different hosts.
    private static string[] _urls = new string[]
    {
        "http://www.gmail.com",
        "http://www.google.com",
        "http://www.google.co.uk",
        "http://www.google.nl",
        "http://www.google.ie",
        "http://www.google.de",
        "http://www.amazon.com",
        "http://www.microsoft.com",
        "http://www.tweakers.net",
        "http://www.cnn.com"
    };

    // Only three hosts, each requested several times.
    private static string[] _urlsSame = new string[]
    {
        "http://www.gmail.com",
        "http://www.gmail.com",
        "http://www.gmail.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.cnn.com",
        "http://www.google.com",
        "http://www.google.com",
        "http://www.google.com"
    };

    public static void Main(string[] args)
    {
        // Test A: responses are closed.
        Console.WriteLine("Test A");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new
                WaitCallback(testWebWorking), _urls[idx]);
        }

        Console.ReadLine();

        // First test B: responses are not closed, but every host is different.
        Console.WriteLine("Test B");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new
                WaitCallback(testWebFaulty), _urls[idx]);
        }

        Console.ReadLine();

        // Second test B: responses are not closed and the hosts repeat.
        // The original called testWebWorking here, which cannot fail;
        // testWebFaulty matches the "Test B" label and the description below.
        Console.WriteLine("Test B");
        for (int idx = 0; idx < 10; ++idx)
        {
            ThreadPool.QueueUserWorkItem(new
                WaitCallback(testWebFaulty), _urlsSame[idx]);
        }

        Console.ReadLine();
    }

    // Closes the response through a using statement.
    private static void testWebWorking(object url)
    {
        string uri = (string)url;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("opening: " + uri);
        using (HttpWebResponse response =
            (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(response.ContentType + "; uri = " + uri);
        }
    }

    // Never closes the response, so the connection stays in use.
    private static void testWebFaulty(object url)
    {
        string uri = (string)url;
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(uri);
        request.KeepAlive = false;
        Console.WriteLine("opening: " + uri);
        HttpWebResponse response = (HttpWebResponse)request.GetResponse();
        Console.WriteLine(response.ContentType + "; uri = " + uri);
    }
}

Test A works regardless of which URI you feed it.
Test B only works if there are not too many connections to the same
server (the first test B will succeed, the second test B will fail).

Jesse Houwing
Oct 16 '06 #2
Isn't there some OS rule that prevents you from opening more than 2
connections to a domain at once? I'm sure there is. How that directly
affects your code, I'm not sure.

Oct 16 '06 #3

Jesse Houwing wrote:
> The response stream is left open for you to examine the data returned by
> the WebResponse, but the data is only actually downloaded should you need
> it (to prevent unneeded data transfers and to make sure you get the
> response within a reasonable amount of time).

Do you mean that HttpWebRequest.GetResponse() doesn't download the URI
resource to the local machine? I tried fetching a big resource:
GetResponse() takes time, while response.GetResponseStream() returns
immediately. I believe the downloading happens in GetResponse().

> I'm not sure why the number is two, but it is good practice to keep the
> number of concurrent connections you open to one site to a minimum, so
> that you don't overload the site in question. The WebClient
> automatically makes sure you don't open too many connections.

I checked the HTTP/1.1 protocol. RFC 2616, section 8.1.4, reads:

    Clients that use persistent connections SHOULD limit the number of
    simultaneous connections that they maintain to a given server. A
    single-user client SHOULD NOT maintain more than 2 connections with
    any server or proxy. A proxy SHOULD use up to 2*N connections to
    another server or proxy, where N is the number of simultaneously
    active users. These guidelines are intended to improve HTTP response
    times and avoid congestion.

I believe that is why the .NET Framework limits connections to one host
to no more than 2.
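
The limit described above can be inspected without sending any request. The following is a minimal sketch, assuming the .NET Framework ServicePointManager API; the class and method names (ConnectionLimitInfo, LimitFor) are made up for illustration. FindServicePoint only looks up the object that manages connections to a host, and its ConnectionLimit is the number of simultaneous connections allowed to that host.

```csharp
using System;
using System.Net;

public static class ConnectionLimitInfo
{
    // Returns the connection limit of the ServicePoint that would handle
    // requests to the given URI. This does not issue an HTTP request.
    public static int LimitFor(string uri)
    {
        ServicePoint servicePoint = ServicePointManager.FindServicePoint(new Uri(uri));
        return servicePoint.ConnectionLimit;
    }

    public static void Main()
    {
        // Framework constant for persistent (keep-alive) connections.
        Console.WriteLine("DefaultPersistentConnectionLimit = "
            + ServicePointManager.DefaultPersistentConnectionLimit);

        // The limit new ServicePoint objects will receive in this process.
        Console.WriteLine("DefaultConnectionLimit = "
            + ServicePointManager.DefaultConnectionLimit);

        // The limit that applies to the host used in the experiment.
        Console.WriteLine("Limit for www.gmail.com = "
            + LimitFor("http://www.gmail.com"));
    }
}
```

If the last value printed is 2, that would match the behaviour seen in the experiment: two requests hold the two allowed connections and the rest wait inside GetResponse().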
> By the way, you shouldn't just call Close after you've gotten the
> WebResponse. If anything happens in between the connection is likely to
> remain open for some time, which is not what you would want. To make
> sure it is closed in time add a using statement:


> The WebResponse is then automatically closed once it goes out of scope.

That's cool. Thanks.
But how does the CLR know that response.Close() should be called
when the response goes out of scope? Does the CLR always call a Close()
method for the using keyword?
Oct 17 '06 #4
According to MSDN, the 2-connection limit can be changed by setting
ServicePointManager.DefaultConnectionLimit. By default, the property is
2 because HTTP/1.1 recommends that limit.

However, I tried changing the property and found that it does not
always work. With the code below, sometimes all 5 requests go through
and sometimes they don't. The trick does not seem to work reliably.

public static void Main(string[] args)
{
    ServicePointManager.DefaultConnectionLimit = 20;

    for (int i = 0; i < 5; ++i)
    {
        ThreadPool.QueueUserWorkItem(new WaitCallback(testWeb), i);
    }
    Thread.Sleep(10 * 1000);
}

private static void testWeb(object idx)
{
    string url = "http://www.google.com";
    HttpWebRequest request =
        (HttpWebRequest)WebRequest.Create(url);
    request.KeepAlive = false;
    Console.WriteLine("enter " + idx);

    HttpWebResponse response =
        (HttpWebResponse)request.GetResponse();
    Console.WriteLine(url + " ; idx = " + (int)idx);
}
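
One possible explanation for the unstable result, offered as an assumption and not as a diagnosis: Main sleeps for only 10 seconds, so slow requests may not finish before the process exits, and the responses are never closed. The sketch below keeps the same URL and limit of 20, but waits for every work item and disposes each response. ParallelFetch, RunAll and Fetch are names invented here.

```csharp
using System;
using System.Net;
using System.Threading;

public static class ParallelFetch
{
    // Queues `count` work items and blocks until every one has finished.
    // Returns the number of items that completed without throwing.
    public static int RunAll(int count, Action<int> work)
    {
        if (count <= 0) return 0;

        int pending = count;
        int succeeded = 0;
        ManualResetEvent allDone = new ManualResetEvent(false);

        for (int i = 0; i < count; ++i)
        {
            ThreadPool.QueueUserWorkItem(delegate(object state)
            {
                try
                {
                    work((int)state);
                    Interlocked.Increment(ref succeeded);
                }
                catch (Exception ex)
                {
                    Console.WriteLine("idx = " + state + " failed: " + ex.Message);
                }
                finally
                {
                    // The last item to finish releases the waiting caller.
                    if (Interlocked.Decrement(ref pending) == 0) allDone.Set();
                }
            }, i);
        }

        allDone.WaitOne();
        allDone.Close();
        return succeeded;
    }

    // One request; the using statement closes the response on every path.
    private static void Fetch(int idx)
    {
        string url = "http://www.google.com";
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        Console.WriteLine("enter " + idx);
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        {
            Console.WriteLine(url + " ; idx = " + idx
                + " ; status = " + (int)response.StatusCode);
        }
    }

    public static void Main()
    {
        // Set the limit before the first request: it only affects
        // ServicePoint objects created after the change.
        ServicePointManager.DefaultConnectionLimit = 20;

        int ok = RunAll(5, Fetch);
        Console.WriteLine(ok + " of 5 requests completed");
    }
}
```

Waiting on an event instead of sleeping means a slow response is reported as slow, not silently lost when the process exits, and any WebException shows up in the output.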

Steven Nagy wrote:
> Isn't there some OS rule that prevents you from opening more than 2
> connections to a domain at once? I'm sure there is. How that directly
> affects your code, I'm not sure.
Oct 17 '06 #5
>> The WebResponse is then automatically closed once it goes out of scope.
> That's cool. Thanks.
> But how does the CLR know that response.Close() should be called
> when the response goes out of scope? Does the CLR always call a Close()
> method for the using keyword?

It actually calls IDisposable.Dispose, which is implemented explicitly.
Dispose in turn calls the Close method.
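
To make that concrete: it is the C# compiler, not the CLR, that turns a using statement into a try/finally block whose finally part calls Dispose. The sketch below shows the two equivalent forms. FakeResponse is a hypothetical stand-in for WebResponse (no network involved) that, like the description above, implements Dispose explicitly and forwards it to Close.

```csharp
using System;

// Hypothetical stand-in for WebResponse, used only for illustration.
public sealed class FakeResponse : IDisposable
{
    private bool _closed;

    public bool Closed { get { return _closed; } }

    public void Close() { _closed = true; }

    // Explicit interface implementation: only reachable through IDisposable.
    void IDisposable.Dispose() { Close(); }
}

public static class UsingDemo
{
    // The form written in source code.
    public static bool ClosedAfterUsing()
    {
        FakeResponse response = new FakeResponse();
        using (response)
        {
            // Work with the response here.
        }
        return response.Closed;
    }

    // Roughly what the compiler generates for the method above.
    public static bool ClosedAfterTryFinally()
    {
        FakeResponse response = new FakeResponse();
        try
        {
            // Work with the response here.
        }
        finally
        {
            if (response != null) ((IDisposable)response).Dispose();
        }
        return response.Closed;
    }

    public static void Main()
    {
        Console.WriteLine(ClosedAfterUsing());      // prints True
        Console.WriteLine(ClosedAfterTryFinally()); // prints True
    }
}
```

Because the Dispose call sits in a finally block, the response is closed even if an exception is thrown inside the using body.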

Jesse
Oct 17 '06 #6
Perhaps that property is only for managing incoming connections and
restricting the number of connections your app will accept?

I found info here:
http://msdn.microsoft.com/library/de...tnetarch14.asp

If the remote site is not configured to accept more than 2 connections
per client, then I can't see how you can get around this problem.

Steven

Oct 18 '06 #7

Steven Nagy wrote:
> Perhaps that property is only for managing incoming connections and
> restricting the number of connections your app will accept?

I believe the property is for outgoing connections.

> I found info here:
> http://msdn.microsoft.com/library/de...tnetarch14.asp

As I understand this article, the connection setting is on the
client side:
"server. For simulating multiple clients sending simultaneous requests
to the remote object, we changed the default of 2 to 100 connections to
the server per client using the client's configuration".

> If the remote site is not configured to accept more than 2 connections
> per client, then I can't see how you can get around this problem.

Yes, a web server COULD limit connections from the same IP address, but
normally it won't. Since many clients are behind a proxy, limiting
connections from one IP address (in this case, from one proxy) would
affect all clients behind that proxy.

Actually, I believe webmasters prefer more web access :)
Since almost all clients have this 2-connections-per-host limit, a
server might want to get around it. I heard from someone that Yahoo! has
some technique to trick browsers into accessing its hosts with more than
2 connections. It is not clear to me how Yahoo! does it.

> Steven
Oct 18 '06 #8
Not sure if this is relevant, but I remember something at Tech Ed about
Virtual Earth being spread across multiple domains so that IE could
open many connections at once to get the data back faster, as opposed
to just one domain restricting to 2 connections.
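
The idea can be illustrated with a small sketch. It is not how Virtual Earth or Yahoo! actually do it, which this thread does not say. Since the limit applies per host, counting URLs per host shows how many requests compete for the same pair of connections. The example.com host names and the HostGrouping class are hypothetical, and grouping by host name alone is a simplification.

```csharp
using System;
using System.Collections.Generic;

public static class HostGrouping
{
    // Counts how many URLs point at each host name (case-insensitive).
    public static Dictionary<string, int> CountByHost(IEnumerable<string> urls)
    {
        Dictionary<string, int> counts =
            new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        foreach (string url in urls)
        {
            string host = new Uri(url).Host;
            int current;
            counts.TryGetValue(host, out current); // current is 0 when absent
            counts[host] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        // Hypothetical tile URLs spread over two host names.
        string[] urls =
        {
            "http://t0.example.com/tile/1",
            "http://t0.example.com/tile/2",
            "http://t0.example.com/tile/3",
            "http://t1.example.com/tile/4",
            "http://t1.example.com/tile/5"
        };

        foreach (KeyValuePair<string, int> pair in CountByHost(urls))
        {
            Console.WriteLine(pair.Key + ": " + pair.Value + " request(s)");
        }
    }
}
```

With a limit of 2 connections per host, the five requests above could use up to four connections at once instead of two, which is the speed-up the post describes.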

Oct 18 '06 #9
