Joerg, thanks for taking the time to respond to my request for help.
I'm sorry about the comment. I believe the comment about 256 bytes
belonged to the original example from which I copied the code. The
2000-byte buffer is the actual size I was using in the program from
which I derived the sample code that illustrates my problem in this
newsgroup.
I did not notice that the entire page of data does not download. That
is a good catch. The code I wrote using Studio 6/C++ does not have
that problem (see original post).
I was able to implement the asynchronous approach using
WebRequest.BeginGetResponse() and WebRequest.EndGetResponse() as you
suggested, and I note that it also fails to download the entire page
of data, just as you described.
What my program does is start downloading the data up to a point and
then close the connection. The reason for closing the connection is
that it takes so long to get the entire amount of data, and there is
not always a need for all of it. The implementation I have in
Studio 6/C++ does this and works perfectly. It is very disappointing
that .NET/C# does not.
Were you able to get better results using the socket approach?
How about some of you Microsoft gurus taking a look at this problem
and answering the following two questions:
1) What do you do to download the entire page of data?
2) What can you do to close the connection with zero delay (after
reading one or more 2000-byte buffers of data)?
It is easy to set up this experiment: cut and paste the sample code
I gave into a button event of a simple Windows app.
Thanks in advance!
"Joerg Jooss" <jo*********@gmx.net> wrote in message news:<ei*************@tk2msftngp13.phx.gbl>...
No_Excuses wrote:
All,
I am interested in reading the text of a web page and parsing it.
After searching on this newsgroup I decided to use the following:
******************************* START OF CODE ************************
String sTemp =
"http://cgi3.igl.net/cgi-bin/ladder/teamsql/team_view.cgi?ladd=teamknights&num=238&showall=1";
WebRequest myWebRequest = WebRequest.Create(sTemp);
WebResponse myWebResponse = myWebRequest.GetResponse();
Stream myStream = myWebResponse.GetResponseStream();
// default encoding is utf-8
StreamReader SR = new StreamReader( myStream );
Char[] buffer = new Char[2048];
// Read 256 characters at a time.
int count = SR.Read( buffer, 0, 2000 );
//while (count > 0)
//{
// do some processing - may read all or part
// count = SR.Read(buffer, 0, 2000);
//}
SR.Close(); // Release the resources
myWebResponse.Close();
******************************* END OF CODE ************************
This code should look very familiar because it is all over the
newsgroup and Microsoft support help pages.
I doubt that, as the code doesn't do what it advertises ;-)
Char[] buffer = new Char[2048];
// Read 256 characters at a time.
int count = SR.Read( buffer, 0, 2000 );
Why a 2 kB buffer, when you're supposedly reading only 256 chars, but you're
specifying 2000 chars for the Read() call?
The web page has a big table on it and it takes a while to download
(even with a cable modem).
What I observe is the following. If I open and read all the data
(i.e. until count > 0 fails), then stepping over SR.Close() is
immediate. If I read only 2000 bytes, as the above example shows, when
I step over SR.Close() it takes a long time (for me around 10-15
seconds). This may be a coincidence, but it seems to take the same
amount of time as if I were reading all of the data.
Well, this particular page is an insane 6 MB... the web server does
not help the client either, as there's no Content-Length header provided,
just Connection: close:
HTTP/1.1 200 OK
Date: Sat, 10 Apr 2004 10:20:31 GMT
Server: Apache/1.3.24 (Unix) mod_throttle/3.1.2 PHP/4.2.0
Connection: close
Content-Type: text/html
Even more interestingly, I cannot even download the entire page at all...
neither WebClient nor WebRequest/WebResponse are able to download that
beast. Both stop downloading at the exact same position -- I guess the
underlying TCP stream is prematurely closed. This must be some WinInet
default behaviour (quirk?), as the same thing happens to me when I download
the page using some ancient Visual J++ code that uses plain TCP. I think
I'll write some plain HTTP client using System.Net.Sockets and see what
happens.
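[A minimal sketch of what such a plain-sockets experiment might look like, for anyone who wants to try it; the host and path are placeholders. HTTP/1.0 with "Connection: close" keeps the exchange simple -- the server just closes the socket when the body ends, with no chunked encoding to parse:]

```csharp
using System;
using System.Net.Sockets;
using System.Text;

class RawHttpGet
{
    static void Main()
    {
        // Placeholder host and path -- adjust to the page under test.
        string host = "example.com";
        string path = "/";

        TcpClient client = new TcpClient(host, 80);
        NetworkStream stream = client.GetStream();

        // Plain HTTP/1.0 request; the server will close when done.
        string request = "GET " + path + " HTTP/1.0\r\n" +
                         "Host: " + host + "\r\n" +
                         "Connection: close\r\n\r\n";
        byte[] requestBytes = Encoding.ASCII.GetBytes(request);
        stream.Write(requestBytes, 0, requestBytes.Length);

        // Count what actually arrives on the wire. Closing the socket
        // at any point here costs nothing, because there is no
        // higher-level machinery trying to drain the response.
        byte[] buffer = new byte[2000];
        int total = 0;
        int n;
        while ((n = stream.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += n;
        }
        Console.WriteLine("Received {0} bytes", total);
        client.Close();
    }
}
```

Comparing this byte count against what WebResponse delivers would show whether the truncation happens on the wire or inside the higher-level stack.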
(Note: If the web server returns a Content-Length header, downloading the
page works just fine.)
[...] Does anyone know how to terminate the loading of the page so I can
eliminate the delay? I had implemented this in C++ with MFC using
CInternetSession.OpenURL() and did not have this problem.
Use asynchronous I/O -- see WebRequest.Abort(),
WebRequest.BeginGetResponse(), and WebRequest.EndGetResponse().
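[The asynchronous pattern suggested here might be sketched roughly as follows -- a bounded wait on the async handle, with Abort() as the escape hatch; the URL and timeout are illustrative placeholders:]

```csharp
using System;
using System.IO;
using System.Net;

class AsyncFetch
{
    static void Main()
    {
        // Placeholder URL.
        HttpWebRequest request =
            (HttpWebRequest) WebRequest.Create("http://example.com/");
        IAsyncResult ar = request.BeginGetResponse(null, null);

        // Wait a bounded time for the response to start arriving;
        // Abort() on timeout cancels without the long drain on Close().
        if (!ar.AsyncWaitHandle.WaitOne(10000, false))
        {
            request.Abort();
            Console.WriteLine("Timed out, request aborted.");
            return;
        }

        WebResponse response = request.EndGetResponse(ar);
        StreamReader reader = new StreamReader(response.GetResponseStream());
        Console.WriteLine("Read {0} chars", reader.ReadToEnd().Length);
        reader.Close();
        response.Close();
    }
}
```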
Cheers,