Read contents of a web page

Hi All,

I am trying to read the contents of a page through its URL.

My code snippet is as follows:
public void mtdGetPageDataHWR()
{
    HttpWebRequest objRequ =
        (HttpWebRequest)WebRequest.Create("http://www.microsoft.com");
    // Dispose the response and the reader so the connection is released.
    using (HttpWebResponse objResp = (HttpWebResponse)objRequ.GetResponse())
    using (StreamReader objRd = new StreamReader(objResp.GetResponseStream()))
    {
        string strVersion = objResp.ProtocolVersion.ToString();
        // Copy the page to the ASP.NET response one line at a time.
        string strRd = objRd.ReadLine();
        while (strRd != null)
        {
            Response.Write(strRd);
            strRd = objRd.ReadLine();
        }
    }
}

Is there any other way to achieve this that could be more efficient or
faster?

Any help on this would be very handy.

Thanks,

Kuldeep
Oct 24 '06 #1
You should use myHttpWebRequest.BeginGetResponse to make the request
asynchronously, and retrieve the response from EndGetResponse.

Here is an example from MSDN on how to do that.
using System;
using System.Net;
using System.IO;
using System.Text;
using System.Threading;

public class RequestState
{
    // This class stores the state of the request.
    const int BUFFER_SIZE = 1024;
    public StringBuilder requestData;
    public byte[] BufferRead;
    public HttpWebRequest request;
    public HttpWebResponse response;
    public Stream streamResponse;

    public RequestState()
    {
        BufferRead = new byte[BUFFER_SIZE];
        requestData = new StringBuilder("");
        request = null;
        streamResponse = null;
    }
}

class HttpWebRequest_BeginGetResponse
{
    public static ManualResetEvent allDone = new ManualResetEvent(false);
    const int BUFFER_SIZE = 1024;
    const int DefaultTimeout = 2 * 60 * 1000; // 2 minutes timeout

    // Abort the request if the timer fires.
    private static void TimeoutCallback(object state, bool timedOut)
    {
        if (timedOut)
        {
            HttpWebRequest request = state as HttpWebRequest;
            if (request != null)
            {
                request.Abort();
            }
        }
    }

    static void Main()
    {
        try
        {
            // Create an HttpWebRequest object for the desired URL.
            HttpWebRequest myHttpWebRequest =
                (HttpWebRequest)WebRequest.Create("http://www.contoso.com");

            /*
             * If you are behind a firewall and you do not have your browser
             * proxy set up, you need to use the following proxy creation code.
             *
             * // Create a proxy object.
             * WebProxy myProxy = new WebProxy();
             *
             * // Associate a new Uri object with the proxy object, using the
             * // proxy address selected by the user.
             * myProxy.Address = new Uri("http://myproxy");
             *
             * // Finally, set the Web request object's Proxy property to the
             * // proxy object.
             * myHttpWebRequest.Proxy = myProxy;
             */

            // Create an instance of RequestState and assign the previous
            // myHttpWebRequest object to its request field.
            RequestState myRequestState = new RequestState();
            myRequestState.request = myHttpWebRequest;

            // Start the asynchronous request.
            IAsyncResult result = (IAsyncResult)myHttpWebRequest.BeginGetResponse(
                new AsyncCallback(RespCallback), myRequestState);

            // This call implements the timeout: if the wait times out, the
            // callback fires and the request is aborted.
            ThreadPool.RegisterWaitForSingleObject(result.AsyncWaitHandle,
                new WaitOrTimerCallback(TimeoutCallback), myHttpWebRequest,
                DefaultTimeout, true);

            // Block until the callbacks signal that they are finished. The
            // actual processing happens in the callback functions.
            allDone.WaitOne();

            // Release the HttpWebResponse resource. The response is null if
            // the request failed or was aborted before a response arrived.
            if (myRequestState.response != null)
            {
                myRequestState.response.Close();
            }
        }
        catch (WebException e)
        {
            Console.WriteLine("\nMain Exception raised!");
            Console.WriteLine("\nMessage:{0}", e.Message);
            Console.WriteLine("\nStatus:{0}", e.Status);
            Console.WriteLine("Press any key to continue..........");
        }
        catch (Exception e)
        {
            Console.WriteLine("\nMain Exception raised!");
            Console.WriteLine("Source :{0} ", e.Source);
            Console.WriteLine("Message :{0} ", e.Message);
            Console.WriteLine("Press any key to continue..........");
            Console.Read();
        }
    }

    private static void RespCallback(IAsyncResult asynchronousResult)
    {
        try
        {
            // State of request is asynchronous.
            RequestState myRequestState =
                (RequestState)asynchronousResult.AsyncState;
            HttpWebRequest myHttpWebRequest = myRequestState.request;
            myRequestState.response =
                (HttpWebResponse)myHttpWebRequest.EndGetResponse(asynchronousResult);

            // Read the response into a Stream object.
            Stream responseStream = myRequestState.response.GetResponseStream();
            myRequestState.streamResponse = responseStream;

            // Begin reading the contents of the HTML page; ReadCallBack
            // prints it to the console once the whole page has arrived.
            IAsyncResult asynchronousInputRead = responseStream.BeginRead(
                myRequestState.BufferRead, 0, BUFFER_SIZE,
                new AsyncCallback(ReadCallBack), myRequestState);
            return;
        }
        catch (WebException e)
        {
            Console.WriteLine("\nRespCallback Exception raised!");
            Console.WriteLine("\nMessage:{0}", e.Message);
            Console.WriteLine("\nStatus:{0}", e.Status);
        }
        // Only reached on failure: let Main continue.
        allDone.Set();
    }

    private static void ReadCallBack(IAsyncResult asyncResult)
    {
        try
        {
            RequestState myRequestState = (RequestState)asyncResult.AsyncState;
            Stream responseStream = myRequestState.streamResponse;
            int read = responseStream.EndRead(asyncResult);

            // Read the HTML page and then print it to the console.
            if (read > 0)
            {
                // More data arrived: append it and issue the next read.
                myRequestState.requestData.Append(
                    Encoding.ASCII.GetString(myRequestState.BufferRead, 0, read));
                IAsyncResult asynchronousResult = responseStream.BeginRead(
                    myRequestState.BufferRead, 0, BUFFER_SIZE,
                    new AsyncCallback(ReadCallBack), myRequestState);
                return;
            }
            else
            {
                // A read of 0 bytes means the end of the stream was reached.
                Console.WriteLine("\nThe contents of the Html page are : ");
                if (myRequestState.requestData.Length > 1)
                {
                    string stringContent;
                    stringContent = myRequestState.requestData.ToString();
                    Console.WriteLine(stringContent);
                }
                Console.WriteLine("Press any key to continue..........");
                Console.ReadLine();

                responseStream.Close();
            }
        }
        catch (WebException e)
        {
            Console.WriteLine("\nReadCallBack Exception raised!");
            Console.WriteLine("\nMessage:{0}", e.Message);
            Console.WriteLine("\nStatus:{0}", e.Status);
        }
        // Reached when reading is complete or has failed: let Main continue.
        allDone.Set();
    }
}

Oct 24 '06 #2
Hi Sun,

Is there a method that could be as fast as a "Ctrl+F" on a web page to
achieve the same? Or something that we could control through JavaScript
or any other scripting language, for that matter?

Thanks for the response,
Kuldeep

"Jianwei Sun" <js***********@gmail.com> wrote in message
news:uI**************@TK2MSFTNGP05.phx.gbl...
> You should use myHttpWebRequest.BeginGetResponse to do it asynchronously
> and retrieve the response from EndGetResponse.

Oct 24 '06 #3
If I understand correctly, you are looking for client-side functionality,
in which case this is really the wrong group for this question.

Oct 24 '06 #4
Hi,

A very simple way to download data over the web is to use a WebClient:

System.Net.WebClient client = new System.Net.WebClient();
byte[] data = client.DownloadData("http://www.microsoft.com");
string html = System.Text.Encoding.UTF8.GetString(data);

However, with a WebClient you have little control over the transfer.
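
One detail worth noting about the snippet above: it assumes the page is UTF-8. As a minimal sketch (not a definitive implementation), the response's Content-Type header could be consulted first. PageFetcher, EncodingFromContentType and Download are hypothetical names made up for this illustration, and falling back to UTF-8 when no usable charset is found is an assumption, not something WebClient does for you.

```csharp
using System;
using System.Net;
using System.Text;

static class PageFetcher
{
    // Hypothetical helper: pick an encoding from a Content-Type header value
    // such as "text/html; charset=iso-8859-1". Falls back to UTF-8 (an
    // assumption) when the header is missing or names an unknown charset.
    public static Encoding EncodingFromContentType(string contentType)
    {
        if (!string.IsNullOrEmpty(contentType))
        {
            foreach (string part in contentType.Split(';'))
            {
                string p = part.Trim();
                if (p.StartsWith("charset=", StringComparison.OrdinalIgnoreCase))
                {
                    string name = p.Substring("charset=".Length).Trim('"', ' ');
                    try
                    {
                        return Encoding.GetEncoding(name);
                    }
                    catch (ArgumentException)
                    {
                        // Unknown or empty charset name: use the fallback below.
                    }
                }
            }
        }
        return Encoding.UTF8;
    }

    // Downloads the page and decodes it with the charset the server declared.
    public static string Download(string url)
    {
        using (WebClient client = new WebClient())
        {
            byte[] data = client.DownloadData(url);
            string contentType =
                client.ResponseHeaders[HttpResponseHeader.ContentType];
            return EncodingFromContentType(contentType).GetString(data);
        }
    }
}
```

With this in place, the three lines above would reduce to string html = PageFetcher.Download("http://www.microsoft.com"); a charset declared only in an HTML meta tag is still not handled by this sketch.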

--
Happy Coding!
Morten Wennevik [C# MVP]
Oct 24 '06 #5
