Downloading WebSites using HttpWebRequest

I am building a precache engine: one that requests over 100 pages on a
remote server so that they are cached remotely.
Can I use the HttpWebRequest and WebResponse classes for this, or must I use
the MSHTML objects to really load the HTML and request all of the images on
the site?

using System.IO;
using System.Net;
using System.Text;

string lcUrl = "http://www.cnn.com";

// *** Establish the request
HttpWebRequest loHttp = (HttpWebRequest)WebRequest.Create(lcUrl);

// *** Set properties
loHttp.Timeout = 10000; // 10 secs
loHttp.UserAgent = "Code Sample Web Client";

// *** Send the request and retrieve the response
HttpWebResponse loWebResponse = (HttpWebResponse)loHttp.GetResponse();

Encoding enc = Encoding.GetEncoding(1252); // Windows default Code Page
StreamReader loResponseStream =
    new StreamReader(loWebResponse.GetResponseStream(), enc);

string lcHtml = loResponseStream.ReadToEnd();

// *** Close the reader first, then the response
loResponseStream.Close();
loWebResponse.Close();
Nov 16 '05 #1
Hi Thomas,

As for the question about requesting and caching remote pages, I think
HttpWebRequest is capable of handling this. We can use HttpWebRequest to
send a request to a certain URL and get its response stream; we can then
store the response (HTML or any other MIME type) in whatever persistence
medium we want, for example the file system, memory, or a database.

The MSHTML components are a component library that helps to
programmatically process a web page's response as a Document (DOM
structure), just as a web browser does. If we just want to get the
response result (the HTML output or file stream), HttpWebRequest is
enough and MSHTML is not necessary.
In addition, here are some tech articles on using HttpWebRequest to
request web resources:

#Accessing Web Sites Using Desktop Applications
http://www.devsource.ziffdavis.com/p...=119849,00.asp

#Crawl Web Sites and Catalog Info to Any Data Store with ADO.NET and Visual
Basic .NET
http://msdn.microsoft.com/msdnmag/is...0/spiderinnet/
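
As a rough illustration of the fetch-and-store idea described in this reply, here is a minimal sketch that requests one URL and writes the response bytes to a file. The PageCache class, its method names and the file-naming scheme are assumptions made for this example; they do not come from the articles linked above. Images and other resources are not fetched automatically, so each one would need its own request.

```csharp
using System;
using System.IO;
using System.Net;

public static class PageCache
{
    // Hypothetical naming scheme: host plus path, with characters that are
    // awkward in file names replaced by '_'.
    public static string FileNameFor(Uri url)
    {
        string name = url.Host + url.AbsolutePath;
        if (name.EndsWith("/")) name += "index.html";
        foreach (char c in Path.GetInvalidFileNameChars())
            name = name.Replace(c, '_');
        return name.Replace('/', '_');
    }

    // Copies a stream in fixed-size chunks, so binary content such as
    // images is stored byte for byte (no text decoding involved).
    public static long Copy(Stream source, Stream target)
    {
        byte[] buffer = new byte[8192];
        long total = 0;
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            target.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }

    // Requests one URL and writes the response body into cacheDir.
    // Returns the path of the file that was written.
    public static string Save(Uri url, string cacheDir)
    {
        HttpWebRequest request = (HttpWebRequest)WebRequest.Create(url);
        request.Timeout = 10000; // 10 secs, as in the original post
        string path = Path.Combine(cacheDir, FileNameFor(url));
        using (HttpWebResponse response = (HttpWebResponse)request.GetResponse())
        using (Stream body = response.GetResponseStream())
        using (FileStream file = File.Create(path))
        {
            Copy(body, file);
        }
        return path;
    }
}
```

A call such as PageCache.Save(new Uri("http://www.cnn.com/"), cacheDir) would write the page to a file named www.cnn.com_index.html under cacheDir (a directory of your choosing). Copying raw bytes rather than reading text sidesteps the question of which code page the server used.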

Hope this helps. Thanks.

Regards,

Steven Cheng
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)

Get Preview at ASP.NET whidbey
http://msdn.microsoft.com/asp.net/whidbey/default.aspx
Nov 16 '05 #2
Thanks Steven,

I need to make sure that I am remotely caching all of the HTML, including all
pictures, hence I figured a simple WebRequest won't do.
So I am trying to get the GetResponseStream() result into an HTMLDocument
object to ensure that the entire site loads.
But:
StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8);

string tmp = readStream.ReadLine();

HTMLDocument htmlDoc = new HTMLDocumentClass();

htmlDoc = (HTMLDocument) tmp; // ??? how do I get the response stream
                              // into/as an HTMLDocument?

Any ideas?



///--------------- Full example

HttpWebRequest request =
    (HttpWebRequest)WebRequest.Create("http://www.microsoft.com");

request.MaximumAutomaticRedirections = 4;
request.MaximumResponseHeadersLength = 4;

HttpWebResponse response = (HttpWebResponse)request.GetResponse();

Console.WriteLine("Content length is {0}", response.ContentLength);
Console.WriteLine("Content type is {0}", response.ContentType);

Stream receiveStream = response.GetResponseStream();
StreamReader readStream = new StreamReader(receiveStream, Encoding.UTF8);

string tmp = readStream.ReadLine();

HTMLDocument htmlDoc = new HTMLDocumentClass();
htmlDoc = (HTMLDocument) tmp; // ??? how do I get the response stream
                              // into/as an HTMLDocument?

response.Close();
readStream.Close();
Nov 16 '05 #3
It now appears that I cannot use HttpWebRequest, because I need to be able to
specify the Host header, and the Host entry in HttpWebRequest.Headers is set
by the system to the current host information, with no way for me to modify it.

I need to retrieve web pages for the remote server to cache them. Any ideas?
Nov 16 '05 #4
Hi Thomas,

Thanks for your follow-up. Based on my experience, since you want to request
the page, retrieve its response stream and load it into an HTMLDocument
to process it, I think you can consider using the WebBrowser control for
the task. You can use the WebBrowser control to navigate to a web resource,
and when the page is loaded, it will automatically be loaded into a Document
object.

Regards,

Steven Cheng
Microsoft Online Support

Nov 16 '05 #5
I can't use WebBrowser because the application must be a web application.

I dropped the HttpWebRequest/HttpWebResponse approach and opted for MSXML2.
But does ServerXMLHTTP's open method support different ports?
MSXML2.ServerXMLHTTPClass();
Nov 16 '05 #6
Hi,

Why do you need to change the HOST header?

Sunny
Nov 16 '05 #7
Different websites sharing the same IP address. Example:

microsoft.com and abc.com are both on server 207.71.34.12

and require a host header to specify the desired site.
Nov 16 '05 #8
So,
are you saying that:

HttpWebRequest myReq =
    (HttpWebRequest)WebRequest.Create("http://microsoft.com/");

and

HttpWebRequest myReq =
    (HttpWebRequest)WebRequest.Create("http://abc.com/");

both create one and the same HttpWebRequest object, and you need to fix
the HOST header?

In my tests, the correct header is created, so I'm still wondering why
you cannot use HttpWebRequest for your task.

In the past I created a very basic web spider which uses
HttpWebRequest, then creates an MSHTML document from the content
fetched; with that I was able to iterate over and download all links and
pictures.
Sunny
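
The spider approach described above (fetch a page, then walk its links and pictures) might look roughly like the sketch below. Note that it swaps the MSHTML document for a regular expression so that it runs without COM interop. That is a simplification for illustration, not the code referred to in this post, and a regular expression will miss markup that a real HTML parser handles. LinkExtractor and Extract are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class LinkExtractor
{
    // Matches src="..." or href="..." (single or double quotes) inside
    // img, a, link and script tags. Only an approximation of HTML parsing.
    private static readonly Regex UrlAttribute = new Regex(
        @"<(?:img|a|link|script)\b[^>]*?\b(?:src|href)\s*=\s*[""']([^""']+)[""']",
        RegexOptions.IgnoreCase);

    // Returns the absolute http/https URLs referenced by the page,
    // in document order and without duplicates.
    public static List<Uri> Extract(Uri pageUrl, string html)
    {
        List<Uri> found = new List<Uri>();
        foreach (Match m in UrlAttribute.Matches(html))
        {
            Uri absolute;
            // Resolve relative references such as "img/a.gif" against the page URL.
            if (Uri.TryCreate(pageUrl, m.Groups[1].Value, out absolute)
                && (absolute.Scheme == Uri.UriSchemeHttp
                    || absolute.Scheme == Uri.UriSchemeHttps)
                && !found.Contains(absolute))
            {
                found.Add(absolute);
            }
        }
        return found;
    }
}
```

Each returned Uri can then be requested with HttpWebRequest in the same way as the page itself, which is what caching "all pictures" comes down to when no browser component does the loading.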

Nov 16 '05 #9
Sunny,

I am saying that

HttpWebRequest myReq =
    (HttpWebRequest)WebRequest.Create("http://microsoft.com/");

works great if you have a domain name. But what about

(HttpWebRequest)WebRequest.Create("http://207.71.134.23");

for microsoft.com and

(HttpWebRequest)WebRequest.Create("http://207.71.134.23");

for abc.com? It is quite common for multiple sites to share one IP address.
Going through DNS this is usually no problem, but I need to be able to
access a site directly.
In the example above, to get the correct site I must also supply
the microsoft.com or the abc.com host header value.

It appears that one cannot modify certain headers in HttpWebRequest, and Host
is one of them.
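
One way around this limitation, sketched under assumptions rather than taken from the thread, is to bypass HttpWebRequest and write the request over a TcpClient connection, so that the IP address and the Host header are chosen independently. VirtualHostClient and its methods are hypothetical names. The request uses HTTP/1.0 so that the server should not answer with chunked transfer encoding; the returned text still contains the status line and headers, which a caller would have to strip. Later framework versions (.NET Framework 4 and up) also expose a settable HttpWebRequest.Host property, which may be the simpler option where it is available.

```csharp
using System;
using System.IO;
using System.Net.Sockets;
using System.Text;

public static class VirtualHostClient
{
    // Builds a minimal GET request whose Host header names the site,
    // independent of the address the socket connects to.
    public static string BuildRequest(string hostHeader, string path)
    {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + hostHeader + "\r\n"
             + "User-Agent: Code Sample Web Client\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    // Connects to the server by IP address and returns the raw response
    // (status line, headers and body) decoded as UTF-8 text.
    public static string Fetch(string ipAddress, int port,
                               string hostHeader, string path)
    {
        using (TcpClient client = new TcpClient(ipAddress, port))
        using (NetworkStream stream = client.GetStream())
        {
            byte[] request =
                Encoding.ASCII.GetBytes(BuildRequest(hostHeader, path));
            stream.Write(request, 0, request.Length);
            using (StreamReader reader = new StreamReader(stream, Encoding.UTF8))
            {
                // The server closes the connection when it is done,
                // which ends the read.
                return reader.ReadToEnd();
            }
        }
    }
}
```

With the addresses from this post, VirtualHostClient.Fetch("207.71.134.23", 80, "abc.com", "/") would ask that server for the abc.com site, and passing "microsoft.com" instead would ask the same server for the other site.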

Be a hero and share your spider code ;0) I am working on something
similar...
Nov 16 '05 #10
