Sunny,
I am saying that
HttpWebRequest myReq = (HttpWebRequest)WebRequest.Create("http://microsoft.com/");
works great if you have a domain name... but what about
(HttpWebRequest)WebRequest.Create("http://207.71.134.23");
for microsoft.com and
(HttpWebRequest)WebRequest.Create("http://207.71.134.23");
for abc.com? It is quite common for multiple sites to share one IP address.
Usually, going through DNS, it's no problem... but I need to be able to access
a site directly by its IP address.
In the example above, in order to get the correct site I must also supply
the microsoft.com or abc.com Host header value.
It appears that one cannot modify certain headers in HttpWebRequest, and Host
is one of them.
Be a hero and share your spider code ;0) I am working on something
similar...
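
For what it's worth, here is a minimal sketch of the "connect by IP, name the site in the Host header" idea. It assumes .NET Framework 4.0 or later, where HttpWebRequest exposes a settable Host property; on older versions that property is not available and Host stays a restricted header. The class and method names (HostHeaderDemo, CreateRequestForSharedIp) are made up for illustration, and the IP address and site name are just the placeholders from the example above.

```csharp
using System;
using System.Net;

static class HostHeaderDemo
{
    // Builds a request that connects to the given IP address but asks the
    // server for the site named by hostName (name-based virtual hosting).
    // Hypothetical helper for illustration; not part of any library.
    public static HttpWebRequest CreateRequestForSharedIp(string ip, string hostName)
    {
        // The URI decides where the connection goes: straight to the IP, no DNS lookup.
        var request = (HttpWebRequest)WebRequest.Create("http://" + ip + "/");

        // Assumption: .NET Framework 4.0 or later. The Host property overrides
        // the Host header that would otherwise be derived from the URI.
        request.Host = hostName;

        return request;
    }
}
```

Calling HostHeaderDemo.CreateRequestForSharedIp("207.71.134.23", "microsoft.com") and then GetResponse() on the result should send the request to the IP while naming microsoft.com in the Host header. Where the Host property is missing, one possible fallback is to write the HTTP request by hand over a TcpClient connection, at the cost of handling redirects and response parsing yourself.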
"Sunny" <sunnyask@icebergwireless.com> wrote in message
news:eboa7guXEHA.2868@TK2MSFTNGP09.phx.gbl...[color=blue]
> So,
> are you saying that:
>
> HttpWebRequest myReq =
> (HttpWebRequest)WebRequest.Create("http://microsoft.com/");
>
> and
>
> HttpWebRequest myReq =
> (HttpWebRequest)WebRequest.Create("http://abc.com/");
>
> both create one and the same HttpWebRequest object, and you need to fix
> the HOST header?
>
> In my tests, the correct header is created, so I am still wondering why
> you cannot use HttpWebRequest for your task.
>
> In the past I created a very basic web spider, which uses
> HttpWebRequest, then creates an MSHTMLDocument with the fetched
> content, and I was then able to iterate over and download all links and
> pictures.
>
>
> Sunny
>
> In article <uiqpoTuXEHA.2544@TK2MSFTNGP10.phx.gbl>, alexander@K.com
> says...[color=green]
> > Different websites sharing the same IP address, for example:
> >
> > microsoft.com and abc.com are both on server 207.71.34.12
> >
> > and require a Host header to specify the desired site
> >
> >
> >
> > "Sunny" <sunnyask@icebergwireless.com> wrote in message
> > news:OlGznysXEHA.2664@TK2MSFTNGP09.phx.gbl...[color=darkred]
> > > Hi,
> > >
> > > why do you need to change the HOST header?
> > >
> > > Sunny
> > >
> > >
> > > In article <OLB$WXUXEHA.3668@TK2MSFTNGP09.phx.gbl>, alexander@K.com
> > > says...
> > > > It now appears that I cannot use HttpWebRequest, because I need to be able to
> > > > specify the Host header, and the Host entry in HttpWebRequest.Headers is set by the
> > > > system to the current host information, with no way for me to modify it.
> > > >
> > > > I need to retrieve web pages for the remote server to cache them... any ideas?
> > > >
> > > >
> > > >
> > > >
> > > > "Thomas Peter" <alexander@K.com> wrote in message
> > > > news:OTz7SwSXEHA.1684@tk2msftngp13.phx.gbl...
> > > > > Thanks Steven,
> > > > >
> > > > > I need to make sure that I am remotely caching all of the HTML, including all
> > > > > pictures... hence I figured a simple WebRequest won't do...
> > > > > so I am trying to get the GetResponseStream() into an HTMLDocument object to
> > > > > ensure that the entire site loads...
> > > > > But
> > > > > StreamReader readStream = new StreamReader (receiveStream, Encoding.UTF8);
> > > > >
> > > > > string tmp = readStream.ReadLine();
> > > > >
> > > > > HTMLDocument htmlDoc = new HTMLDocumentClass();
> > > > >
> > > > > htmlDoc = (HTMLDocument) tmp; // ??? how do I get the response stream
> > > > > into/as htmldocument?
> > > > >
> > > > > Any ideas?
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > ///--------------- Full example
> > > > >
> > > > > HttpWebRequest request = (HttpWebRequest)WebRequest.Create ("http://www.microsoft.com");
> > > > >
> > > > > request.MaximumAutomaticRedirections = 4;
> > > > >
> > > > > request.MaximumResponseHeadersLength = 4;
> > > > >
> > > > >
> > > > > HttpWebResponse response = (HttpWebResponse)request.GetResponse ();
> > > > >
> > > > > Console.WriteLine ("Content length is {0}", response.ContentLength);
> > > > >
> > > > > Console.WriteLine ("Content type is {0}", response.ContentType);
> > > > >
> > > > > Stream receiveStream = response.GetResponseStream ();
> > > > >
> > > > > StreamReader readStream = new StreamReader (receiveStream, Encoding.UTF8);
> > > > >
> > > > > string tmp = readStream.ReadLine();
> > > > >
> > > > > HTMLDocument htmlDoc = new HTMLDocumentClass();
> > > > >
> > > > > htmlDoc = (HTMLDocument) tmp;
> > > > >
> > > > >
> > > > > response.Close ();
> > > > >
> > > > > readStream.Close ();
> > > > >
> > > > >
> > > > > "Steven Cheng[MSFT]" <v-schang@online.microsoft.com> wrote in message
> > > > > news:kBo3suPXEHA.3788@cpmsftngxa10.phx.gbl...
> > > > > > Hi Thomas,
> > > > > >
> > > > > > As for the question of requesting and caching remote pages, I think
> > > > > > HttpWebRequest is capable of handling this. We can use HttpWebRequest to
> > > > > > send a request to a certain URL and get its response stream; thus, we can
> > > > > > store the response result (HTML or any other MIME type) in the persistence
> > > > > > medium we want, for example the file system, memory, or a database.
> > > > > >
> > > > > > The MSHTML components are a component library that helps to
> > > > > > programmatically process a web page's response as a document (DOM
> > > > > > structure), just like what we can do in a web browser. If we just want to
> > > > > > get the response result (the HTML output or file stream), HttpWebRequest
> > > > > > is enough and MSHTML is not necessary.
> > > > > > In addition, here are some tech articles on using HttpWebRequest to
> > > > > > request web resources:
> > > > > >
> > > > > > #Accessing Web Sites Using Desktop Applications
> > > > > > http://www.devsource.ziffdavis.com/p...=119849,00.asp
> > > > > >
> > > > > > #Crawl Web Sites and Catalog Info to Any Data Store with ADO.NET and Visual
> > > > > > Basic .NET
> > > > > > http://msdn.microsoft.com/msdnmag/is...0/spiderinnet/
> > > > > >
> > > > > > Hope this also helps. Thanks.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Steven Cheng
> > > > > > Microsoft Online Support
> > > > > >
> > > > > > Get Secure! www.microsoft.com/security
> > > > > > (This posting is provided "AS IS", with no warranties, and confers no
> > > > > > rights.)
> > > > > >
> > > > > > Get Preview at ASP.NET whidbey
> > > > > > http://msdn.microsoft.com/asp.net/whidbey/default.aspx
> > > > > >
> > > > > >
> > > > >
> > > > >
> > > >
> > > >
> > > >[/color]
> >
> >
> >[/color][/color]