Using WebClient to get the actual file name.

Hi,

I've been working on an application to do some 'scraping' of web
content, using the WebClient class. I'm using code rather like the
following..

Dim objWebClient As New WebClient
Dim strURL As String = CType(URL, String)
Dim aRequestedHTML() As Byte
Dim objUTF8 As New UTF8Encoding
Dim strRequestedHTML As String

aRequestedHTML = objWebClient.DownloadData(strURL)
strRequestedHTML = objUTF8.GetString(aRequestedHTML)

Return strRequestedHTML

However, if I enter a URL such as
http://web.archive.org/web/200306221...odhouse.co.uk/

the returned HTML is not the page that I requested (as viewed in the
browser), but rather an error page saying that the page has not been
found.

Does anyone know why this may be the case? My idea so far is that
the web browser (and server) can work out which page to send, while
the WebClient class isn't actually specifying a page at the end of
the URL.

If anyone has an idea about this problem I would like to hear it. The
deadline for this project is coming up fast and this is a stumbling
block that I need to overcome. Once this is done, it's (hopefully!)
all plain sailing from here.

Thanks for your time

Chris Williams (chris (at) oxymoron-failsafe.com)
Nov 21 '05 #1
If you want the content of the page, here are two ways of getting it.

Private Sub Button1_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button1.Click
    ' Option 1: download the raw bytes with WebClient and decode them.
    Dim wc As New System.Net.WebClient
    Dim ascii As System.Text.Encoding = System.Text.Encoding.ASCII
    Dim Results As Byte() = wc.DownloadData("http://web.archive.org/web/20030622145316/http://www.lynwoodhouse.co.uk/")
    ' GetString sizes the result exactly, so there is no extra trailing
    ' element as there would be with a hand-sized Char array.
    Dim asciiString As String = ascii.GetString(Results)
    MsgBox(asciiString)
End Sub

Private Sub Button2_Click(ByVal sender As System.Object, _
        ByVal e As System.EventArgs) Handles Button2.Click
    ' Option 2: use WebRequest/WebResponse and read the response stream.
    Dim myRequest As System.Net.WebRequest = System.Net.WebRequest.Create("http://web.archive.org/web/20030622145316/http://www.lynwoodhouse.co.uk/")
    Dim myResponse As System.Net.WebResponse = myRequest.GetResponse()
    Dim myStream As System.IO.Stream = myResponse.GetResponseStream()
    Dim sr As New System.IO.StreamReader(myStream)
    MsgBox(sr.ReadToEnd())
    ' Closing the reader also closes the underlying stream.
    sr.Close()
    myResponse.Close()
End Sub
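
One thing to watch with either approach is the decoding step. Encoding.ASCII turns any non-ASCII byte into "?", and hard-coding UTF-8 is only right if the page really is UTF-8. Below is a minimal sketch that picks the encoding from the Content-Type header value instead. The module and function names are made up for illustration, and the fallback to UTF-8 when no usable charset is present is my own assumption, not something the server guarantees.

```vbnet
Imports System
Imports System.Text

Module CharsetSketch
    ' Hypothetical helper: choose an Encoding from a Content-Type header
    ' value such as "text/html; charset=ISO-8859-1".
    ' Assumption: fall back to UTF-8 when no usable charset is given.
    Function EncodingFromContentType(ByVal contentType As String) As Encoding
        If contentType IsNot Nothing Then
            For Each part As String In contentType.Split(";"c)
                Dim trimmed As String = part.Trim()
                If trimmed.StartsWith("charset=", StringComparison.OrdinalIgnoreCase) Then
                    ' Strip optional quotes and spaces around the charset name.
                    Dim name As String = trimmed.Substring("charset=".Length).Trim(""""c, " "c)
                    Try
                        Return Encoding.GetEncoding(name)
                    Catch ex As ArgumentException
                        ' Unknown charset name: fall through to the default.
                    End Try
                End If
            Next
        End If
        Return Encoding.UTF8
    End Function

    ' Hypothetical helper: decode a downloaded body using the header's charset.
    Function DecodeBody(ByVal body As Byte(), ByVal contentType As String) As String
        Return EncodingFromContentType(contentType).GetString(body)
    End Function

    Sub Main()
        ' Offline demonstration: no network call is made here.
        Dim chosen As Encoding = EncodingFromContentType("text/html; charset=ISO-8859-1")
        Console.WriteLine(chosen.WebName)
    End Sub
End Module
```

With WebClient, the header value should be available after the download as wc.ResponseHeaders("Content-Type"), so something like DecodeBody(Results, wc.ResponseHeaders("Content-Type")) would take the place of the fixed ASCII or UTF-8 decoder.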

Nov 21 '05 #2
