httpWebRequest through Proxy

Hi there.

I have written a screen-scraping application (both web-based and Windows Forms versions) in VB.NET. When testing on a public broadband link it works fine. However, it fails at work because of our proxy server.

To authenticate in the asp version, I just added:
    <defaultProxy useDefaultCredentials="true">
to my web.config - this works fine.
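
For context, the defaultProxy element belongs inside the system.net section of the config file. A minimal sketch of the surrounding structure follows. A .NET Windows Forms application reads the same section from its app.config, so this may be worth trying there before setting the proxy in code. That is an assumption and has not been tested against the proxy described in this thread.

```xml
<configuration>
  <system.net>
    <!-- Use the logged-on user's Windows credentials for the default proxy -->
    <defaultProxy useDefaultCredentials="true" />
  </system.net>
</configuration>
```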

However, I need to integrate this with an existing Windows Forms app (VB6), so I need to get the Windows Forms version working.

I am struggling to achieve the same settings (as per web.config) directly within my code. This is what I am trying to use so far:

    Dim mywebRequest As HttpWebRequest = TryCast(WebRequest.Create("http://www.thedomain.com/login.aspx"), HttpWebRequest)

    mywebRequest.Proxy = New WebProxy("http://myproxyserver")
    mywebRequest.Proxy.Credentials = CredentialCache.DefaultCredentials
    mywebRequest.Credentials = CredentialCache.DefaultCredentials

As the page I am trying to access requires authentication, I first capture the viewstate of the page. Once I have this I can then simulate a post of my login credentials and use a streamreader to retrieve the contents of the page.

The thing I don't understand is that the first part of the code appears to work (and passes through the proxy). It is only when I try to get and close the response:

    mywebRequest.GetResponse().Close()

that I get the following error:

The remote server returned an error: (407) Proxy Authentication Required.
Could someone advise me as to whether I am defining the proxy correctly?

Thanks.

Ben
Jun 20 '07 #1
Just to let everyone know, I got this working in the end. I made a schoolboy error: I had created a new web request but did not add the proxy information to that new instance.

If anyone would like a copy of the complete code (to authenticate and then screen scrape), send me a message.
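
One way to guard against this slip is to create every request through a single helper that always attaches the proxy. The sketch below is only an illustration: the module name ProxyHelper and the function CreateProxiedRequest are made up for this example and are not taken from the original application.

```vbnet
Imports System
Imports System.Net

Module ProxyHelper
    ' Hypothetical helper: creates an HttpWebRequest and always attaches the given proxy,
    ' so a newly created request can never be sent without it.
    Public Function CreateProxiedRequest(ByVal url As String, ByVal proxy As IWebProxy) As HttpWebRequest
        Dim request As HttpWebRequest = DirectCast(WebRequest.Create(url), HttpWebRequest)
        request.Proxy = proxy
        Return request
    End Function
End Module
```

Each request in the login sequence would then be created with CreateProxiedRequest(url, pxy) rather than by calling WebRequest.Create directly. Alternatively, assigning the proxy to WebRequest.DefaultWebProxy once at startup should make it the default for any request that does not set its own Proxy.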
Jun 20 '07 #2
Frinavale
I'm glad you got it working!
Could you please share the solution so others can learn as well.

Thanks a lot!

-Frinny
Jun 20 '07 #3
Glad to help. For tutorials on HttpWebRequest and persisting viewstate for sites that require authentication, I would seriously recommend:

http://odetocode.com/Articles/162.aspx (C# version of my code)

and

http://aspnet.4guysfromrolla.com/art...spx#postadlink (great article covering the basics of the HttpWebRequest class by Scott Mitchell).

I also have a C# version of the following code if you need it (I originally wrote it in C# and then had to rewrite it in VB.NET). The code works in either a Windows Forms application or an ASP.NET app.

Create a new class called prices.vb. Insert the following code (comments explain it pretty well):

    Imports System
    Imports System.Net
    Imports System.IO
    Imports System.Text
    Imports System.Web
    Imports System.Text.RegularExpressions
    Imports Microsoft.VisualBasic

    Namespace PageGetter
        Public Class prices
            Public targetURL As String

            ' Pulls the __VIEWSTATE hidden field value out of the page HTML and URL-encodes it
            Private Function ExtractViewState(ByVal s As String) As String
                Dim viewStateNameDelimiter As String = "__VIEWSTATE"
                Dim valueDelimiter As String = "value="""

                Dim viewStateNamePosition As Integer = s.IndexOf(viewStateNameDelimiter)
                Dim viewStateValuePosition As Integer = s.IndexOf(valueDelimiter, viewStateNamePosition)

                Dim viewStateStartPosition As Integer = viewStateValuePosition + valueDelimiter.Length
                Dim viewStateEndPosition As Integer = s.IndexOf("""", viewStateStartPosition)

                Return HttpUtility.UrlEncodeUnicode(s.Substring(viewStateStartPosition, viewStateEndPosition - viewStateStartPosition))
            End Function

            Public Function GetPrices(ByVal targetUrl As String) As String
                ' *****************************************************************************
                ' only need this if you are behind a proxy
                Dim pxy As New WebProxy("http://yourproxyaddress:0000")
                pxy.Credentials = CredentialCache.DefaultCredentials
                ' *****************************************************************************

                ' first, request the login form to get the viewstate value
                Dim mywebRequest As HttpWebRequest = TryCast(WebRequest.Create("http://somedomain.com/TheLogOnCheck.aspx"), HttpWebRequest)

                mywebRequest.Proxy = pxy

                ' Set the timeout to 1 second (or 1,000 milliseconds)
                mywebRequest.Timeout = 1000

                Try
                    Dim responseReader As New StreamReader(mywebRequest.GetResponse().GetResponseStream())
                    Dim responseData As String = responseReader.ReadToEnd()

                    responseReader.Close()

                    ' extract the viewstate value and build out POST data
                    Dim viewState As String = ExtractViewState(responseData)
                    Dim postData As String = [String].Format("__VIEWSTATE={0}&TUserName={1}&TPassword={2}&_ctl0%3AContent%3AbtnLogon=Logon&__PREVIOUSPAGE=cys_as-zp6tmeXXlc07FggKJUKD96k3RyL8XYHQ-U3I1&__EVENTVALIDATION=%2FwEWAwLV3qjMAQKL2pbeCALf%2B9ffB29vWwhgfdAvHzzk%2F%2BqB%2BKkddRGi", viewState, "yourname@domain.co.uk", "password")

                    ' have a cookie container ready to receive the forms auth cookie
                    Dim cookies As New CookieContainer()

                    ' now post to the login form
                    mywebRequest = TryCast(WebRequest.Create("http://somedomain.com/TheLogOnCheck.aspx"), HttpWebRequest)

                    mywebRequest.Proxy = pxy 'only if behind proxy (see above)

                    mywebRequest.Method = "POST"
                    mywebRequest.ContentType = "application/x-www-form-urlencoded"
                    mywebRequest.CookieContainer = cookies

                    ' write the form values into the request message
                    Dim requestWriter As New StreamWriter(mywebRequest.GetRequestStream())
                    requestWriter.Write(postData)
                    requestWriter.Close()

                    ' we don't need the contents of the response, just the cookie it issues
                    mywebRequest.GetResponse().Close()

                    ' now we can send our cookie along with a request for the protected page
                    mywebRequest = TryCast(WebRequest.Create(targetUrl), HttpWebRequest)
                    mywebRequest.Proxy = pxy 'only if behind proxy; every new request needs it
                    mywebRequest.CookieContainer = cookies
                    responseReader = New StreamReader(mywebRequest.GetResponse().GetResponseStream())

                    ' and read the response
                    responseData = responseReader.ReadToEnd()
                    responseReader.Close()

                    ' Set up the regular expression that captures what is between the
                    ' two marker comments in the HTML source
                    Dim regex As New Regex("<!-- main table -->((.|" & Chr(10) & ")*?)<!-- / main table -->", RegexOptions.IgnoreCase)

                    ' Apply the regular expression to the response using the Match object
                    Dim oM As Match = regex.Match(responseData)

                    Return oM.Value

                Catch wex As WebException
                    ' Something went wrong in the HTTP request! See if it was a timeout problem
                    If wex.Status = WebExceptionStatus.Timeout Then
                        Return ("<font color=red>The httpWebRequest has timed out. Please contact the helpdesk</font>")
                    Else
                        Return ("<font color=red>FAILED TO CONNECT<br />Status: " & wex.Status.ToString() & " Message: " & wex.Message & "</font>")
                    End If
                End Try

            End Function
        End Class
    End Namespace

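ExtractViewState above assumes the __VIEWSTATE field is always present. If it is missing, the first IndexOf returns -1 and the next IndexOf call throws an ArgumentOutOfRangeException, which the WebException handler in GetPrices would not catch. A more defensive variant could look like the sketch below. The names ViewStateHelper and TryExtractViewState are hypothetical, and Uri.EscapeDataString is swapped in for HttpUtility.UrlEncodeUnicode so that the example needs no System.Web reference.

```vbnet
Imports System

Module ViewStateHelper
    ' Hypothetical helper: returns the URL-encoded __VIEWSTATE value,
    ' or Nothing when the field (or its value attribute) cannot be found.
    Public Function TryExtractViewState(ByVal html As String) As String
        Const fieldName As String = "__VIEWSTATE"
        Const valueMarker As String = "value="""

        Dim namePos As Integer = html.IndexOf(fieldName, StringComparison.Ordinal)
        If namePos < 0 Then Return Nothing

        Dim valuePos As Integer = html.IndexOf(valueMarker, namePos, StringComparison.Ordinal)
        If valuePos < 0 Then Return Nothing

        ' The value runs from just after value=" up to the closing quote
        Dim startPos As Integer = valuePos + valueMarker.Length
        Dim endPos As Integer = html.IndexOf(""""c, startPos)
        If endPos < 0 Then Return Nothing

        Return Uri.EscapeDataString(html.Substring(startPos, endPos - startPos))
    End Function
End Module
```

The caller can then check for Nothing and report that the login page did not look as expected, instead of failing with an unhandled exception.
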
In my windows form I then used the following code to extract (scrape) the page contents (puts the retrieved html into a webbrowser control):

    Private Sub myForm_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        Dim myPrice As New prices
        Dim temp As String = myPrice.GetPrices("http://www.somedomain.org/thepageyouwant.aspx")
        WebBrowser1.DocumentText = temp
    End Sub

If you have an asp.net application you can do exactly the same but on your page_load event:

    Dim futuresprice As New prices
    Dim temp As String = futuresprice.GetPrices("http://www.somedomain.org/thepageyouwant.aspx")
    Response.Write(temp)

Just some points worth noting. The viewstate value and the other form fields that you build your post data with will not necessarily be the same as mine. What you need to do is download a tool called Fiddler (http://www.fiddler2.com/fiddler2). Open it up and log into your target site. Look in the session inspector at the page that handles the login (mine was called logoncheck.aspx) and you can see the complete viewstate value. You can then amend the code to suit your site's requirements. Also, I cannot guarantee that this will work for everyone's proxy servers.
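
Because the field names and values differ from site to site, it may be easier to build the post data from a list of name/value pairs copied from Fiddler than to edit one long format string. The following is a rough sketch only: FormPostHelper and BuildPostData are hypothetical names, not part of the code above, and Uri.EscapeDataString is used for the encoding.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text

Module FormPostHelper
    ' Hypothetical helper: joins name/value pairs into an
    ' application/x-www-form-urlencoded body, encoding both names and values.
    Public Function BuildPostData(ByVal fields As IEnumerable(Of KeyValuePair(Of String, String))) As String
        Dim sb As New StringBuilder()
        For Each field As KeyValuePair(Of String, String) In fields
            If sb.Length > 0 Then sb.Append("&"c)
            sb.Append(Uri.EscapeDataString(field.Key))
            sb.Append("="c)
            sb.Append(Uri.EscapeDataString(field.Value))
        Next
        Return sb.ToString()
    End Function
End Module
```

With this approach, pass the raw viewstate value as it appears in the page (not an already URL-encoded one), otherwise it would be encoded twice.
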
Finally credit must be given to Scott Mitchell and Scott Allen as it is their tutorials I used to produce this application.

Hope this was of help.
Ben Foster
<email removed>
Jun 21 '07 #4
Frinavale
Wow!
Thank you for providing your solution.

-Frinny
Jun 21 '07 #5
