Screen Scraping a web page

I am working on an application to screen scrape information from a web page.
I have the base code working, but the problem is that I have to log in before
I can get the info I need. The page is hosted on my router. When I go to the
IP address of the router, I get the following page.

<HTML>
<head>
<meta http-equiv="content-type" content="text/html;charset=iso-8859-1">
<title>Login</title>
</head>

<BODY bgcolor="#f79900">
<form action="LOGIN.HTM" method="post" name="tF">
<input type="hidden" name="page" value="login">
<table border="0" width="100%" height="184" cellspacing="0" >
<tr>
<td width="100%" height="103" colspan="2" align="center">
<a href="http://support.speedstream.com"><img border="0"
src="IMAGE/SIELOGOBLACK.JPG" width="270" height="40"></a>
</td>
</tr>
<tr>
<td width="100%" height="19" colspan="2" align="center">
<H2><font face="Arial, Helvetica, sans-serif" color="#FFFFFF" >Login
&nbsp;Screen </font></H2>
</td>
</tr>

<tr>
<td width="50%" height="19" align="right">
<font face="Arial, Helvetica, sans-serif" size="2"
color="#FFFFFF" >Password&nbsp; &nbsp;&nbsp; :</font></td>
<td width="50%" height="19" align="left">
<INPUT type="password" maxLength=12 size=9 name=pws></td><p>
</tr>
<tr>
<td width="50%" height="19">&nbsp;</td>
<td width="50%" height="19">&nbsp;</td>
</tr>
<tr>
<td width="50%" height="19" align="right">
<INPUT type="submit" value=" Login ">
</td>
<td width="50%" height="19" align="left">
<INPUT class=button onclick=window.close(); type=button value= Cancel >
</td>
</tr>
</table>
</form></BODY>
</HTML>

Using the information on this page I have developed the following code for
my application, but every time I run it I get:

An unhandled exception of type 'System.Net.WebException' occurred in
system.dll

Additional information: The underlying connection was closed: The server
committed an HTTP protocol violation.

My code is as follows:

Imports System.Net
Imports System.IO

Public Class Form1
    Inherits System.Windows.Forms.Form

#Region " Windows Form Designer generated code "

    Public Sub New()
        MyBase.New()

        'This call is required by the Windows Form Designer.
        InitializeComponent()

        'Add any initialization after the InitializeComponent() call
    End Sub

    'Form overrides dispose to clean up the component list.
    Protected Overloads Overrides Sub Dispose(ByVal disposing As Boolean)
        If disposing Then
            If Not (components Is Nothing) Then
                components.Dispose()
            End If
        End If
        MyBase.Dispose(disposing)
    End Sub

    'Required by the Windows Form Designer
    Private components As System.ComponentModel.IContainer

    'NOTE: The following procedure is required by the Windows Form Designer
    'It can be modified using the Windows Form Designer.
    'Do not modify it using the code editor.
    Friend WithEvents lblMyIp As System.Windows.Forms.Label
    Friend WithEvents txtMyIp As System.Windows.Forms.TextBox

    <System.Diagnostics.DebuggerStepThrough()> Private Sub InitializeComponent()
        Dim resources As System.Resources.ResourceManager = New System.Resources.ResourceManager(GetType(Form1))
        Me.lblMyIp = New System.Windows.Forms.Label
        Me.txtMyIp = New System.Windows.Forms.TextBox
        Me.SuspendLayout()
        '
        'lblMyIp
        '
        Me.lblMyIp.Location = New System.Drawing.Point(8, 8)
        Me.lblMyIp.Name = "lblMyIp"
        Me.lblMyIp.Size = New System.Drawing.Size(56, 23)
        Me.lblMyIp.TabIndex = 0
        Me.lblMyIp.Text = "My Ip:"
        '
        'txtMyIp
        '
        Me.txtMyIp.Location = New System.Drawing.Point(64, 8)
        Me.txtMyIp.Multiline = True
        Me.txtMyIp.Name = "txtMyIp"
        Me.txtMyIp.Size = New System.Drawing.Size(376, 248)
        Me.txtMyIp.TabIndex = 1
        Me.txtMyIp.Text = ""
        '
        'Form1
        '
        Me.AutoScaleBaseSize = New System.Drawing.Size(7, 19)
        Me.ClientSize = New System.Drawing.Size(448, 262)
        Me.Controls.Add(Me.txtMyIp)
        Me.Controls.Add(Me.lblMyIp)
        Me.Font = New System.Drawing.Font("Times New Roman", 12.0!, System.Drawing.FontStyle.Regular, System.Drawing.GraphicsUnit.Point, CType(0, Byte))
        Me.Icon = CType(resources.GetObject("$this.Icon"), System.Drawing.Icon)
        Me.MaximizeBox = False
        Me.MinimizeBox = False
        Me.Name = "Form1"
        Me.StartPosition = System.Windows.Forms.FormStartPosition.CenterScreen
        Me.Text = "MyIpReader "
        Me.ResumeLayout(False)
    End Sub

#End Region

    Private Sub Form1_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
        'Show the raw HTML returned by the router in the text box.
        txtMyIp.Text = ReadHTMLPage("http://192.168.1.1:88/login.htm")
    End Sub

    Public Function ReadHTMLPage(ByVal url As String) As String
        Dim result As String = ""

        'Form fields taken from the login page: the hidden "page" field and the password box "pws".
        Dim strPost As String = "page=login&pws=password"
        Dim myWriter As StreamWriter = Nothing

        Dim objRequest As HttpWebRequest = CType(WebRequest.Create(url), HttpWebRequest)
        objRequest.Method = "POST"
        objRequest.ContentLength = strPost.Length
        objRequest.ContentType = "application/x-www-form-urlencoded"

        'Write the form data into the request body.
        Try
            myWriter = New StreamWriter(objRequest.GetRequestStream())
            myWriter.Write(strPost)
        Catch e As Exception
            Return e.Message
        Finally
            'GetRequestStream may have failed before the writer was created.
            If Not (myWriter Is Nothing) Then
                myWriter.Close()
            End If
        End Try

        'Send the request and read the whole response body.
        Dim objResponse As HttpWebResponse = CType(objRequest.GetResponse(), HttpWebResponse)
        Dim sr As StreamReader
        sr = New StreamReader(objResponse.GetResponseStream())
        result = sr.ReadToEnd()
        sr.Close()

        Return result
    End Function

End Class

I can't figure out what I am doing wrong here. Any guidance would be
appreciated.
Nov 21 '05 #1
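
A note on the error above: this WebException is commonly reported when a server's response headers do not strictly follow the HTTP specification, which small embedded web servers such as those in routers sometimes do. Assuming that is the cause here (nothing in the post confirms it), one workaround that is often tried is to relax header parsing in the application's configuration file. The fragment below is a sketch of that setting, not a confirmed fix for this router; it belongs in the project's app.config, and it loosens validation for every HttpWebRequest the application makes.

```xml
<?xml version="1.0" encoding="utf-8"?>
<configuration>
  <system.net>
    <settings>
      <!-- Assumption: the router sends headers that strict parsing rejects. -->
      <!-- This relaxes header validation for the whole application. -->
      <httpWebRequest useUnsafeHeaderParsing="true" />
    </settings>
  </system.net>
</configuration>
```

If the exception persists with this setting, the headers are probably not the problem, and it would be worth comparing the request the code sends against what a browser sends to LOGIN.HTM.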