Charles,
Thank you very much for your suggestions. I ended up using MSHTML after I
found an example that loads the HTML from memory. I used the following code;
any suggestions are welcome.
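Public Function GetTableText(ByVal sHTML As String) As String
    Dim myDoc As mshtml.IHTMLDocument2 = New mshtml.HTMLDocument
    Dim mElement As mshtml.IHTMLElement2
    Dim mElement2 As mshtml.IHTMLElement
    Dim mECol As mshtml.IHTMLElementCollection
    Dim sb As New System.Text.StringBuilder
    Dim I As Integer
    'initialize the document object within the HTMLDocument class...
    myDoc.close()
    myDoc.open("about:blank")
    'write the HTML to the document using the MSHTML "write" method...
    Dim clsHTML() As Object = {sHTML}
    myDoc.write(clsHTML)
    clsHTML = Nothing
    'getElementsByTagName is declared on IHTMLElement2, so cast the body first...
    mElement = DirectCast(myDoc.body, mshtml.IHTMLElement2)
    mECol = mElement.getElementsByTagName("TD")
    For I = 0 To mECol.length - 1
        mElement2 = DirectCast(mECol.item(I), mshtml.IHTMLElement)
        'lstResults is a ListBox on the calling form...
        lstResults.Items.Add(mElement2.tagName & " : " & mElement2.innerText)
        sb.Append(mElement2.innerText).Append(vbCrLf)
    Next
    'return the collected cell text so the function actually yields a value...
    Return sb.ToString()
End Function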
Thanks,
Curtis
"Charles Law" <bl***@nowhere.com> wrote in message
news:O6**************@TK2MSFTNGP09.phx.gbl...
Hi Curtis
If you want to access a web page as a tree structure then you are probably
looking at something like Microsoft's mshtml. It makes a page accessible
as a document object model (DOM), just like you might use automation to
read a Word document as a DOM. You can either get the page by using the
WebBrowser control (which wraps mshtml) in your application, or using
mshtml directly.
Do you have your web page in memory, or are you also wanting to retrieve
it from the internet?
HTH
Charles
"Curtis" <cs*****@hotmail.com> wrote in message
news:e0**************@TK2MSFTNGP09.phx.gbl...
Does anyone have any good examples of parsing web pages in VB.Net? My
application needs to get information from certain HTML tables and I
haven't been able to find a good way to approach the problem. I have
researched regular expressions but found them rather complicated
for what I am attempting to accomplish. I was hoping that there would be
some type of utility that would allow me to parse through the web page in
a tree-like structure. I found one utility called HTMLAgilityPack but it
seems to be a little difficult to work with. If anyone has any
suggestions or examples I would really appreciate it.
Thanks
Curtis