Hi, I have a problem that may be easy to resolve, but I can't
find a solution:
I want to parse HTML files, so I first want to fetch one from a
URL, and I do it like this:
Dim objMSHTML As New mshtml.HTMLDocument()
Dim objDocument As mshtml.HTMLDocument
objDocument = objMSHTML.createDocumentFromUrl("http://www.google.fr", vbNullString)
Normally this should work and I could then parse the HTML
code, but in fact I get this error:
"An unhandled exception of type
'System.NullReferenceException' occurred in mscorlib.dll
Additional information: Object reference not set to an
instance of an object."
(Sorry, my VB version is French, so the original message was in French.)
Any idea?
PS: I think this code works with VB6...
Use the built-in .NET networking objects. See: http://tinyurl.com/98ey
--
Patrick Steele
Microsoft .NET MVP http://weblogs.asp.net/psteele
Thank you, Patrick.
I've just read the article, but it doesn't seem that it can
help me parse the HTML. Using mshtml.HTMLDocument, I thought
I could use the "links" property, which is supposed to give
access to the links in the HTML...
Pierre,
I have never seen this method, so I am curious whether it works, though I
doubt it happens in a single step.
I would advise you to take a look at the WebBrowser control, with which you
can navigate to a URL.
(It uses Internet Explorer 6; don't ask me how.)
Then, in the WebBrowser's DocumentComplete event, you can get the documents
as DOM objects.
When the page has frames, there is a document for every frame.
There is also a navigate-complete event, but with that you only get the last
page downloaded.
That's why I find the method you use strange, although I saw it in the
documentation too.
I hope I have pointed you in the right direction.
It is too much to give a quick example.
The WebBrowser is only one of the methods I think you can use, but it is the
one I use for these things at the moment.
I hope it helps you a little bit.
Cor
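For readers who want a starting point, the approach Cor describes can be sketched roughly as follows. This is only a sketch under assumptions: it uses the managed System.Windows.Forms.WebBrowser control and its DocumentCompleted event (available from .NET 2.0), not the ActiveX control mentioned above, and the class name LinkListerForm is made up for illustration.

```vbnet
Imports System
Imports System.Diagnostics
Imports System.Windows.Forms

' Hypothetical form: loads a page in a WebBrowser control and lists its
' links once the top-level document has finished loading.
Public Class LinkListerForm
    Inherits Form

    Private ReadOnly browser As New WebBrowser()

    Public Sub New(ByVal url As String)
        browser.Dock = DockStyle.Fill
        Controls.Add(browser)
        AddHandler browser.DocumentCompleted, AddressOf OnDocumentCompleted
        browser.Navigate(url)
    End Sub

    Private Sub OnDocumentCompleted(ByVal sender As Object, _
                                    ByVal e As WebBrowserDocumentCompletedEventArgs)
        ' The event is raised for every frame; skip everything except
        ' the top-level document.
        If Not e.Url.Equals(browser.Url) Then Return

        ' Write the href attribute of each link to the debug output.
        For Each link As HtmlElement In browser.Document.Links
            Debug.WriteLine(link.GetAttribute("href"))
        Next
    End Sub

    <STAThread()> _
    Public Shared Sub Main()
        Application.Run(New LinkListerForm("http://www.google.fr"))
    End Sub
End Class
```

The check against browser.Url is one simple way to deal with the "one document per frame" behaviour; a page that needs the frame documents as well would handle every DocumentCompleted call instead of returning early.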
Sorry -- forgot about your parsing issue.
Perhaps you could get the raw HTML using the .NET WebRequest and then
feed that into the mshtml.HTMLDocument object. I've never used that
object before so I'm not sure if you can load it with your own HTML.
--
Patrick Steele
Microsoft .NET MVP http://weblogs.asp.net/psteele
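The first half of Patrick's suggestion, getting the raw HTML with WebRequest, might look like the sketch below. The module and function names (FetchHtml, DownloadHtml) are invented for the example, and the reader assumes the page can be decoded with StreamReader's default encoding (UTF-8), which will not hold for every site.

```vbnet
Imports System
Imports System.IO
Imports System.Net

Module FetchHtml
    ' Downloads the resource at the given URL and returns its body as text.
    ' Assumption: the content can be decoded with StreamReader's default encoding.
    Function DownloadHtml(ByVal url As String) As String
        Dim request As WebRequest = WebRequest.Create(url)
        Dim response As WebResponse = request.GetResponse()
        Try
            Dim reader As New StreamReader(response.GetResponseStream())
            Try
                Return reader.ReadToEnd()
            Finally
                reader.Close()
            End Try
        Finally
            ' Always release the connection, even if reading fails.
            response.Close()
        End Try
    End Function

    Sub Main()
        Dim html As String = DownloadHtml("http://www.google.fr")
        Console.WriteLine("Downloaded " & html.Length & " characters")
    End Sub
End Module
```

Whether and how that string can then be loaded into mshtml.HTMLDocument is left open here, just as it is in the post above.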
Hi Pierre
The problem is that although you create a new mshtml.HTMLDocument, it is not
being initialised.
Try the following:
<code>
' Create the document and initialise it through IPersistStreamInit.
Dim objMSHTML As New mshtml.HTMLDocument
Dim objDocument As mshtml.IHTMLDocument2
Dim ips As IPersistStreamInit
ips = DirectCast(objMSHTML, IPersistStreamInit)
ips.InitNew()
' Start loading the page.
objDocument = objMSHTML.createDocumentFromUrl("http://www.google.fr", vbNullString)
' Pump messages until the document has loaded completely.
Do Until objDocument.readyState = "complete"
    Application.DoEvents()
Loop
Debug.WriteLine(objDocument.body.outerHTML)
</code>
At the end of this you can access the DOM. Note that you need to define the
IPersistStreamInit interface.
HTH
Charles
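Since the original goal was the "links" property, here is a rough sketch of walking that collection once the document is loaded. It assumes a project reference to the Microsoft.mshtml interop assembly; the module and procedure names (LinkDump, DumpLinks) are hypothetical and not part of mshtml.

```vbnet
Imports System.Diagnostics

Module LinkDump
    ' Writes the href of every element in the document's links collection.
    ' Assumes the document's readyState is already "complete".
    Sub DumpLinks(ByVal doc As mshtml.IHTMLDocument2)
        For Each item As Object In doc.links
            Dim element As mshtml.IHTMLElement = DirectCast(item, mshtml.IHTMLElement)
            ' Flag 0 asks for the default, case-insensitive attribute lookup.
            Dim href As Object = element.getAttribute("href", 0)
            If Not href Is Nothing Then
                Debug.WriteLine(href.ToString())
            End If
        Next
    End Sub
End Module
```

With the code above, this would be called as DumpLinks(objDocument) after the readyState loop has finished.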
Pierre
In case you don't have it, here is the IPersistStreamInit interface
definition
<code>
Imports System.Runtime.InteropServices
<ComVisible(True), ComImport(), Guid("7FD52380-4E07-101B-AE2D-08002B2EC713"), _
InterfaceTypeAttribute(ComInterfaceType.InterfaceIsIUnknown)> _
Public Interface IPersistStreamInit
    ' IPersist interface
    Sub GetClassID(ByRef pClassID As Guid)
    <PreserveSig()> Function IsDirty() As Integer
    <PreserveSig()> Function Load(ByVal pstm As UCOMIStream) As Integer
    <PreserveSig()> Function Save(ByVal pstm As UCOMIStream, ByVal fClearDirty As Boolean) As Integer
    <PreserveSig()> Function GetSizeMax(<InAttribute(), Out(), MarshalAs(UnmanagedType.U8)> ByRef pcbSize As Long) As Integer
    <PreserveSig()> Function InitNew() As Integer
End Interface
</code>
HTH
Charles
Charles,
Thanks, saves me a lot of time looking this up.
Cor
Thanks a lot, it works perfectly :)
P.