MSHTML and MSXML in VB6

hi guys,
I need to parse HTML data that I've got from the "Inet" object in VB6.
I have two options for parsing it: one is MSXML and the other is MSHTML.
I tried both of them but didn't get anything out of either. MSXML doesn't
work with some keywords and treats some characters as operators, so I
can't go with MSXML. I tried MSHTML, but it doesn't provide any way to
parse HTML data that you got from another source. There is a factory
method in MSHTML which can be used to get an HTMLDocument, but it has a
problem: if the page you are trying to get through code has script that
sets the focus to a control, the MSHTML object throws an error. The error
comes from inside MSHTML and cannot be handled, and MSHTML doesn't
receive any more data after throwing it.
If anyone has any idea or suggestion on how I can solve this problem,
please do share it with me. this VB6 really sucks.
Thanks,
Lucky

Dec 29 '05 #1
"Lucky" <tu************@gmail.com> wrote:
> i need to parse html data that i've got from "Inet" object in vb6.

This is a VB.NET group. I suggest posting the question to one of the groups
in the "microsoft.public.vb.*" hierarchy.

--
M S Herfried K. Wagner
M V P <URL:http://dotnet.mvps.org/>
V B <URL:http://classicvb.org/petition/>

Dec 29 '05 #2
Lucky,

First, MSXML doesn't work because HTML is not XML, so you will almost
always get a parse error.
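
To illustrate the point that HTML is not XML, here is a minimal sketch in Python (not VB6). The sample markup and the helper names are made up for illustration. It feeds the same tag-soup string to a strict XML parser, which rejects it, and to a lenient HTML parser, which accepts it. MSXML is likewise a strict XML parser, so it fails on this kind of input.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample: acceptable HTML, but not well-formed XML
# (an unclosed <br> and a bare "&").
SAMPLE_HTML = "<html><body><p>Fish &amp; chips<br>Tom & Jerry</p></body></html>"


def xml_parse_error(markup):
    """Return the XML parser's error message, or None if the markup is well-formed."""
    try:
        ET.fromstring(markup)
    except ET.ParseError as exc:
        return str(exc)
    return None


class TagAndTextCollector(HTMLParser):
    """Lenient HTML parser that records start tags and text chunks."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.texts = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)

    def handle_data(self, data):
        self.texts.append(data)


def parse_html(markup):
    """Parse markup leniently and return (start tags, text chunks)."""
    collector = TagAndTextCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags, collector.texts


if __name__ == "__main__":
    print("XML parser error:", xml_parse_error(SAMPLE_HTML))
    print("HTML parser result:", parse_html(SAMPLE_HTML))
```

The XML parser stops at the first well-formedness violation, while the HTML parser simply reports the tags and text it finds.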

As for MSHTML, it really isn't directly exposable from VB, it's more of
a COM interface (as opposed to Automation, which is what VB demands).

To load a page into MSHTML, you would have to somehow provide access to
the IMoniker interface that is returned from a call to the API CreateMoniker
(passing the URL of the source you want to download). Once you do this, you
can pass it to the IPersistMoniker::Load implementation on MSHTML. This
will cause MSHTML to trigger the load and parse the document.

However, doing any of this from VB is *extremely* difficult, and you
should probably use C++ to access these interfaces and perform this work,
exposing it to VB6 in the manner you desire.

Hope this helps.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Dec 29 '05 #3
> As for MSHTML, it really isn't directly exposable from VB, it's more of
> a COM interface (as opposed to Automation, which is what VB demands).
>
> To load a page into MSHTML, you would have to somehow provide access to
> the IMoniker interface that is returned from a call to the API
> CreateMoniker (passing the URL of the source you want to download). Once
> you do this, you can pass it to the IPersistMoniker::Load implementation
> on MSHTML. This will cause MSHTML to trigger the load and parse the
> document.
>
> However, doing any of this from VB is *extremely* difficult, and you
> should probably use C++ to access these interfaces and perform this work,
> exposing it to VB6 in the manner you desire.


Using MSHTML in VB.NET 2002/2005 is *extremely* simple.

http://www.vb-tips.com/default.aspx?...f-56dbb63fdf1c

The same goes for C#, of course.

However, this is of course not VB6.

:-)

Cor
Dec 29 '05 #4
Cor,

Yes, and unfortunately, the OP was looking for a VB6 solution.

Also, the code in the link that you sent is incorrect. Using the write
method of the document is not the correct way to feed content into MSHTML,
although it is a common misconception that it is. With write, you have no
control over the headers that are normally sent back with the content and
that help with the processing of the document, so all of that has to be
inferred.

The correct way to feed source to MSHTML is to create an implementation
of IMoniker and pass that through to IPersistMoniker::Load. Then you can
stream your content from whatever source you like. Additionally, you can
present the source as if it was downloaded from a site or obtained from
some other resource.

--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Dec 29 '05 #5
Nicholas,

> Also, the code in the link that you sent is incorrect

Did you try it? (If not, then please do so before you write things like
this next time.)

When I created it, I tested it before placing it on the website.

About the headers: AFAIK headers are not implemented in the DOM, and
MSHTML represents the DOM (Document Object Model).

However, the part you mentioned is only there to get an MSHTML document.

I was actually busy with MSHTML today, and when you are used to it, it is
extremely easy in .NET. (It is a normal .NET reference, by the way; there
was an error in the documentation part. It is not System.MSHTML but
Microsoft.MSHTML. I have changed that now.)

However, feel free to do it your way. I will stick with this "extremely"
easy way in .NET.

:-))

Cor
Dec 29 '05 #6
Cor,

No, I did not try it, but I do not have to, because I know the
architecture of MSHTML very well (forgive me for saying so).

Yes, the headers are not implemented in the DOM, BUT, how the DOM
interprets the stream of information that is sent to it is in part dictated
by the headers.
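
As a hedged illustration of how a header can dictate the interpretation of a stream, consider the character set. This is only one possible case and was not named in this thread; the sample bytes, the fallback to UTF-8, and the function names are assumptions made for the sketch, which is written in Python rather than against MSHTML. The same bytes decode correctly when the Content-Type header is available and incorrectly when the encoding has to be guessed.

```python
from email.message import Message


def charset_from_header(content_type, default="utf-8"):
    """Extract the charset parameter from a Content-Type header value."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset(default)


def decode_body(body, content_type=None):
    """Decode raw bytes, using the header if present, else guessing UTF-8."""
    if content_type:
        charset = charset_from_header(content_type)
    else:
        charset = "utf-8"  # assumption: the guess a consumer makes without headers
    return body.decode(charset, errors="replace")


# Hypothetical response body, encoded as ISO-8859-1 by the server.
BODY = "<p>café</p>".encode("iso-8859-1")

if __name__ == "__main__":
    print(decode_body(BODY, "text/html; charset=ISO-8859-1"))
    print(decode_body(BODY))  # no header: the accented character is lost
```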

I'll give you an example of what doesn't work with this method.

Say, for example, that the document you have doesn't contain absolute URLs,
but relative ones. When you create a new MSHTML document as in the example
and write the content using doc.write, it assumes a base URL of
"about:blank". It doesn't know how to resolve the relative URLs, and that
will show when you try to access, say, the HREF property on the object
representation of an anchor (A) element.

However, if you use the IMoniker implementation and feed the content
through that, while having the implementation of IMoniker::GetDisplayName
return the URL of the content itself, your URLs in the object model will be
absolute, not relative. The HREF property on the A element will return the
absolute URL, resolved against the base URL (returned from GetDisplayName),
and not a relative one.
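
The effect of knowing the base URL can be sketched outside MSHTML. The following Python example is an analogy only, not the IMoniker mechanism itself: the page content, the example.com base URL, and the helper names are hypothetical. It collects anchor links (the href attribute of A elements) once without a base URL and once with one.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values of <a> elements, resolving them if a base URL is known."""

    def __init__(self, base_url=None):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value is not None:
                if self.base_url:
                    # With a base URL, relative links become absolute.
                    value = urljoin(self.base_url, value)
                self.links.append(value)


def collect_links(html, base_url=None):
    """Return all anchor links found in the given HTML string."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links


# Hypothetical page content containing only relative links.
PAGE = '<a href="docs/intro.html">Intro</a> <a href="/about.html">About</a>'

if __name__ == "__main__":
    print(collect_links(PAGE))
    print(collect_links(PAGE, "http://example.com/site/index.html"))
```

Without a base URL the links stay exactly as written in the markup; with one, each link is resolved to an absolute URL.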

--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Dec 29 '05 #7
Nicholas,

Feel free not to use it. However, here is a short answer that I have not
investigated.

As I understand it, a relative URL is always resolved against the host URL
in the DOM of the document, or of the parent document when frames are used.

However, as I said, it does not bother me if you have a different opinion
from mine.

I see it working.

Cor
Dec 29 '05 #8
"Cor Ligthert [MVP]" <no************@planet.nl> wrote:
> Feel free not to use it, however as a short not investigated answer.
>
> In my idea is a relative URL always related to the Host Url in the DOM of
> the document or in the parent document when frames are used.
>
> However as I said, it does not bother you if you have another opinion as
> me.
>
> I see it working.

Well, I believe it simply depends on what you want to achieve.

--
M S Herfried K. Wagner
M V P <URL:http://dotnet.mvps.org/>
V B <URL:http://classicvb.org/petition/>

Dec 29 '05 #9
Cor,

Don't take offense, as that was not my intent. The link that you
pointed to will work for a good number of situations, but it won't work for
all of them, and I was trying to point out those situations where that is
the case.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

Dec 30 '05 #10
hi guys,
thanks for sharing your knowledge; it was very informative and
wonderful. I want to let Herfried know that I am also a .NET developer
and a regular visitor of this group, as my core expertise is in VB.NET.
At the moment I am stuck with VB6, and I thought only a VB group could
help me out. You can see how much knowledge was shared in response to
my query.
anyway, thanks guys.

Dec 30 '05 #11
Nicholas,

I never took it as an offense. I just did not want people to get the idea
that Ken and I were placing untested samples on our website (which, by the
way, can still have errors because of last-minute changes in the code).

That is why I pointed mainly at the sentence where you said that the code
was incorrect. You did not write that it was incorrect "by my standards".

Luckily (I am thinking only of security here), with this kind of operation
there are a lot of situations in which it will not work.

However, the samples are only meant to show that MSHTML is basically very
simple if you know the DOM. Without that it is extremely difficult to use
(all the different interfaces make it very difficult as well).

The construction to get that page is really not simple if you cannot copy
the code from somewhere, which is what I did, in parts, from all over the
Internet. The only thing I did in that first part of the sample was
assemble it into what was needed and delete the parts that were not needed.

:-)

Cor
Dec 30 '05 #12
