parsing javascript from local html file

Hello Everyone
For a project I am working on, I need to retrieve links from html
documents. The easy part is to obtain 'plain' links like <A
HREF="http://site/path/document">, but when those links are
javascript'ized, the only robust solution needs to load the javascript
and dom document representation in the same way that browsers do. For
example, links in the form:

<A HREF="javascript:function_declared_before("arguments"));>
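The split between the two kinds of link can be sketched as follows. This is only a rough illustration: the function name `classifyLinks` and the sample markup are made up for this example, the regular expression is a simplification and not a real HTML parser, and the sample link uses single quotes inside the attribute so that the markup is well-formed.

```javascript
// Sketch: collect href values from anchor tags and sort them into
// plain URLs and javascript: pseudo-URLs.
// classifyLinks is a hypothetical helper, not part of any library.
function classifyLinks(html) {
  const plain = [];
  const scripted = [];
  // Matches <a ... href="..."> or <a ... href='...'>, case-insensitively.
  // A simplification: unquoted attribute values and comments are ignored.
  const anchorHref = /<a\b[^>]*?\bhref\s*=\s*(?:"([^"]*)"|'([^']*)')/gi;
  let match;
  while ((match = anchorHref.exec(html)) !== null) {
    const href = match[1] !== undefined ? match[1] : match[2];
    if (/^\s*javascript:/i.test(href)) {
      scripted.push(href); // needs script evaluation to resolve
    } else {
      plain.push(href); // usable as-is
    }
  }
  return { plain, scripted };
}

// Made-up sample input mirroring the two link styles described above.
const sample =
  '<A HREF="http://site/path/document">doc</A>' +
  "<A HREF=\"javascript:function_declared_before('arguments');\">js</A>";

const links = classifyLinks(sample);
console.log(links.plain);
console.log(links.scripted);
```

Only the entries in `scripted` need the heavier machinery; the `plain` ones can be followed directly.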

At first I thought that using SpiderMonkey (the Mozilla JavaScript
interpreter) would be enough, but in that case I don't have the
document structure elements (like document, window, document.history,
document.form.element, etc). So I tried parsing the document with a
library to build a tree representation of it, but that leads me back to
the same problem: I have to expose every tree node to the script as a
JavaScript object.
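One way to picture the missing piece is a bare interpreter with hand-written stand-ins for `window` and `document`. The sketch below uses Node's built-in `vm` module in place of SpiderMonkey, which is a substitution made only for illustration. The helper `resolveJavascriptLink` and the body of `function_declared_before` are hypothetical (the real function is not shown in this thread), and the stubs cover nothing beyond `location`. Note that `vm` is not a security boundary, so this should not be pointed at untrusted pages as-is.

```javascript
const vm = require("vm");

// Sketch: run the page's script and the body of a javascript: link in a
// context whose `window` and `document` are minimal stubs, and record every
// URL the script tries to navigate to.
// resolveJavascriptLink is a hypothetical helper for this example.
function resolveJavascriptLink(pageScript, href) {
  const navigations = [];

  // Stub location: assigning location.href records the target URL.
  const location = {};
  Object.defineProperty(location, "href", {
    get: () => "about:blank",
    set: (url) => navigations.push(String(url)),
  });

  // Also catch direct assignment such as `window.location = "..."`.
  const withLocation = (obj) =>
    Object.defineProperty(obj, "location", {
      get: () => location,
      set: (url) => navigations.push(String(url)),
    });

  const context = vm.createContext({
    window: withLocation({}),
    document: withLocation({}),
  });
  const options = { timeout: 1000 }; // stop runaway scripts after 1 second

  vm.runInContext(pageScript, context, options);
  vm.runInContext(href.replace(/^\s*javascript:/i, ""), context, options);
  return navigations;
}

// Assumed page script: what the linked function might look like.
const pageScript = `
  function function_declared_before(path) {
    window.location.href = "http://site/" + path;
  }
`;

const targets = resolveJavascriptLink(
  pageScript,
  'javascript:function_declared_before("path/document");'
);
console.log(targets);
```

A script that touches anything beyond `location` (forms, history, element lookups) will throw against these stubs, which is exactly the problem described above: each DOM node the page uses has to exist as a script-visible object.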

Has anybody here worked on a similar problem? What tools do you
think I should take a look at?

Thanks in advance!

Rodrigo.

Jan 11 '07 #1
Rodrigo Meza said the following on 1/11/2007 2:22 PM:
> Hello Everyone
> For a project I am working on, I need to retrieve links from html
> documents. The easy part is to obtain 'plain' links like <A
> HREF="http://site/path/document">, but when those links are
> javascript'ized, the only robust solution needs to load the javascript
> and dom document representation in the same way that browsers do. For
> example, links in the form:
>
> <A HREF="javascript:function_declared_before("arguments"));>

Links in that form are stupid.

--
Randy
Chance Favors The Prepared Mind
comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/
Jan 12 '07 #2
On Jan 12, 2:38 am, Randy Webb <HikksNotAtH...@aol.com> wrote:
> Rodrigo Meza said the following on 1/11/2007 2:22 PM:
>> Hello Everyone
>> For a project I am working on, I need to retrieve links from html
>> documents. The easy part is to obtain 'plain' links like <A
>> HREF="http://site/path/document">, but when those links are
>> javascript'ized, the only robust solution needs to load the javascript
>> and dom document representation in the same way that browsers do. For
>> example, links in the form:
>> <A HREF="javascript:function_declared_before("arguments"));>
>
> Links in that form are stupid.

I didn't invent them, I just need to parse them :-)
Mar 6 '07 #3
Rodrigo Meza said the following on 3/6/2007 5:35 PM:
> On Jan 12, 2:38 am, Randy Webb <HikksNotAtH...@aol.com> wrote:
>> Rodrigo Meza said the following on 1/11/2007 2:22 PM:
>>> Hello Everyone
>>> For a project I am working on, I need to retrieve links from html
>>> documents. The easy part is to obtain 'plain' links like <A
>>> HREF="http://site/path/document">, but when those links are
>>> javascript'ized, the only robust solution needs to load the javascript
>>> and dom document representation in the same way that browsers do. For
>>> example, links in the form:
>>> <A HREF="javascript:function_declared_before("arguments"));>
>> Links in that form are stupid.
>
> I didn't invent them, I just need to parse them :-)

I feel your pain. But it is still a very bad way to write a link.

--
Randy
Mar 6 '07 #4