parsing javascript from local html file

Hello Everyone
For a project I am working on, I need to retrieve links from HTML
documents. The easy part is obtaining 'plain' links like <A
HREF="http://site/path/document">, but when the links are
JavaScript-based, the only robust solution is to load the JavaScript
and the DOM representation of the document the same way browsers do. For
example, links of the form:

<A HREF="javascript:function_declared_before('arguments');">
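
To make the distinction concrete, here is a minimal sketch of the "easy part": collecting every href from a document and separating plain URLs from javascript: ones. It uses Python's standard html.parser module purely for illustration; the names LinkCollector and collect_links are made up for this example and are not taken from any tool mentioned in the thread.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, split into plain and javascript: links."""

    def __init__(self):
        super().__init__()
        self.plain = []     # ordinary URLs, usable as they are
        self.scripted = []  # href values that use the javascript: scheme

    def handle_starttag(self, tag, attrs):
        # HTMLParser passes tag and attribute names in lower case.
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        if href.strip().lower().startswith("javascript:"):
            self.scripted.append(href)
        else:
            self.plain.append(href)


def collect_links(html):
    """Return (plain_links, scripted_links) found in an HTML string."""
    parser = LinkCollector()
    parser.feed(html)
    parser.close()
    return parser.plain, parser.scripted
```

This only identifies which links are scripted. It does not resolve where a javascript: link would actually lead, which is the hard part described in the rest of this post.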

At first I thought that using SpiderMonkey (the Mozilla JavaScript
interpreter) would be enough, but on its own it doesn't provide the
document structure objects (document, window, document.history,
document.form.element, etc.). So I tried parsing the document with a
library to build a tree representation of it, but that leads back to the
same problem: I would have to expose every tree node as a
JavaScript object.
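
For comparison, here is a rough sketch of a much weaker fallback that avoids the interpreter entirely: statically pulling the function name and quoted string arguments out of a javascript: href with regular expressions. This is not the robust browser-like approach described above. It executes nothing, so it cannot tell what URL the function would navigate to, and it only handles a single simple call. The name parse_js_href and both patterns are assumptions made up for this illustration.

```python
import re

# Matches "javascript:" followed by one call such as name('a', "b");
CALL_RE = re.compile(
    r"^\s*javascript:\s*(?P<name>[A-Za-z_$][\w$.]*)\s*\((?P<args>.*)\)\s*;?\s*$",
    re.IGNORECASE | re.DOTALL,
)

# Matches one single- or double-quoted string literal (escapes are not handled).
STRING_RE = re.compile(r'''(['"])(.*?)\1''')


def parse_js_href(href):
    """Return (function_name, [string_args]), or None if href is not a simple call.

    Non-string arguments (numbers, variables, nested calls) are ignored.
    """
    match = CALL_RE.match(href)
    if match is None:
        return None
    args = [m.group(2) for m in STRING_RE.finditer(match.group("args"))]
    return match.group("name"), args
```

Calling parse_js_href on the example link above would give the function name and its one string argument, but mapping that to a real URL still requires knowing what the function does.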

Has anybody here worked on a similar problem? What tools do you
think I should take a look at?

Thanks in advance!

Rodrigo.

Jan 11 '07 #1
Rodrigo Meza said the following on 1/11/2007 2:22 PM:
> <A HREF="javascript:function_declared_before('arguments');">
Links in that form are stupid.

--
Randy
Chance Favors The Prepared Mind
comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/
Jan 12 '07 #2
On Jan 12, 2:38 am, Randy Webb <HikksNotAtH...@aol.com> wrote:
> Links in that form are stupid.
I didn't invent them, I just need to parse them :-)
Mar 6 '07 #3
Rodrigo Meza said the following on 3/6/2007 5:35 PM:
> I didn't invent them, I just need to parse them :-)
I feel your pain. But it is still a very bad way to write a link.

--
Randy
Mar 6 '07 #4
