Hello Everyone
For a project I am working on, I need to retrieve links from html
documents. The easy part is to obtain 'plain' links like <A
HREF="http://site/path/document">, but when those links are
javascript'ized, the only robust solution needs to load the javascript
and dom document representation in the same way that browsers do. For
example, links in the form:
<A HREF="javascript:function_declared_before('arguments');">
First I thought that using SpiderMonkey (the Mozilla JavaScript
interpreter) would be enough, but in that case I don't have the
document structure elements (like document, window, document.history,
document.form.element, etc.). So I tried parsing the document with a
library to build a tree representation of it, but that leads me back to
the same problem: I have to expose all the tree nodes as
JavaScript entities.
Has anybody here worked on a similar problem? What tools do you
think I should take a look at?
Thanks in advance!
Rodrigo.
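A minimal sketch of the "easy part" described in the post, assuming Python and its standard html.parser module (neither is mentioned in the thread). It collects plain href values and sets aside the javascript: ones as raw script text. The names LinkCollector and collect_links are hypothetical, and the sample markup uses single quotes around the argument as an assumption about what the original link looked like. It does not evaluate any script: resolving such links still needs a JavaScript engine with document and window objects, which is exactly the hard part the post asks about.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags, split into plain and javascript: links."""

    def __init__(self):
        super().__init__()
        self.plain_links = []
        self.script_links = []

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names, so <A HREF=...> matches too.
        if tag != "a":
            return
        for name, value in attrs:
            if name != "href" or not value:
                continue
            href = value.strip()
            if href.lower().startswith("javascript:"):
                # Keep only the script text; it is NOT executed here.
                self.script_links.append(href[len("javascript:"):])
            else:
                self.plain_links.append(href)


def collect_links(html_text):
    """Return (plain_links, script_links) found in html_text."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.plain_links, parser.script_links


sample = """
<A HREF="http://site/path/document">plain</A>
<A HREF="javascript:function_declared_before('arguments');">scripted</A>
"""
plain, scripted = collect_links(sample)
```

Here plain holds the ordinary URLs, while scripted holds the script snippets that would still have to be run in a browser-like environment to find out where they lead.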
Rodrigo Meza said the following on 1/11/2007 2:22 PM:
> Hello Everyone
> For a project I am working on, I need to retrieve links from html
> documents. The easy part is to obtain 'plain' links like <A
> HREF="http://site/path/document">, but when those links are
> javascript'ized, the only robust solution needs to load the javascript
> and dom document representation in the same way that browsers do. For
> example, links in the form:
> <A HREF="javascript:function_declared_before('arguments');">
Links in that form are stupid.
--
Randy
Chance Favors The Prepared Mind
comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/
On Jan 12, 2:38 am, Randy Webb <HikksNotAtH...@aol.com> wrote:
> Rodrigo Meza said the following on 1/11/2007 2:22 PM:
>> Hello Everyone
>> For a project I am working on, I need to retrieve links from html
>> documents. The easy part is to obtain 'plain' links like <A
>> HREF="http://site/path/document">, but when those links are
>> javascript'ized, the only robust solution needs to load the javascript
>> and dom document representation in the same way that browsers do. For
>> example, links in the form:
>> <A HREF="javascript:function_declared_before('arguments');">
> Links in that form are stupid.
I didn't invent them, I just need to parse them :-)
Rodrigo Meza said the following on 3/6/2007 5:35 PM:
> On Jan 12, 2:38 am, Randy Webb <HikksNotAtH...@aol.com> wrote:
>> Rodrigo Meza said the following on 1/11/2007 2:22 PM:
>>> Hello Everyone
>>> For a project I am working on, I need to retrieve links from html
>>> documents. The easy part is to obtain 'plain' links like <A
>>> HREF="http://site/path/document">, but when those links are
>>> javascript'ized, the only robust solution needs to load the javascript
>>> and dom document representation in the same way that browsers do. For
>>> example, links in the form:
>>> <A HREF="javascript:function_declared_before('arguments');">
>> Links in that form are stupid.
> I didn't invent them, I just need to parse them :-)
I feel your pain. But it is still a very bad way to write a link.
--
Randy
Chance Favors The Prepared Mind
comp.lang.javascript FAQ - http://jibbering.com/faq/index.html
Javascript Best Practices - http://www.JavascriptToolbox.com/bestpractices/