XPath on web page with JavaScript

I have to run an XPath query on a web page whose JavaScript (using
XMLHttpRequest) has to be executed first. I have no idea how to do that.
Any pointers are welcome.
Sep 9 '08 #1
kaer wrote:
> I have to run an XPath query on a web page whose JavaScript (using
> XMLHttpRequest) has to be executed first. I have no idea how to do that.

I have no idea what you mean, and I do know my way around XPath and
XMLHttpRequest.

<http://jibbering.com/faq/>
<http://catb.org/~esr/faqs/smart-questions.html>
<http://developer.mozilla.org/en/docs/XPath>
<http://developer.mozilla.org/en/docs/AJAX>
PointedEars
--
Prototype.js was written by people who don't know javascript for people
who don't know javascript. People who don't know javascript are not
the best source of advice on designing systems that use javascript.
-- Richard Cornford, cljs, <f8*******************@news.demon.co.uk>
Sep 9 '08 #2
On 9 sep, 22:08, Thomas 'PointedEars' Lahn <PointedE...@web.de> wrote:
> I have no idea what you mean, and I do know my way around XPath and
> XMLHttpRequest.

I would like to be able to do something like:

  import AbstractBrowser
  browser = AbstractBrowser.Browser()
  browser.goto('www.somesitewithajax.com', wait_until_onload_done=True)
  what_i_want = browser.XPath('/html/body/center/div/div[2]/div/div/div/div[3]/div[2]/div[2]/div')

This is pseudo-code (it could be Python, but that is not important) for a
pseudo-library (in whatever language; an application could do the job as
well), just to show what I want to do and what I am looking for.
Sep 9 '08 #3
kaer wrote:
> I would like to be able to do something like:
>
>   import AbstractBrowser
>   browser = AbstractBrowser.Browser()
>   browser.goto('www.somesitewithajax.com', wait_until_onload_done=True)
>   what_i_want = browser.XPath('/html/body/center/div/div[2]/div/div/div/div[3]/div[2]/div[2]/div')
>
> This is pseudo-code (it could be Python, but that is not important) for a
> pseudo-library (in whatever language; an application could do the job as
> well), just to show what I want to do and what I am looking for.

For executing an ECMAScript program that makes use of an XHR implementation,
you need an environment which supports that.

With the Gecko DOM API, you can do:

  var iframe = document.body.appendChild(document.createElement("iframe"));
  if (iframe)
  {
    iframe.addEventListener("load",
      function() {
        var d = this.contentDocument;
        var what_you_want =
          d.evaluate('/html/body/...', d.documentElement, null, 0, null);
      }, false);

    iframe.contentWindow.location = "http://www.somesitewithajax.com/";
  }

If `www.somesitewithajax.com' is not the same domain as the domain of the
URI of the accessing document resource, or the protocols or ports differ,
you will need HTTP proxying that fetches the content, because the Same
Origin Policy (SOP) will prevent access to the iframe document otherwise.
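
The same-origin condition described here (same protocol, same host, same port) can be checked before attempting the iframe approach. The following is only a minimal sketch: it assumes an environment that provides the standard `URL` constructor, which browsers of 2008 did not, and `isSameOrigin` is a hypothetical helper name, not part of any API mentioned in this thread.

```javascript
// Hypothetical helper (assumption, not from the thread): returns true when
// both URLs share protocol, host name, and port, i.e. when the iframe
// document would remain accessible to the embedding page's scripts.
function isSameOrigin(pageUrl, targetUrl) {
  var page = new URL(pageUrl);
  var target = new URL(targetUrl);
  // URL normalizes default ports (80 for http, 443 for https) to "".
  return page.protocol === target.protocol &&
         page.hostname === target.hostname &&
         page.port === target.port;
}
```

If this returns false for the embedding page and the target site, the proxying route mentioned above is the one to look at.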
PointedEars
--
Anyone who slaps a 'this page is best viewed with Browser X' label on
a Web page appears to be yearning for the bad old days, before the Web,
when you had very little chance of reading a document written on another
computer, another word processor, or another network. -- Tim Berners-Lee
Sep 9 '08 #4
Thomas 'PointedEars' Lahn wrote:
> [...]
> With the Gecko DOM API, you can do:
>
>   var iframe = document.body.appendChild(document.createElement("iframe"));
>   if (iframe)
>   {
>     iframe.addEventListener("load",
>       function() {
>         var d = this.contentDocument;
>         var what_you_want =
>           d.evaluate('/html/body/...', d.documentElement, null, 0, null);
>       }, false);
>
>     iframe.contentWindow.location = "http://www.somesitewithajax.com/";
>   }

Reviewing this, it is not going to work this way.

1. Loading the iframe means nothing about loading the iframe document.

2. The context node must be `d', not `d.documentElement', for `/html'
   to work.

3. Loading the iframe document means nothing about the *A*JAX code to be
   done modifying the document tree, for the very point of it is that it
   is *asynchronous*.

While having the evaluation code be executed through window.setTimeout()
is a possibility, the reliable but non-trivial way would be to tap into
the `onreadystatechange' listener of the XHR object. The issue then is
to find the name of the property that refers to that object.
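
One possible way around having to find the property that refers to the XHR object is to wrap `send` on the constructor's prototype, so that every request is observed regardless of where the page stores it. This is a sketch under assumptions, not a method proposed in this thread: it assumes the XHR implementation supports `addEventListener` and fires a `loadend` event (as current browsers do), and `watchRequests` and `onIdle` are hypothetical names. In the iframe scenario it would have to be applied to `iframe.contentWindow.XMLHttpRequest` before the page's own scripts send anything, which may not always be feasible.

```javascript
// Hypothetical sketch: wraps XhrCtor.prototype.send so that every request is
// counted, and calls onIdle() each time the number of requests in flight
// drops back to zero. Returns a function that restores the original send().
function watchRequests(XhrCtor, onIdle) {
  var originalSend = XhrCtor.prototype.send;
  var pending = 0;

  XhrCtor.prototype.send = function () {
    pending++;
    // Assumption: "loadend" fires once per request, after success, error,
    // or abort.
    this.addEventListener("loadend", function () {
      pending--;
      if (pending === 0) {
        onIdle();
      }
    });
    return originalSend.apply(this, arguments);
  };

  return function restore() {
    XhrCtor.prototype.send = originalSend;
  };
}
```

The `onIdle` callback is where the XPath evaluation would go. A page that starts a new request some time after the previous one finished will trigger `onIdle` more than once, so the callback should tolerate repeated calls.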

So it would be better, though still open to improvement, if you used

  var iframe = document.body.appendChild(document.createElement("iframe"));
  if (iframe)
  {
    var d = iframe.contentDocument;
    d.addEventListener("load",
      function() {
        var t = window.setTimeout(
          function() {
            window.clearTimeout(t);

            var what_you_want =
              d.evaluate('/html/body/...', d, null, 0, null);
          },
          1000);
      },
      false);

    iframe.contentWindow.location = "http://www.somesitewithajax.com/";
  }

And then there is still the issue of frame-breaking scripts running on that
site.
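
For reference, here is a variant of the snippet above that listens for `load` on the iframe element and reads `contentDocument` only inside the handler, since the document object that exists before navigation is not the one the loaded page ends up with. It is a sketch under the same same-origin assumption; `collectNodes` and `queryAfterLoad` are hypothetical names, and the delay remains a guess rather than a guarantee that the asynchronous code has finished.

```javascript
// Runs an XPath expression against doc, using doc itself as the context
// node, and returns the matching nodes as an array.
// 7 is the value of XPathResult.ORDERED_NODE_SNAPSHOT_TYPE.
function collectNodes(doc, expression) {
  var snapshot = doc.evaluate(expression, doc, null, 7, null);
  var nodes = [];
  for (var i = 0; i < snapshot.snapshotLength; i++) {
    nodes.push(snapshot.snapshotItem(i));
  }
  return nodes;
}

// Hypothetical helper: loads url in a new iframe, waits delayMs after the
// iframe's load event, then passes the matching nodes to onResult.
function queryAfterLoad(hostDocument, url, expression, delayMs, onResult) {
  var iframe = hostDocument.createElement("iframe");
  iframe.addEventListener("load", function () {
    setTimeout(function () {
      // Read contentDocument here: by now it is the loaded page's document.
      onResult(collectNodes(iframe.contentDocument, expression));
    }, delayMs);
  }, false);
  iframe.src = url;
  hostDocument.body.appendChild(iframe);
  return iframe;
}
```

A call would look like `queryAfterLoad(document, "http://www.somesitewithajax.com/", "/html/body/...", 1000, callback)`, with the XPath expression filled in as needed.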
PointedEars
Sep 10 '08 #5
Many thanks for that. I will dig deeper into all of this as soon as I
can. I have a lot to learn, but it is very interesting anyway.

I am surprised, though, that there are no libraries or applications doing
that. If you think about it, what I need is just a browser without the
display stuff BUT with the ability to call functions giving back the
document tree in its current state.

Thanks again.
Sep 10 '08 #6
kaer wrote:
> Thomas 'PointedEars' Lahn wrote:
>> [...]
>> And then there is still the issue of frame-breaking scripts running on that
>> site.

This can be worked around if one does not use an iframe but a popup
window/tab. It would move the issue from the target (Web site) to the
source (client), though, where there may be popup blockers.

> [...]
> Many thanks for that. I will dig deeper into all of this as soon as I
> can. I have a lot to learn, but it is very interesting anyway.

You are welcome.

> I am surprised, though, that there are no libraries or applications doing
> that.

It may be because (white-hat) hackers would not support the idea of using
content that they did not create without permission; generally, this is a
copyright/author's rights issue. (IANAL; you have been warned.)

> If you think about it, what I need is just a browser without the
> display stuff BUT with the ability to call functions giving back the
> document tree in its current state.

It would appear that those specifications are mutually exclusive. You can
certainly parse the `responseText' into a Document object, however ISTM you
need the "display stuff" for the tree-manipulating script code to be
executed in the context of the represented document, generally.

While I can think of a hack that evaluates all script code in the document
this way, it remains to be seen how adaptive and cross-browser such a
solution would be. For example, if the XHR code used the value of the
`offsetWidth' property to determine whether or not an element should be
created or modified in, or removed from the document tree, would that value
make sense when the element in question is not displayed?
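
To illustrate the part about parsing `responseText` into a Document object: the sketch below assumes a browser that provides `DOMParser` with `text/html` support, which postdates this thread, and `queryStaticMarkup` is a hypothetical name. Scripts in a document created this way are not executed, so the result reflects only the markup as delivered by the server, which is exactly the limitation described above.

```javascript
// Hypothetical helper: parses markup into a Document without executing its
// scripts, then returns the string value of the XPath expression.
// 2 is the value of XPathResult.STRING_TYPE.
// Parser defaults to the browser's DOMParser; it is a parameter only so that
// the function can be exercised outside a browser (an assumption made for
// this sketch).
function queryStaticMarkup(responseText, expression, Parser) {
  var ParserCtor = Parser || DOMParser;
  var doc = new ParserCtor().parseFromString(responseText, "text/html");
  return doc.evaluate(expression, doc, null, 2, null).stringValue;
}
```

In a browser, `queryStaticMarkup(xhr.responseText, "string(/html/head/title)")` would return the title as written in the fetched markup, not one that a script might set later.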

Please trim your quotes and do not quote signatures.
PointedEars
Sep 10 '08 #7
"kaer" <ka*******@gmail.com> wrote in message
news:e3**********************************@e53g2000hsa.googlegroups.com...
> I am surprised, though, that there are no libraries or applications doing
> that. If you think about it, what I need is just a browser without the
> display stuff BUT with the ability to call functions giving back the
> document tree in its current state.

I did not say this, but have a look at jQuery; it may help you. Still, there
is no substitute for knowing things from the ground up. jQuery has an XPath
plug-in; I am not sure how good or how well documented it is.

Try Googling for "jQuery" and "jQuery XPath".

Good luck,

Aaron
Sep 10 '08 #8
