extracting part of a document

the purpose:

strip all the banners and other useless content from an html document,
leaving everything from the start of the document up to the body intact,
and keeping inside the body only the part the user has clicked on
(mousedown -- mousemove -- mouseup).

for example, a schematic document as input:

<html><title>...<meta><link to css, javascript etc.>
<body...>
<div id="one">div one contents </div>
<div id="two">div two contents </div>
<div id="three">div three contents </div>
</body>
</html>

suppose the user pressed and released the mouse inside div "two"; i want to
transform the given document (in place) into:

<html><title>...<meta><link to css, javascript etc.>
<body...>
<div id="two">div two contents </div>
</body>
</html>

leaving only div two inside the body.

i've started working on that (not successfully),

following:
<http://www.quirksmode.org/js/events_mouse.html>
and:
<http://www.quirksmode.org/dom/getElementsByTagNames.html>

from which i took a function that looked useful to me:

--- getElementsByTagNames(list,obj) ---
function getElementsByTagNames(list,obj) {
  nb_calls++;
  if (!obj) var obj = document.body;
  var tagNames = list.split(',');
  var resultArray = new Array();
  var tags;
  for (var i=0;i<tagNames.length;i++) {
    tags = obj.getElementsByTagName(tagNames[i]);
    nb_tags+=tags.length;
    for (var j=0;j<tags.length;j++) {
      resultArray.push(tags[j]);
      nb_loop++;
    }
  }
  return resultArray;
}
-----------------------------------------------------

here are the problems i get:

i use an html page like the one described above, with 5 divs inside the
body to simulate a "real life" document:
div banner, div left, div extract, div right and div footer.

the dom structure of div extract is:

--- div#extract ----------------------------------------
<div id="extract">
<h3 id="click">Mousedown, mouseup, click</h3>

<p>...</p>

<ol>
<li><code>...</code>...</li>
<li><code>...</code>,...</li>
<li><code>...</code>...
<code>...</code>...<code>...</code>...</li>
</ol>

<p>... <code>...</code>...<code>...</code>
....<code>...</code>...<code>...</code>....<code>...</code>...</p>

<p>...<code>...</code>... <code>...</code>...</p>

<p>...<code>...</code>...<code>click</code>...</p>

<p>...</p>
</div>
-----------------------------------------------------------
if, on this div extract, i do:

extract=document.getElementById("extract")
then:
elts_extract=getElementsByTagNames(the_list,extract);
with:
var the_list="div, h1, h2, h3, h4, h5, p, img, ul, li, table, pre"

i get NO elements at all from the function
"getElementsByTagNames(list,obj)"

note that the vars nb_calls, nb_tags and nb_loop are there only
for debugging.

in case someone can shed some light on that...
Oct 17 '06 #1
Une Bévue <pe*******@laponie.com.invalid> wrote:
> the purpose:
> <snip/>
> in case someone can shed some light on that...

found a solution, here it is:
<http://thoraval.yvon.free.fr/JavaScript/extract.html>
Oct 17 '06 #2

Une Bévue wrote:
> the purpose:
>
> strip all the banners and other useless content from an html document,
> leaving everything from the start of the document up to the body intact,
> and keeping inside the body only the part the user has clicked on
> (mousedown -- mousemove -- mouseup).
>
> for example, a schematic document as input:
>
> <html><title>...<meta><link to css, javascript etc.>
> <body...>
> <div id="one">div one contents </div>
> <div id="two">div two contents </div>
> <div id="three">div three contents </div>
> </body>
> </html>
>
> suppose the user pressed and released the mouse inside div "two"; i want to
> transform the given document (in place) into:
>
> <html><title>...<meta><link to css, javascript etc.>
> <body...>
> <div id="two">div two contents </div>
> </body>
> </html>
>
> leaving only div two inside the body.
>
> i've started working on that (not successfully)

That's enough. Once you have a reference to div two, you can remove
all the body's child nodes, then re-attach div two:

function trimBody (htmlElement){
  var docBody = document.body;
  while ( docBody.firstChild ){
    docBody.removeChild( docBody.firstChild );
  }
  docBody.appendChild( htmlElement );
}

Your method of getting elements by tag name gives you a set of node
collections: you have destroyed the structure and don't know how
to put it back. The above maintains the structure:

<script type="text/javascript">

function trimBody(htmlElement){
  var docBody = document.body;
  while (docBody.firstChild){
    docBody.removeChild(docBody.firstChild);
  }
  docBody.appendChild(htmlElement);
}

</script>
<body>
  <div id="one" onclick="trimBody(this);">
    <p>Click here to keep just div <b>one</b></p>
  </div>
  <div id="two" onclick="trimBody(this);">
    <p>Click here to keep just div <b>two</b></p>
  </div>
  <div id="three" onclick="trimBody(this);">
    <p>Click here to keep just div <b>three</b></p>
  </div>
</body>
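
On an arbitrary page the divs will not carry their own onclick attributes, so the clicked node first has to be mapped to the body child that contains it. A minimal sketch of that step, assuming a single listener on the body; `topLevelChild` is a hypothetical helper name and the wiring is an illustration, not something taken from the thread:

```javascript
// Hypothetical helper (not from the thread): climb from the clicked node to
// the ancestor that is a direct child of `container`.
function topLevelChild(node, container) {
  var current = node;
  while (current && current.parentNode !== container) {
    current = current.parentNode;
  }
  return current || null; // null when node is not inside container
}

// Assumed wiring, browser only, and only once document.body exists:
// one listener on the body replaces the per-div onclick attributes.
// trimBody is the function from the example above.
if (typeof document !== "undefined" && document.body) {
  document.body.onmouseup = function (e) {
    e = e || window.event;                 // old IE event model
    var target = e.target || e.srcElement;
    var keep = topLevelChild(target, document.body);
    if (keep && typeof trimBody === "function") trimBody(keep);
  };
}
```

A release on the body itself yields null, so nothing is trimmed in that case.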

--
Rob

Oct 17 '06 #3
RobG <rg***@iinet.net.au> wrote:
> That's enough. Once you have a reference to div two, you can remove
> all the body's child nodes, then re-attach div two:
> <snip/>

yes, fine, thanks. i had found it already, see my own answer above in
this thread...
Oct 17 '06 #4

Une Bévue wrote:
> RobG <rg***@iinet.net.au> wrote:
>
>> That's enough. Once you have a reference to div two, you can remove
>> all the body's child nodes, then re-attach div two:
>> <snip/>
>
> yes, fine, thanks. i had found it already, see my own answer above in
> this thread...

OK, but don't believe the junk about a "Safari bug": any node that
supports the event interface can be an event target, it's just that
WebKit browsers have implemented it on text nodes where other browsers
haven't.
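
Code that expects an element can allow for a text-node target by stepping up one level. A minimal sketch, assuming the handler only cares about the enclosing element; `elementFromTarget` is a hypothetical name:

```javascript
// Hypothetical helper: return the element for an event target, stepping up
// one level when the target is a text node (nodeType 3).
var TEXT_NODE = 3;

function elementFromTarget(target) {
  if (target && target.nodeType === TEXT_NODE) {
    return target.parentNode;
  }
  return target;
}
```

In a handler this would be called as `elementFromTarget(e.target || e.srcElement)` before looking at the node's id or tag name.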

I think my method of removing child nodes is more efficient... you are
free to choose. :-)

--
Rob

Oct 17 '06 #5
RobG <rg***@iinet.net.au> wrote:
> I think my method of removing child nodes is more efficient... you are
> free to choose. :-)

yes, fine, i'm able to change my mind ;-)

you're right, your while loop is cleverer than my for loop, i do agree, but
i don't understand why my version would have "destroyed the structure"?

i think i noticed that i did that in reverse order (from last to first)???

new version online:
<http://thoraval.yvon.free.fr/JavaScript/trim_body.html>
Oct 17 '06 #6
Une Bévue wrote:
> RobG <rg***@iinet.net.au> wrote:
>> I think my method of removing child nodes is more efficient... you are
>> free to choose. :-)
>
> yes, fine, i'm able to change my mind ;-)
>
> you're right, your while loop is cleverer than my for loop, i do agree, but
> i don't understand why my version would have "destroyed the structure"?

Using getElementsByTagNames created an array of elements that did not
have the same structure as the original document. I assumed you were going
to just put them back in the same order as the array, not where they
started from.

It failed because your list of tag names was:

"div, h1, h2, h3, ..."

There were no divs inside div extract, and the other tag names have leading
spaces, so you were trying to match " h1" rather than "h1", etc.
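
The leading-space problem can be avoided by trimming each name after the split. A minimal sketch, assuming the rest of the function stays as posted (minus the debug counters); `parseTagList` is a hypothetical helper name:

```javascript
// Split a comma-separated tag list and strip the whitespace around each name,
// so "div, h1" yields ["div", "h1"] rather than ["div", " h1"].
function parseTagList(list) {
  var parts = list.split(",");
  var names = [];
  for (var i = 0; i < parts.length; i++) {
    var name = parts[i].replace(/^\s+|\s+$/g, ""); // trim both ends
    if (name) names.push(name);                    // skip empty entries
  }
  return names;
}

// Same logic as the posted function, without the debug counters,
// looking up the cleaned names.
function getElementsByTagNames(list, obj) {
  var root = obj || document.body;
  var tagNames = parseTagList(list);
  var resultArray = [];
  for (var i = 0; i < tagNames.length; i++) {
    var tags = root.getElementsByTagName(tagNames[i]);
    for (var j = 0; j < tags.length; j++) {
      resultArray.push(tags[j]);
    }
  }
  return resultArray;
}
```

Writing the list without spaces ("div,h1,h2,...") would work just as well with the original function.
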
> i think i noticed that i did that in reverse order (from last to first)???

It seems you were going from 0 to the i'th, but that is not relevant. The
array of elements does not have the same structure as the original HTML:
that structure has been destroyed by collecting all the elements with the
same tag name together.
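
If a flat list is still wanted, a single walk over the subtree at least keeps the matches in document order instead of grouping them by tag name. A minimal sketch of that alternative; `collectInDocumentOrder` is a hypothetical name and this is not the approach used in the thread:

```javascript
// Hypothetical alternative: walk the subtree once and keep matching elements
// in document order, instead of grouping them by tag name.
function collectInDocumentOrder(root, tagNames) {
  var wanted = {};
  for (var i = 0; i < tagNames.length; i++) {
    wanted[tagNames[i].toLowerCase()] = true;
  }
  var result = [];
  (function walk(node) {
    for (var child = node.firstChild; child; child = child.nextSibling) {
      if (child.nodeType === 1) {          // element nodes only
        if (wanted[child.tagName.toLowerCase()] === true) result.push(child);
        walk(child);                       // descend before moving to the next sibling
      }
    }
  })(root);
  return result;
}
```

The result is still flat, so the nesting is lost either way; moving the whole div, as trimBody does, avoids the problem entirely.
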
>
> new version online:
> <http://thoraval.yvon.free.fr/JavaScript/trim_body.html>

Works fine (even in Safari) :-)

--
Rob
Oct 17 '06 #7
RobG <rg***@iinet.net.au> wrote:
> It seems you were going from 0 to the i'th, but that is not relevant. The
> array of elements does not have the same structure as the original HTML:
> that structure has been destroyed by collecting all the elements with the
> same tag name together.

ok, now i understand what you mean.

>> new version online:
>> <http://thoraval.yvon.free.fr/JavaScript/trim_body.html>
>
> Works fine (even in Safari) :-)

yes, i even tried it with the latest WebKit nightly build (it doesn't give
the same height for the divs...)

right now i have to write a ruby script to put this line
somewhere in the head:

<script type="text/javascript" src="js/trim_body.js"></script>

not a big deal.

and then test it on any given page...
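
The injection step itself is a small string operation. A minimal sketch, written in JavaScript rather than Ruby to match the rest of the thread, and assuming the page contains a literal closing head tag; `injectScriptTag` is a hypothetical name:

```javascript
// Hypothetical helper: insert a script tag just before the first </head>.
// Plain string handling, assuming the page has a literal closing head tag.
function injectScriptTag(html, src) {
  var tag = '<script type="text/javascript" src="' + src + '"></script>\n';
  var match = /<\/head\s*>/i.exec(html);
  if (!match) return html;               // no head found: leave the page untouched
  return html.slice(0, match.index) + tag + html.slice(match.index);
}
```

Reading the page from disk and writing it back is left out; only the insertion is shown.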
Oct 17 '06 #8
