extracting part of a document

Une Bévue

The purpose:

Strip the banners and other useless content from an HTML document, leaving
everything from the start of the document up to the body intact, and keeping
inside the body only the part the user has clicked (mousedown, mousemove,
mouseup).

For example, take this schematic document as input:

<html><title>...<meta><link to CSS, JavaScript, etc.>
<body...>
<div id="one">div one contents </div>
<div id="two">div two contents </div>
<div id="three">div three contents </div>
</body>
</html>

Suppose the user pressed and released the mouse button inside div "two". I want
to transform the given document (in place) into:

<html><title>...<meta><link to CSS, JavaScript, etc.>
<body...>
<div id="two">div two contents </div>
</body>
</html>

leaving only div two inside the body.

I've started working on this (not successfully), following:
<http://www.quirksmode.org/js/events_mouse.html>
and:
<http://www.quirksmode.org/dom/getElementsByTagNames.html>

from which I extracted a function that looks useful to me:

--- getElementsByTagNames(list,obj) ---
// list: comma-separated tag names; obj: root element (defaults to document.body).
// nb_calls, nb_tags and nb_loop are debugging counters, assumed to be
// declared as globals elsewhere.
function getElementsByTagNames(list, obj) {
  nb_calls++;
  if (!obj) obj = document.body;
  var tagNames = list.split(',');
  var resultArray = [];
  var tags;
  for (var i = 0; i < tagNames.length; i++) {
    tags = obj.getElementsByTagName(tagNames[i]);
    nb_tags += tags.length;
    for (var j = 0; j < tags.length; j++) {
      resultArray.push(tags[j]);
      nb_loop++;
    }
  }
  return resultArray;
}
-----------------------------------------------------

Here are the problems I get:

I have an HTML page like the one described above, with 5 divs inside the body
to simulate a "real life" document:
div banner, div left, div extract, div right and div footer.

The DOM structure of div extract is:

--- div#extract ----------------------------------------
<div id="extract">
<h3 id="click">Mousedown, mouseup, click</h3>

<p>...</p>

<ol>
<li><code>...</code>...</li>
<li><code>...</code>,...</li>
<li><code>...</code>...
<code>...</code>...<code>...</code>...</li>
</ol>

<p>... <code>...</code>...<code>...</code>
....<code>...</code>...<code>...</code>....<code>...</code>...</p>

<p>...<code>...</code>... <code>...</code>....</p>

<p>...<code>...</code>...<code>click</code>...</p>

<p>...</p>
</div>
-----------------------------------------------------------
If, on this div extract, I do:

extract = document.getElementById("extract");

then:

elts_extract = getElementsByTagNames(the_list, extract);

with:

var the_list = "div, h1, h2, h3, h4, h5, p, img, ul, li, table, pre";

I get NO elements at all from the function
"getElementsByTagNames(list,obj)".

Note that the variables nb_calls, nb_tags and nb_loop are there only
for debugging.

In case someone can shed some light on that...

Oct 17 '06 #1

Une Bévue

Une Bévue <pe*******@laponie.com.invalid> wrote:

> The purpose:

<snip/>

> In case someone can shed some light on that...

Found a solution; here it is:
<http://thoraval.yvon.free.fr/JavaScript/extract.html>

Oct 17 '06 #2

RobG

Une Bévue wrote:

> The purpose:
>
> Strip the banners and other useless content from an HTML document, leaving
> everything from the start of the document up to the body intact, and keeping
> inside the body only the part the user has clicked (mousedown, mousemove,
> mouseup).
>
> For example, take this schematic document as input:
>
> <html><title>...<meta><link to CSS, JavaScript, etc.>
> <body...>
> <div id="one">div one contents </div>
> <div id="two">div two contents </div>
> <div id="three">div three contents </div>
> </body>
> </html>
>
> Suppose the user pressed and released the mouse button inside div "two". I want
> to transform the given document (in place) into:
>
> <html><title>...<meta><link to CSS, JavaScript, etc.>
> <body...>
> <div id="two">div two contents </div>
> </body>
> </html>
>
> leaving only div two inside the body.
>
> I've started working on this (not successfully)

That's enough. Once you have a reference to div two, you can remove
all the body's child nodes, then re-attach div two:

function trimBody (htmlElement){
  var docBody = document.body;
  while ( docBody.firstChild ){
    docBody.removeChild( docBody.firstChild );
  }
  docBody.appendChild( htmlElement );
}

Your method of getting elements by tag name will result in a set of
node collections: you have destroyed the structure and don't know how
to put it back. The above maintains the structure:

<script type="text/javascript">

function trimBody(htmlElement){
  var docBody = document.body;
  // Remove every child of the body...
  while (docBody.firstChild){
    docBody.removeChild(docBody.firstChild);
  }
  // ...then put back the one element to keep.
  docBody.appendChild(htmlElement);
}

</script>
<body>
<div id="one" onclick="trimBody(this);">
  <p>Click here to keep just div <b>one</b></p>
</div>
<div id="two" onclick="trimBody(this);">
  <p>Click here to keep just div <b>two</b></p>
</div>
<div id="three" onclick="trimBody(this);">
  <p>Click here to keep just div <b>three</b></p>
</div>
</body>

--
Rob

Oct 17 '06 #3

Une Bévue

RobG <rg***@iinet.net.au> wrote:

> That's enough. Once you have a reference to div two, you can remove
> all the body's child nodes, then re-attach div two:

<snip/>

Yes, fine, thanks. I found it myself; see my own answer above in this
thread...

Oct 17 '06 #4

RobG

Une Bévue wrote:

> RobG <rg***@iinet.net.au> wrote:
>
>> That's enough. Once you have a reference to div two, you can remove
>> all the body's child nodes, then re-attach div two:
>
> <snip/>
>
> Yes, fine, thanks. I found it myself; see my own answer above in this
> thread...

OK, but don't believe the junk about a "Safari bug": any node that
supports the event interface can be an event target. It's just that
WebKit browsers have implemented it on text nodes where other browsers
haven't.
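
Since the event target may be a text node in WebKit, a handler has to step up to the enclosing element before deciding what to keep. Here is a minimal sketch of that, assuming the goal is to keep the direct child of the body that contains the click; `topLevelAncestor` is a hypothetical helper name, not something from the linked pages.

```javascript
// Hypothetical helper: returns the direct child of `container` that
// contains `node` (or is `node`), or null if `node` is not inside it.
function topLevelAncestor(node, container) {
  // WebKit may report a text node (nodeType 3) as the event target;
  // step up to its parent element first.
  if (node && node.nodeType === 3) {
    node = node.parentNode;
  }
  // Climb until the parent is the container, or we run out of ancestors.
  while (node && node.parentNode !== container) {
    node = node.parentNode;
  }
  return node || null;
}

// Possible wiring in a browser (trimBody is the function posted above):
// document.onmouseup = function (e) {
//   e = e || window.event;
//   var target = e.target || e.srcElement;
//   var keep = topLevelAncestor(target, document.body);
//   if (keep) trimBody(keep);
// };
```

With this, one handler on the document can replace the `onclick` attribute on every div, and a click on the body itself (outside any div) is simply ignored.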

I think my method of removing child nodes is more efficient... you are
free to choose. :-)

--
Rob

Oct 17 '06 #5

Une Bévue

RobG <rg***@iinet.net.au> wrote:

> I think my method of removing child nodes is more efficient... you are
> free to choose. :-)

Yes, fine, I'm able to change my mind ;-)

Yes, right, your while loop is cleverer than my for loop, I agree, but
I don't understand why my version would have "destroyed the structure"?

I think I noticed that I did that in reverse order (from last to first)???

New version online:
<http://thoraval.yvon.free.fr/JavaScript/trim_body.html>

Oct 17 '06 #6

RobG

Une Bévue wrote:

> RobG <rg***@iinet.net.au> wrote:
>
>> I think my method of removing child nodes is more efficient... you are
>> free to choose. :-)
>
> Yes, fine, I'm able to change my mind ;-)
>
> Yes, right, your while loop is cleverer than my for loop, I agree, but
> I don't understand why my version would have "destroyed the structure"?

Using getElementsByTagNames created an array of elements that was not
the same as the original structure. I assumed you were going to just put
them back in the same order as the array, not where they started from.

It failed because your list of tag names was:

"div, h1, h2, h3, ..."

There were no divs inside div extract, and the other tags have leading
spaces, so you were trying to match " h1" rather than "h1", etc.
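
A minimal sketch of one way around the leading spaces, assuming you want to keep the "div, h1, h2" style of list: trim each name after splitting. This is a variant of the posted function, not the quirksmode original, and the debugging counters are left out.

```javascript
// Variant of getElementsByTagNames that tolerates spaces around the commas.
function getElementsByTagNames(list, obj) {
  // Fall back to the body when no root element is given.
  var root = obj || document.body;
  var tagNames = list.split(',');
  var resultArray = [];
  for (var i = 0; i < tagNames.length; i++) {
    // Strip leading/trailing whitespace so " h1" becomes "h1".
    var name = tagNames[i].replace(/^\s+|\s+$/g, '');
    if (!name) continue; // skip empty entries such as a trailing comma
    var tags = root.getElementsByTagName(name);
    for (var j = 0; j < tags.length; j++) {
      resultArray.push(tags[j]);
    }
  }
  return resultArray;
}
```

The other option is simply to write the list without spaces ("div,h1,h2,h3,...") and leave the function alone.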

> I think I noticed that I did that in reverse order (from last to first)???

It seems you were doing 0 to i'th, but that is not relevant. The array
of elements is not the same structure as the original HTML; it's been
destroyed by collecting all the elements with the same tag name together.
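
A small illustration of that point, using plain strings as stand-ins for elements (the list below is a simplified, assumed version of the children of div extract, not the real page):

```javascript
// Tag names of the children, in document order (simplified stand-in).
var documentOrder = ["h3", "p", "ol", "p", "p"];

// Mimics the collection strategy: one pass per requested tag name,
// appending every match to a single flat array.
function collectByTagNames(tags, names) {
  var result = [];
  for (var i = 0; i < names.length; i++) {
    for (var j = 0; j < tags.length; j++) {
      if (tags[j] === names[i]) result.push(tags[j]);
    }
  }
  return result;
}

var collected = collectByTagNames(documentOrder, ["h3", "p", "ol"]);
// collected is ["h3", "p", "p", "p", "ol"]: all the p's are grouped
// together, and the ol that sat between them now comes last.
```

Nesting is lost the same way: the flat array has no record of which element was inside which.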

> New version online:
> <http://thoraval.yvon.free.fr/JavaScript/trim_body.html>

Works fine (even in Safari) :-)

--
Rob

Oct 17 '06 #7

Une Bévue

RobG <rg***@iinet.net.au> wrote:

> It seems you were doing 0 to i'th, but that is not relevant. The array
> of elements is not the same structure as the original HTML; it's been
> destroyed by collecting all the elements with the same tag name together.

OK, I understand now what you mean.

>> New version online:
>> <http://thoraval.yvon.free.fr/JavaScript/trim_body.html>
>
> Works fine (even in Safari) :-)

Yes, I even tried it with the latest WebKit nightly build (it doesn't give
the same height for the divs...)

Now I have to write a Ruby script to put this line somewhere in the
head:

<script type="text/javascript" src="js/trim_body.js"></script>

Not a big deal.

Then test it on any given page...
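
For what it's worth, the injection step could look roughly like this. The sketch is in JavaScript rather than Ruby, to stay in the language of this thread; `addScriptToHead` is a hypothetical name, and it assumes the page has a literal closing head tag to insert before.

```javascript
// Hypothetical helper: inserts a script tag just before </head> in an
// HTML string. Returns the string unchanged if no </head> is found.
function addScriptToHead(html, src) {
  var tag = '<script type="text/javascript" src="' + src + '"></script>';
  var match = /<\/head\s*>/i.exec(html);
  if (!match) return html;
  return html.slice(0, match.index) + tag + '\n' + html.slice(match.index);
}

var page = '<html><head><title>t</title></head><body></body></html>';
var patched = addScriptToHead(page, 'js/trim_body.js');
```

A regular expression is enough here because only one well-known tag is being located; it is not a general HTML parser.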

Oct 17 '06 #8
