Obtaining the textNode from within multiple elements.

Daz
Hi everyone.

Is there a simple way for me to get the value of the textNodes from
this piece of HTML, without iterating through the whole thing?

<table>
<tbody>
<tr>
<td>
<i><b>example text</b></i>
</td>
<td>
example text
</td>
<td>
<font color="blue">example text</font>
</td>
</tr>
<tr>
<td>
<b>example text</b>
</td>
<td>
<font color="green"><u><b>example text</b></u></font>
</td>
<td>
<b>example text</b>
</td>
</tr>
</tbody>
</table>

Please note the format of the text is different in each cell, and that
the code I need to obtain the textNodes from is not mine, so I cannot
change that format. I am simply using JavaScript to make a browser
extension that will do useful things with the page.

Many thanks.

Daz.

Dec 10 '06 #1

Daz wrote:
> Hi everyone.
>
> Is there a simple way for me to get the value of the textNodes from
> this piece of HTML, without iterating through the whole thing?

You can use a number of strategies based on feature detection: first
try textContent; if that is not supported, try innerText. If that
isn't supported either, you can take innerHTML and strip out the
tags, or you can recursively iterate over all the nodes and grab just
the text.

There are some functions posted here:

<URL:
http://groups.google.com/group/comp....f5c61c0ce91bfe
>
Copies are included below.

[...]

> Please note the format of the text is different in each cell, and that
> the code I need to obtain the textNodes from is not mine, so I cannot
> change that format. I am simply using JavaScript to make a browser
> extension that will do useful things with the page.

It's probably better if you say what you want the script to do; simply
getting all the text may not be what you really need.

Posted functions:

Using fallback to innerHTML and a regular expression to remove tags:

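function getText(el)
{
  // Test the type, not the value: an empty string is a valid result
  if (typeof el.textContent == 'string') return el.textContent;
  if (typeof el.innerText == 'string') return el.innerText;
  // Last resort: strip tags from the markup (entities are not decoded)
  return el.innerHTML.replace(/<[^>]+>/g,'');
}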

A better regular expression might be:

.replace( /<[^<>]+>/g, '' )

Suggested by Mike Winter:
<URL:
http://groups.google.com.au/group/co...06dda8f672ef5f
>
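
As a quick illustration of the tag-stripping fallback, the sketch below applies the suggested expression to markup strings modelled on the sample table. stripTags is a hypothetical helper name, not one of the posted functions. Note that the expression only removes tags: character entities such as &amp; stay encoded in the result.

```javascript
// Hypothetical helper: remove anything that looks like a tag.
function stripTags(html) {
  return html.replace(/<[^<>]+>/g, '');
}

var nested = '<font color="green"><u><b>example text</b></u></font>';
var withEntity = '<b>fish &amp; chips</b>';

console.log(stripTags(nested));      // example text
console.log(stripTags(withEntity));  // fish &amp; chips
```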
To avoid issues with regular expressions, use recursion - it will be
slower but that may not matter:

function getText(el)
{
  // Test the type, not the value: an empty string is a valid result
  if (typeof el.textContent == 'string') return el.textContent;
  if (typeof el.innerText == 'string') return el.innerText;

  // If both fail, use recursion
  return getText2(el);

  // Recursive inner function (the declaration is hoisted, so it can
  // follow the return statement)
  function getText2(el) {
    var x = el.childNodes;
    var txt = '';
    for (var i=0, len=x.length; i<len; ++i){
      if (3 == x[i].nodeType) {          // text node
        txt += x[i].data;
      } else if (1 == x[i].nodeType){    // element node
        txt += getText2(x[i]);
      }
    }

    // Collapse whitespace before returning
    return txt.replace(/\s+/g,' ');
  }
}
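
To see what the recursive walk produces without a browser, here is a minimal sketch that runs the same traversal over plain objects standing in for DOM nodes. The el and text helpers and the collectText name are assumptions made for illustration; only nodeType, childNodes and data mirror the real DOM properties used above.

```javascript
// Hypothetical stand-ins for DOM nodes (illustration only).
function el(children) { return { nodeType: 1, childNodes: children }; }
function text(data)   { return { nodeType: 3, data: data }; }

// Same traversal as the recursive fallback: concatenate text nodes,
// descend into elements, ignore every other node type.
function collectText(node) {
  var txt = '';
  for (var i = 0; i < node.childNodes.length; ++i) {
    var child = node.childNodes[i];
    if (child.nodeType === 3) {
      txt += child.data;
    } else if (child.nodeType === 1) {
      txt += collectText(child);
    }
  }
  // Collapse runs of whitespace to a single space
  return txt.replace(/\s+/g, ' ');
}

// Models <td>, line break, <i><b>example text</b></i>, line break, </td>
var cell = el([
  text('\n'),
  el([ el([ text('example text') ]) ]),
  text('\n')
]);

console.log(JSON.stringify(collectText(cell))); // " example text "
```

The leading and trailing spaces come from the line breaks around the inline elements; trim the result if they are unwanted.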
--
Rob

Dec 11 '06 #2
Daz

RobG wrote:
[...]

All very good ideas. I tried innerText, which isn't supported by
Firefox, so I was considering recursion but hoped there might be a
better way. I imagine textContent is the key that will
help me out. As I am designing XPIs for Firefox, I don't need to worry
about other browsers not working with the code.
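
Since only Firefox matters here, a minimal sketch of reading each cell through textContent might look like the following. cellTexts is a hypothetical name, and the function assumes it is handed the table element (or any object exposing getElementsByTagName). A loop over the cells is still needed, but no recursion is.

```javascript
// Hypothetical helper: return the trimmed text of every <td> in a table.
function cellTexts(table) {
  var cells = table.getElementsByTagName('td');
  var out = [];
  for (var i = 0; i < cells.length; ++i) {
    // textContent already flattens nested <b>, <i>, <font> and so on;
    // the replace() trims the whitespace left by the line breaks.
    out.push(cells[i].textContent.replace(/^\s+|\s+$/g, ''));
  }
  return out;
}
```

For the sample table this would yield one string per cell; how the extension obtains the table element from the page is left out here because it depends on the extension.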

Many thanks again.

Daz.

Dec 11 '06 #3
