Obtaining the textNode from within multiple elements.

Daz

Hi everyone.

Is there a simple way for me to get the value of the textNodes from
this piece of HTML, without iterating through the whole thing?

<table>
  <tbody>
    <tr>
      <td>
        <i><b>example text</b></i>
      </td>
      <td>
        example text
      </td>
      <td>
        <font color="blue">example text</font>
      </td>
    </tr>
    <tr>
      <td>
        <b>example text</b>
      </td>
      <td>
        <font color="green"><u><b>example text</b></u></font>
      </td>
      <td>
        <b>example text</b>
      </td>
    </tr>
  </tbody>
</table>

Please note the format of the text is different in each cell, and that
the code I need to obtain the textNodes from is not mine, so I cannot
change that format. I am simply using JavaScript to make a browser
extension that will do useful things with the page.

Many thanks.

Daz.

Dec 10 '06 #1

RobG

Daz wrote:

> Hi everyone.
>
> Is there a simple way for me to get the value of the textNodes from
> this piece of HTML, without iterating through the whole thing?

You can use a number of strategies based on feature detection: first
try textContent; if that is not supported, try innerText. If that
isn't supported either, you have a choice of using innerHTML and
stripping out the tags, or recursively iterating over all the nodes
and grabbing just the text.

There are some functions posted here:

<URL: http://groups.google.com/group/comp....f5c61c0ce91bfe >

Copies are included below.

[...]

> Please note the format of the text is different in each cell, and that
> the code I need to obtain the textNodes from is not mine, so I cannot
> change that format. I am simply using JavaScript to make a browser
> extension that will do useful things with the page.

It's probably better if you say what you want the script to do; simply
getting all the text may not be what you really need.

Posted functions:

Using fallback to innerHTML and a regular expression to remove tags:

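function getText(el)
{
  // Test the property's type, not its value: an empty string is falsy
  // and would otherwise fall through to the next branch.
  if (typeof el.textContent == 'string') return el.textContent;
  if (typeof el.innerText == 'string') return el.innerText;

  // Last resort: strip anything that looks like a tag
  return el.innerHTML.replace(/<[^>]+>/g,'');
}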

A better regular expression might be:

.replace( /<[^<>]+>/g, '' )

Suggested by Mike Winter:
<URL: http://groups.google.com.au/group/co...06dda8f672ef5f >
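As a rough illustration of why the second expression may behave better, here is a small sketch comparing the two on a string that contains a stray "<". The helper names stripTagsSimple and stripTagsStrict and the sample string are made up for this example and do not come from the linked posts.

```javascript
// Hypothetical helpers wrapping the two expressions quoted above.
function stripTagsSimple(html) {
  return html.replace(/<[^>]+>/g, '');
}

function stripTagsStrict(html) {
  // Excluding "<" inside the match stops a stray "<" in the text
  // from swallowing everything up to the next real tag.
  return html.replace(/<[^<>]+>/g, '');
}

var sample = 'a < b <i>x</i>';
var simpleResult = stripTagsSimple(sample); // 'a x'      (the " b " is lost)
var strictResult = stripTagsStrict(sample); // 'a < b x'  (only the tags go)
```

Neither version decodes character entities such as &amp;, so text obtained this way can still differ from what textContent would return.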

To avoid issues with regular expressions, use recursion - it will be
slower but that may not matter:

function getText(el)
{
  // Test the property's type so that an empty string is still returned
  if (typeof el.textContent == 'string') return el.textContent;
  if (typeof el.innerText == 'string') return el.innerText;

  // If both fail, use recursion
  return getText2(el);

  // Recursive inner function (function declarations are hoisted,
  // so it can be called from the return statement above)
  function getText2(el) {
    var x = el.childNodes;
    var txt = '';
    for (var i=0, len=x.length; i<len; ++i){
      if (3 == x[i].nodeType) {         // text node
        txt += x[i].data;
      } else if (1 == x[i].nodeType){   // element node
        txt += getText2(x[i]);
      }
    }

    // Collapse whitespace before returning
    return txt.replace(/\s+/g,' ');
  }
}
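To see what the recursive branch does with deeply nested markup, the sketch below runs the same walk over a hand-built tree that stands in for the cell containing font, u and b elements. The names collectText, text and elem are hypothetical, and the plain objects only imitate the nodeType, data and childNodes properties of real DOM nodes.

```javascript
// Same idea as the recursive inner function above, written standalone.
function collectText(node) {
  var txt = '';
  var kids = node.childNodes;
  for (var i = 0; i < kids.length; i++) {
    if (kids[i].nodeType === 3) {          // text node
      txt += kids[i].data;
    } else if (kids[i].nodeType === 1) {   // element node
      txt += collectText(kids[i]);
    }
  }
  // Collapse runs of whitespace into single spaces
  return txt.replace(/\s+/g, ' ');
}

// Hand-built stand-ins for DOM nodes (assumption: not a real DOM)
function text(data) { return { nodeType: 3, data: data }; }
function elem(children) { return { nodeType: 1, childNodes: children }; }

// <td>\n<font><u><b>example text</b></u></font>\n</td>
var cell = elem([
  text('\n'),
  elem([ elem([ elem([ text('example text') ]) ]) ]),
  text('\n')
]);

var cellText = collectText(cell); // ' example text '
```

The line breaks around the nested elements survive as single leading and trailing spaces, so callers will probably want to trim the result.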
--
Rob

Dec 11 '06 #2

Daz

All very good ideas. I tried innerText, which isn't supported by
Firefox, so I was considering recursion but hoped there may have been a
better way. I would imagine that textContent is the key that just might
help me out. As I am designing XPIs for Firefox, I don't need to worry
about other browsers not working with the code.
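For a Firefox-only extension, a minimal sketch of reading each cell through textContent might look like the following. The function name getCellTexts is made up for this example, and it assumes the table element has already been located; rows and cells are the standard DOM collections on table and row elements.

```javascript
// Hypothetical helper: returns one array of cell strings per table row.
function getCellTexts(table) {
  var result = [];
  for (var r = 0; r < table.rows.length; r++) {
    var cells = table.rows[r].cells;
    var rowTexts = [];
    for (var c = 0; c < cells.length; c++) {
      // textContent ignores the <b>, <i>, <font> wrappers entirely;
      // collapse whitespace and trim the ends of what is left.
      var raw = cells[c].textContent;
      rowTexts.push(raw.replace(/\s+/g, ' ').replace(/^ | $/g, ''));
    }
    result.push(rowTexts);
  }
  return result;
}
```

In a page this could be called as getCellTexts(document.getElementsByTagName('table')[0]), which for the HTML in the first post should give two arrays of three 'example text' strings each.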

Many thanks again.

Daz.

Dec 11 '06 #3
