
Parsing an HTML table with XML

I have an HTML table in the following format:

<table>
<tr><td>Header 1</td><td>Header 2</td></tr>
<tr><td>1</td><td>2</td></tr>
<tr><td>3</td><td>4</td></tr>
<tr><td>5</td><td>6</td></tr>
</table>

With an XSLT stylesheet, I can use for-each to grab the values in
each row.

However, I don't want to grab the very first row, because it isn't
data!

How do I iterate through each <tr> and ignore the first <tr>?

Jul 5 '06 #1


Rick Walsh wrote:

How do I iterate through each <tr> and ignore the first <tr>?

<xsl:for-each select="table/tr[position() &gt; 1]">
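
For context, here is a minimal sketch of how that select might sit in a complete XSLT 1.0 stylesheet. It assumes that <table> is the document element of the input and that the input is well-formed XML; the text output and the comma-separated layout are illustrative choices, not something the original question asked for.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: table is the document element of the input. -->
  <xsl:template match="/">
    <!-- The predicate keeps every tr except the first one (the header row). -->
    <xsl:for-each select="table/tr[position() &gt; 1]">
      <!-- Write the cells of one row, separated by commas. -->
      <xsl:for-each select="td">
        <xsl:value-of select="."/>
        <xsl:if test="position() != last()">,</xsl:if>
      </xsl:for-each>
      <!-- End each row with a newline. -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Applied to the table from the question, this should emit the three data rows (1,2 then 3,4 then 5,6) and leave out the header row. Inside the inner for-each, position() and last() refer to the td cells of the current row, not to the rows.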

--

Martin Honnen
http://JavaScript.FAQTs.com/
Jul 5 '06 #2
Rick Walsh wrote:
I have an HTML table in the following format:

<table>
<tr><td>Header 1</td><td>Header 2</td></tr>
<tr><td>1</td><td>2</td></tr>
<tr><td>3</td><td>4</td></tr>
<tr><td>5</td><td>6</td></tr>
</table>

With an XSLT stylesheet, I can use for-each to grab the values in
each row.

However, I don't want to grab the very first row, because it isn't
data!

Another possibility would be to change the input by using the (X)HTML
thead and tbody elements, then selecting only tbody/tr.
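
As a rough sketch of that idea, the stylesheet below assumes the input has been restructured so that the header row sits in thead and the data rows sit in tbody. That input shape is an assumption for illustration; it is not the markup posted in the question.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumed input shape (not the original markup):
       table / thead / tr   holds the header row
       table / tbody / tr   holds the data rows -->
  <xsl:template match="/">
    <!-- Only rows inside tbody are visited; the thead row is never selected. -->
    <xsl:for-each select="table/tbody/tr">
      <xsl:value-of select="td[1]"/>
      <xsl:text>,</xsl:text>
      <xsl:value-of select="td[2]"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

With this layout no position test is needed, and the selection keeps working if the header later grows to more than one row.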
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Jul 5 '06 #3

Rick Walsh wrote:
I have an HTML table in the following format:

<table>
<tr><td>Header 1</td><td>Header 2</td></tr>
<tr><td>1</td><td>2</td></tr>
However, I don't want to grab the very first row, because it isn't
data!

Then code it with <th>, not <td>.

If this table isn't under your control, then be careful about parsing it
with an XML parser: HTML isn't XML (and XHTML on the web usually isn't
either). That's not a good assumption to make if you're trying to build
robust code; something as simple as an embedded &nbsp; might break it.
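
If the header cells are marked up with <th>, the data rows can be picked out by what they contain, not by where they sit. A minimal sketch, assuming the header row holds only <th> cells and the input is well-formed XML:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Assumption: header cells are th elements, data cells are td elements. -->
  <xsl:template match="/">
    <!-- tr[td] keeps rows that have at least one td child,
         so a row made only of th cells is skipped. -->
    <xsl:for-each select="table/tr[td]">
      <xsl:for-each select="td">
        <xsl:value-of select="."/>
        <xsl:if test="position() != last()">,</xsl:if>
      </xsl:for-each>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

If a data row could mix th and td cells (for example a row label in a th), the stricter test tr[not(th)] may be closer to what is wanted.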

Jul 5 '06 #4
Andy Dingley <di*****@codesmiths.com> wrote:
Rick Walsh wrote:
>>I have an HTML table in the following format:

<table>
<tr><td>Header 1</td><td>Header 2</td></tr>
<tr><td>1</td><td>2</td></tr>

>>However, I don't want to grab the very first row, because it isn't
data!

Then code it with <th>, not <td>.

If this table isn't under your control, then be careful about parsing it
with an XML parser: HTML isn't XML (and XHTML on the web usually isn't
either). That's not a good assumption to make if you're trying to build
robust code; something as simple as an embedded &nbsp; might break it.

For this purpose, use an HTML parser. I personally use NekoHTML, which I
have included in the RefleX toolkit. With RefleX, parsing an HTML file
is as simple as parsing an XML file:
http://reflex.gforge.inria.fr/tips.html#N80178E
(section: HTML to XML)

Example:
<!--parse a non-well-balanced HTML file to XML-->
<xcl:parse-html name="htmlFile" source="file:///path/to/file.html"/>
<!--apply a stylesheet to it-->
<xcl:transform output="file:///path/to/new-file.html"
    source="{ $htmlFile }"
    stylesheet="file:///path/to/stylesheet.xsl"/>

Of course, you could use XPath to select the element to transform, say the
<body> element of the parsed HTML; something like this:
<xcl:transform output="file:///path/to/new-file.html"
    source="{ $htmlFile/html/body }"
    stylesheet="file:///path/to/stylesheet.xsl"/>

--
Cordialement,

///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !
Jul 5 '06 #5
