Bytes IT Community

Parsing an HTML table with XML

I have an HTML table in the following format:

<table>
<tr><td>Header 1</td><td>Header 2</td></tr>
<tr><td>1</td><td>2</td></tr>
<tr><td>3</td><td>4</td></tr>
<tr><td>5</td><td>6</td></tr>
</table>

With an XSLT stylesheet, I can use for-each to grab the values in
each row.

However, I don't want to grab the very first row, because it isn't
data!

How do I iterate through each <tr> and ignore the first <tr>?

Jul 5 '06 #1
4 Replies




Rick Walsh wrote:

> How do I iterate through each <tr> and ignore the first <tr>?

<xsl:for-each select="table/tr[position() &gt; 1]">

--

Martin Honnen
http://JavaScript.FAQTs.com/
Jul 5 '06 #2
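
For context, here is a minimal sketch of how that select expression might sit inside a complete stylesheet. It assumes the <table> is the root element of the input, as in the question, and that comma-separated text output is wanted; both are illustrative assumptions, not something stated in the thread.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Start at the document root so the relative path "table/tr"
       resolves against a top-level <table> element. -->
  <xsl:template match="/">
    <!-- position() &gt; 1 keeps every <tr> except the first one -->
    <xsl:for-each select="table/tr[position() &gt; 1]">
      <!-- Write the cells of each data row, separated by commas -->
      <xsl:for-each select="td">
        <xsl:value-of select="."/>
        <xsl:if test="position() != last()">,</xsl:if>
      </xsl:for-each>
      <!-- End each row with a newline -->
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

If the table is nested deeper in the document, or the parser that produced the tree inserted a <tbody> element, the path in the select attribute would need to be adjusted accordingly.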

Rick Walsh wrote:
> I have an HTML table in the following format:
>
> <table>
> <tr><td>Header 1</td><td>Header 2</td></tr>
> <tr><td>1</td><td>2</td></tr>
> <tr><td>3</td><td>4</td></tr>
> <tr><td>5</td><td>6</td></tr>
> </table>
>
> With an XSLT stylesheet, I can use for-each to grab the values in
> each row.
>
> However, I don't want to grab the very first row, because it isn't
> data!

Another possibility would be to change the input by using the (X)HTML
thead and tbody elements, then selecting only tbody/tr.
--
Johannes Koch
In te domine speravi; non confundar in aeternum.
(Te Deum, 4th cent.)
Jul 5 '06 #3
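
As a rough illustration of the thead/tbody suggestion, the input could be restructured along these lines, assuming the markup can be changed at its source:

```xml
<table>
  <thead>
    <tr><td>Header 1</td><td>Header 2</td></tr>
  </thead>
  <tbody>
    <tr><td>1</td><td>2</td></tr>
    <tr><td>3</td><td>4</td></tr>
    <tr><td>5</td><td>6</td></tr>
  </tbody>
</table>
```

The loop would then address only the body rows, so no position test is needed:

```xml
<!-- Selects only the rows inside <tbody>; the header row in <thead> is never visited -->
<xsl:for-each select="table/tbody/tr">
  <!-- process the data row here -->
</xsl:for-each>
```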


Rick Walsh wrote:
> I have an HTML table in the following format:
>
> <table>
> <tr><td>Header 1</td><td>Header 2</td></tr>
> <tr><td>1</td><td>2</td></tr>
> However, I don't want to grab the very first row, because it isn't
> data!

Then code it with <th>, not <td>.

If this table isn't under your control, then be careful about parsing it
with an XML parser: HTML isn't XML (XHTML on the web usually isn't
either). It's not a good assumption to make if you're trying to build
robust code. Something as simple as an embedded &nbsp; might break it.

Jul 5 '06 #4
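
If the header cells are marked up with <th>, the header row can be excluded by its content instead of its position. A small sketch, assuming the header row reads <tr><th>Header 1</th><th>Header 2</th></tr> and the <table> is the root element:

```xml
<!-- tr[td] matches only rows that contain at least one <td> child,
     so a header row made up entirely of <th> cells is skipped -->
<xsl:for-each select="table/tr[td]">
  <!-- process the data row here -->
</xsl:for-each>
```

On the &nbsp; point: XML predefines only five entities (&lt;, &gt;, &amp;, &apos; and &quot;), so &nbsp; is an error unless a DTD declares it. The numeric character reference &#160; is the well-formed way to write a non-breaking space.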

Andy Dingley <di*****@codesmiths.com> wrote:
> Rick Walsh wrote:
>> I have an HTML table in the following format:
>>
>> <table>
>> <tr><td>Header 1</td><td>Header 2</td></tr>
>> <tr><td>1</td><td>2</td></tr>
>>
>> However, I don't want to grab the very first row, because it isn't
>> data!
>
> Then code it with <th>, not <td>.
>
> If this table isn't under your control, then be careful about parsing it
> with an XML parser: HTML isn't XML (XHTML on the web usually isn't
> either). It's not a good assumption to make if you're trying to build
> robust code. Something as simple as an embedded &nbsp; might break it.

For this purpose, use an HTML parser. I personally use NekoHTML, which I
have included in the RefleX toolkit. With RefleX, parsing an HTML file
is as simple as parsing an XML file:
http://reflex.gforge.inria.fr/tips.html#N80178E
(section: HTML to XML)

Example:
<!--parse a non-well-balanced HTML file to XML-->
<xcl:parse-html name="htmlFile" source="file:///path/to/file.html"/>
<!--apply a stylesheet to it-->
<xcl:transform output="file:///path/to/new-file.html"
source="{ $htmlFile }"
stylesheet="file:///path/to/stylesheet.xsl">

Of course, you could use XPath to select the element to transform, say the
<body> tag of the parsed HTML. Something like this:
<xcl:transform output="file:///path/to/new-file.html"
source="{ $htmlFile/html/body }"
stylesheet="file:///path/to/stylesheet.xsl">

--
Cordialement,

///
(. .)
--------ooO--(_)--Ooo--------
| Philippe Poulard |
-----------------------------
http://reflex.gforge.inria.fr/
Have the RefleX !
Jul 5 '06 #5
