
HTML Via XSLT to Plain Text output

P: 2
Hi all,
I'm trying to transform an HTML document into plain text via XSLT.
Simple, you say! (I hope.)
I have got it working by using the magnificent <xsl:value-of select="."/>.
This returns the text of the whole document, and <xsl:output method="text"/> ensures that the output I get is plain text.
Problem:
The HTML I am transforming has a table with headings and data. Whilst the output contains all the data from the table, it does not preserve any formatting, and it concatenates all the data within the table.
Can you suggest how I could extract the data from the table and present it in plain text? The only formatting I require is that the spacing between the columns is somewhat preserved.
I am, as I'm sure you can tell, still an XSLT noob, even with many years of application development under my belt!
Thanks for all your help!

XSLT:

<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="text"/>
<xsl:template match="/">
<xsl:value-of select="."/>
<xsl:value-of select="normalize-space(.)"/>
</xsl:template>
</xsl:stylesheet>

HTML:

<HTML xmlns="http://www.w3.org/1999/xhtml">
<HEAD></HEAD>
<BODY>
<p>Dear person,</p>
<p>The following are columns are required to preserve the formatting.</p>
<table cellpadding="0" cellspacing="0" width="50%">
<tr>
<td width="20%">Column 1</td>
<td width="25%">Column 2</td>
<td width="20%">Column 3</td>
<td width="20%">Column 4</td>
</tr>
<tr>
<td><font size="4">01/06/2008</font></td>
<td><font size="4">34.2</font></td>
<td><font size="4">A Name</font></td>
<td><font size="4">42.00</font></td>
</tr>
</table>
</BODY>
</HTML>


Result:
Something like...

Dear person, The following are columns are required to preserve the formatting. Column 1Column 2Column 3Column 401/06/200834.2A Name42.00
Any suggestions would be welcome :)
Oct 3 '07 #1
3 Replies


Gaiason
P: 5
Hi, I'm also a noob, but I managed to find this code for inserting a space.

You can try adding this within the template:
<xsl:value-of select="'&#x20;'" />
or
<xsl:text> </xsl:text>

Hope it helps.

Cheers,
Gaiason

Oct 3 '07 #2

jkmyoung
Expert 100+
P: 2,057
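Instead of using <xsl:value-of select="."/>, I suggest using <xsl:apply-templates/>.
Then add two templates like so. Because the sample HTML puts its elements in the XHTML namespace, the match patterns need a prefix bound to that namespace:
<!-- Assumes xmlns:h="http://www.w3.org/1999/xhtml" is declared on xsl:stylesheet -->
<xsl:template match="h:tr">
  <xsl:apply-templates/>
  <xsl:text>&#10;</xsl:text><!-- add a newline after each row -->
</xsl:template>

<xsl:template match="h:td">
  <xsl:text> </xsl:text><!-- space before the cell text -->
  <xsl:apply-templates/>
  <xsl:text> </xsl:text><!-- space after the cell text -->
</xsl:template>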
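For reference, here is a rough sketch of how the whole stylesheet might look if the goal is roughly aligned columns rather than single spaces. It is only one possible approach, not necessarily what the original poster ended up using. The `h` prefix, the `pad` variable, and the 16-character column width are assumptions made for illustration; the sketch also assumes the table rows are direct children of `table`, as in the sample, and that no cell is much longer than the chosen width.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:h="http://www.w3.org/1999/xhtml">

  <xsl:output method="text"/>

  <!-- Drop whitespace-only text nodes (the line breaks between tags) -->
  <xsl:strip-space elements="*"/>

  <!-- Assumed column width: a run of 16 spaces used as padding material -->
  <xsl:variable name="pad" select="'                '"/>

  <!-- Each paragraph becomes one line of normalized text -->
  <xsl:template match="h:p">
    <xsl:value-of select="normalize-space(.)"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Each table row becomes one line of padded cells -->
  <xsl:template match="h:tr">
    <xsl:apply-templates select="h:td | h:th"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>

  <!-- Pad each cell to the assumed width, then add one separator space -->
  <xsl:template match="h:td | h:th">
    <xsl:variable name="cell" select="normalize-space(.)"/>
    <xsl:value-of select="$cell"/>
    <!-- substring() yields the spaces still needed; empty if the cell is too long -->
    <xsl:value-of select="substring($pad, string-length($cell) + 1)"/>
    <xsl:text> </xsl:text>
  </xsl:template>

</xsl:stylesheet>
```

The built-in template rules walk down through HTML and BODY, so only `p`, `tr`, and the cells need explicit templates. Cells longer than the pad width are not truncated; they simply push the following columns to the right, so the width is worth tuning to the real data.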
Oct 3 '07 #3

P: 2
Thanks for your replies; I will give it a go.
We might just use a regex to strip out all the HTML tags, which seems to work OK for our needs.
Oct 4 '07 #4
