XSLT Extract Text from Nodes

Hello,

I am new to the concept of XSL and am looking for some assistance.

Take the following XML document:

<binder>
<author>Greg</author>
<notes>
<time>11:45</time>
<content>
This would be some content... every once in a while you may run
into
<heading>A Heading!</heading>
Which could be followed by more content... and possible
<heading>More Headings.</heading>
and even more content!
</content>
</notes>
</binder>

What I would like to do is extract the value of the <content> node,
and have special formatting for the headings.

When I do something like:

<xsl:value-of select="content" />

I receive the data within <content>, including the values of the
nested <heading> nodes. What I really want is to have XSLT read the
text of the <content> node until a <heading> node is reached, at which
point the value of the heading node is formatted correctly and
displayed, and then continue with the text of the <content> node after
the <heading> until another <heading> is reached, and so on.

Could someone give me some pointers as to how this can be accomplished?

Oct 10 '06 #1


gr************@gmail.com wrote:

> <content>
> This would be some content... every once in a while you may run
> into
> <heading>A Heading!</heading>
> Which could be followed by more content... and possible
> <heading>More Headings.</heading>
> and even more content!
> </content>
>
> What I would like to do is to be able to extract the value of the
> <content> node, and have special formatting for the headings.

Use templates and xsl:apply-templates, e.g.

<xsl:template match="content">
<div>
<xsl:apply-templates/>
</div>
</xsl:template>

<xsl:template match="heading">
<h1>
<xsl:apply-templates/>
</h1>
</xsl:template>

There is a built-in template for text nodes
<http://www.w3.org/TR/xslt#built-in-rule>,
so you don't have to do anything for them; they end up in the result
tree anyway with the above approach.
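
For reference, the two templates above could sit in a complete stylesheet along the following lines. This is only a minimal sketch: the xsl:output setting, the root template, and the html/body wrapper are assumptions added for illustration and are not part of the answer itself.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumption: HTML output is wanted -->
  <xsl:output method="html"/>

  <!-- Assumed entry point: process only the content of the notes -->
  <xsl:template match="/">
    <html>
      <body>
        <xsl:apply-templates select="binder/notes/content"/>
      </body>
    </html>
  </xsl:template>

  <!-- Wrap the content in a div and process its children in document order -->
  <xsl:template match="content">
    <div>
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <!-- Each heading becomes an h1; its text is copied by the built-in rule -->
  <xsl:template match="heading">
    <h1>
      <xsl:apply-templates/>
    </h1>
  </xsl:template>

</xsl:stylesheet>
```

Selecting binder/notes/content in the root template keeps the author and time text out of the output; without that select, the built-in rules would copy that text through as well.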
--

Martin Honnen
http://JavaScript.FAQTs.com/
Oct 10 '06 #2
Thanks for your quick reply Martin,

This has brought me closer to what I would like to accomplish, however
I now have the following issue.

I was using the xsl:value-of element with disable-output-escaping="yes"
to produce HTML-formatted text in the browser. Within the <content>
node there may be HTML that should be displayed as such. Your method
outputs all of the text in the correct order, formatted according to
tag name, but it displays the HTML tags, which should be hidden.

i.e.

<content>
There may be some <i>italicized</i> text...
<heading>Maybe even <u>formatting in a heading</u></heading>
...
</content>

Is there some way to overcome this?

Oct 10 '06 #3
I should say that the HTML tags within my XML document are stored as
entities (at least the < character is) i.e.

<content>
This is some &lt;i>italicized&lt;/i> text...
...
</content>

Thanks.
Oct 10 '06 #4
I have found a solution. The following is the built-in template for
text nodes:

<xsl:template match="text()|@*">
<xsl:value-of select="."/>
</xsl:template>

It can be overridden simply by creating a new custom template, which I
did as follows:

<xsl:template match="text()|@*">
<xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>

The result is that the HTML in the text nodes outputs as desired.
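
A slightly narrower variant, offered only as a sketch: the override above also matches attribute nodes and every text node in the document. If only the escaped markup inside <content> needs to pass through unescaped, the match pattern can be limited to those text nodes. The pattern content//text() is an assumption based on the sample document, not part of the solution as posted.

```xml
<!-- Assumption: only text below <content> carries escaped HTML -->
<xsl:template match="content//text()">
  <xsl:value-of select="." disable-output-escaping="yes"/>
</xsl:template>
```

With a serializer that honors disable-output-escaping, a source text node such as "This is some &lt;i>italicized&lt;/i> text..." would be written out as "This is some <i>italicized</i> text...", while text elsewhere in the document stays escaped.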

Oct 10 '06 #5

Please don't top-post.

gr************@gmail.com wrote:
> Martin Honnen wrote:
>> gr************@gmail.com wrote:
>>> <content>
>>> This would be some content... every once in a
>>> while you may run into
>>> <heading>A Heading!</heading>
>>> Which could be followed by more content... and
>>> possible
>>> <heading>More Headings.</heading>
>>> and even more content!
>>> </content>
>>
>> Use templates and xsl:apply-templates e.g.
>>
>> <xsl:template match="content">
>> <div>
>> <xsl:apply-templates/>
>> </div>
>> </xsl:template>
>>
>> <xsl:template match="heading">
>> <h1>
>> <xsl:apply-templates/>
>> </h1>
>> </xsl:template>
>
> This has brought me closer to what I would like to
> accomplish, however I now have the following issue.
>
> I was using the xsl:value-of element with
> disable-output-escaping="yes" to produce HTML formatted
> text in the browser screen. You see within the <content>
> node there may be HTML that should be displayed as such.
> Your method produces all of the text in the correct order
> and formatted according to tag name, but produces HTML
> tags which should be hidden.
>
> ie.
>
> <content>
> There may be some <i>italicized</i> text...
> <heading>Maybe even <u>formatting in a
> heading</u></heading>
> ...
> </content>
>
> Is there some way to overcome this?
>
> I should say that the HTML tags within my XML document
> are stored as entities (at least the < character is) i.e.
>
> <content>
> This is some &lt;i>italicized&lt;/i> text...
> ...
> </content>

Don't do that, it seems to lead to innumerable problems.
Store your mark-up as XML instead:

<content>
This is some <i>italicized</i> text...
...
</content>

...and use the identity transformation to convert it into
HTML:

<xsl:template match="@*|node()">
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:template>

This also has the virtue of fitting neatly with the
solution for your original problem that Martin Honnen has
proposed.

You might also need to write exclusion templates for some
nodes, but that's hardly a problem.
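
Put together, the identity transformation and the earlier content/heading templates might look like the sketch below. It assumes the inline markup is stored as well-formed XML (for example <i> and <u> elements), and the empty template for author and time is just one hypothetical example of an exclusion template.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="html"/>

  <!-- Identity transformation: copy every node unchanged by default -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- More specific templates take precedence over the identity rule -->
  <xsl:template match="content">
    <div>
      <xsl:apply-templates/>
    </div>
  </xsl:template>

  <xsl:template match="heading">
    <h1>
      <xsl:apply-templates/>
    </h1>
  </xsl:template>

  <!-- Hypothetical exclusion template: drop these elements entirely -->
  <xsl:template match="author|time"/>

</xsl:stylesheet>
```

As written, the binder and notes wrapper elements are still copied to the output by the identity rule; they would need their own templates if they should not appear in the HTML.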

--
roy axenov

Oct 10 '06 #6
Not sure what a top-post is...

While I see what you're saying, Roy, the problem is that the contained
HTML is not necessarily well formed because of the way it is generated
at this time. Perhaps once I have figured out how to force it to be
well formed I can use this solution.

Thanks.

Oct 10 '06 #7
gr************@gmail.com wrote:
> roy axenov wrote:
>> Please don't top-post.
>
> Not sure what a top-post is...

Then ask a search engine. It will lead you to documents like
<http://www.catb.org/~esr/jargon/html/T/top-post.html>.

--
Johannes Koch
Spem in alium nunquam habui praeter in te, Deus Israel.
(Thomas Tallis, 40-part motet)
Oct 10 '06 #8


gr************@gmail.com wrote:

> It can be overridden simply by creating a new custom template, which I
> did as the following:
>
> <xsl:template match="text()|@*">
> <xsl:value-of select="." disable-output-escaping="yes"/>
> </xsl:template>
>
> The result is that the HTML in the text nodes outputs as desired.

If that works for you then you can use it. But you should be aware that
disable-output-escaping is an optional feature that takes effect only
during serialization of the result tree. An XSLT processor might not
support it at all, and it has no effect when the result tree is not
serialized (e.g. when you chain transformations, or in a browser like
Mozilla, where the result tree is rendered directly without any
serialization happening).

--

Martin Honnen
http://JavaScript.FAQTs.com/
Oct 11 '06 #9
I think this will suffice for my needs as I am doing the
transformations on the server.

Thanks again.

Oct 11 '06 #10
