XSL problem

Hi,

I'm stuck with an XSL problem - can anyone give me any hints?

I have some XML with nested formatting tags like this:

<text>
this is plain
<bold>
this is bold
<italic>
this is bold-italic
</italic>
</bold>
this is plain
</text>

which I need to 'flatten out' into something like this:

<text>this is plain</text>
<text bold="true">this is bold</text>
<text bold="true" italic="true">this is bold-italic</text>
<text>this is plain</text>

It doesn't have to work with any arbitrary tags - there are only a few
possible ones - but I'm not sure how to "remember" the outer level
formatting nodes when processing the text inside. It seems to be
crying out for some kind of state variable.

Andy
Jul 20 '05 #1

Andy Fish wrote:

> I'm stuck with an XSL problem - can anyone give me any hints?
>
> I have some XML with nested formatting tags like this:
>
> <text>
> this is plain
> <bold>
> this is bold
> <italic>
> this is bold-italic
> </italic>
> </bold>
> this is plain
> </text>
>
> which I need to 'flatten out' into something like this:
>
> <text>this is plain</text>
> <text bold="true">this is bold</text>
> <text bold="true" italic="true">this is bold-italic</text>
> <text>this is plain</text>
>
> It doesn't have to work with any arbitrary tags - there are only a few
> possible ones - but I'm not sure how to "remember" the outer level
> formatting nodes when processing the text inside. It seems to be
> crying out for some kind of state variable


Modes can help to give some kind of state in which a node is to be
processed, here is my attempt at using them to solve the problem:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" encoding="UTF-8" indent="yes" />

<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>

<xsl:template match="text">
<xsl:apply-templates select="node()" mode="flatten" />
</xsl:template>

<xsl:template match="text()" mode="flatten">
<text><xsl:value-of select="." /></text>
</xsl:template>

<xsl:template match="bold" mode="flatten">
<xsl:apply-templates select="node()" mode="flattenBold" />
</xsl:template>

<xsl:template match="text()" mode="flattenBold">
<text bold="true"><xsl:value-of select="." /></text>
</xsl:template>

<xsl:template match="italic" mode="flattenBold">
<xsl:apply-templates select="node()" mode="flattenBoldItalic" />
</xsl:template>

<xsl:template match="text()" mode="flattenBoldItalic">
<text bold="true" italic="true"><xsl:value-of select="." /></text>
</xsl:template>

</xsl:stylesheet>

The result is not quite what you want, but apart from a whitespace-only
text node showing up it has the right structure (note that I wrapped your
source above in a <doc> element, as otherwise the flattened result
wouldn't have a root element):

<doc>
<text>
this is plain
</text>
<text bold="true">
this is bold
</text>
<text italic="true" bold="true">
this is bold-italic
</text>
<text bold="true">
</text>
<text>
this is plain
</text>
</doc>

Now, to solve the whitespace text node issue, I think the following should
help:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="xml" encoding="UTF-8" indent="yes" />

<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()" />
</xsl:copy>
</xsl:template>

<xsl:template match="text">
<xsl:apply-templates select="node()" mode="flatten" />
</xsl:template>

<xsl:template match="text()" mode="flatten">
<xsl:variable name="normalizedText" select="normalize-space(.)" />
<xsl:if test="$normalizedText">
<text><xsl:value-of select="." /></text>
</xsl:if>
</xsl:template>

<xsl:template match="bold" mode="flatten">
<xsl:apply-templates select="node()" mode="flattenBold" />
</xsl:template>

<xsl:template match="text()" mode="flattenBold">
<xsl:variable name="normalizedText" select="normalize-space(.)" />
<xsl:if test="$normalizedText">
<text bold="true"><xsl:value-of select="." /></text>
</xsl:if>
</xsl:template>

<xsl:template match="italic" mode="flattenBold">
<xsl:apply-templates select="node()" mode="flattenBoldItalic" />
</xsl:template>

<xsl:template match="text()" mode="flattenBoldItalic">
<xsl:variable name="normalizedText" select="normalize-space(.)" />
<xsl:if test="$normalizedText">
<text bold="true" italic="true"><xsl:value-of select="." /></text>
</xsl:if>
</xsl:template>

</xsl:stylesheet>
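
A possible alternative to testing normalize-space() in every text() template is to let the processor drop whitespace-only text nodes from the source tree with xsl:strip-space. The following is only a sketch based on the first stylesheet above, not something taken from the original post; it assumes the source is wrapped in a <doc> element as described, and it removes only whitespace-only nodes, so the remaining text still keeps its leading and trailing line breaks.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" indent="yes" />

  <!-- Assumption: whitespace-only text nodes inside these three
       elements carry no meaning, so they are removed from the
       source tree before any template is applied -->
  <xsl:strip-space elements="text bold italic" />

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text">
    <xsl:apply-templates select="node()" mode="flatten" />
  </xsl:template>

  <xsl:template match="text()" mode="flatten">
    <text><xsl:value-of select="." /></text>
  </xsl:template>

  <xsl:template match="bold" mode="flatten">
    <xsl:apply-templates select="node()" mode="flattenBold" />
  </xsl:template>

  <xsl:template match="text()" mode="flattenBold">
    <text bold="true"><xsl:value-of select="." /></text>
  </xsl:template>

  <xsl:template match="italic" mode="flattenBold">
    <xsl:apply-templates select="node()" mode="flattenBoldItalic" />
  </xsl:template>

  <xsl:template match="text()" mode="flattenBoldItalic">
    <text bold="true" italic="true"><xsl:value-of select="." /></text>
  </xsl:template>

</xsl:stylesheet>
```

The trade-off is that the stripping is declared once instead of once per mode, at the cost of applying to every matching element in the source rather than only where a template asks for it.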
--

Martin Honnen
http://JavaScript.FAQTs.com/

Jul 20 '05 #2
Hi Andy,

aj****@blueyonder.co.uk (Andy Fish) writes:
> I'm stuck with an XSL problem - can anyone give me any hints?
>
> I have some XML with nested formatting tags like this:
>
> <text>
> this is plain
> <bold>
> this is bold
> <italic>
> this is bold-italic
> </italic>
> </bold>
> this is plain
> </text>
>
> which I need to 'flatten out' into something like this:
>
> <text>this is plain</text>
> <text bold="true">this is bold</text>
> <text bold="true" italic="true">this is bold-italic</text>
> <text>this is plain</text>
>
> It doesn't have to work with any arbitrary tags - there are only a few
> possible ones - but I'm not sure how to "remember" the outer level
> formatting nodes when processing the text inside. It seems to be
> crying out for some kind of state variable

Here's a brute-force version that uses XPath to look at the ancestor
axis. Perhaps not as elegant as Martin's solution, but shorter and
easier to extend (Martin's script will grow as the factorial of the
number of options, I think, whereas this is only linear).

This transformation:

<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


<xsl:output indent="yes"/>

<xsl:template match="/text">
<doc>
<xsl:apply-templates select="text()|node()"/>
</doc>
</xsl:template>

<xsl:template match="text()">
<xsl:if test="normalize-space(.)">
<text>
<xsl:if test="ancestor::bold">
<xsl:attribute name="bold">true</xsl:attribute>
</xsl:if>
<xsl:if test="ancestor::italic">
<xsl:attribute name="italic">true</xsl:attribute>
</xsl:if>
<xsl:value-of select="normalize-space(.)"/>
</text>
</xsl:if>
</xsl:template>

</xsl:stylesheet>
with this input
<text>
this is plain
<bold>
this is bold
<italic>
this is bold-italic
</italic>
</bold>
this is plain
</text>
gives this output
<?xml version="1.0"?>
<doc>
<text>this is plain</text>
<text bold="true">this is bold</text>
<text bold="true" italic="true">this is bold-italic</text>
<text>this is plain</text>
</doc>
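
To illustrate the "only linear" point: supporting one more formatting tag with this approach should take just one more xsl:if. The sketch below assumes a hypothetical <underline> element and a matching underline attribute, neither of which appears in the original question.

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="/text">
    <doc>
      <xsl:apply-templates select="node()"/>
    </doc>
  </xsl:template>

  <!-- Formatting elements have no template of their own: the built-in
       rule for elements simply recurses into their children -->
  <xsl:template match="text()">
    <xsl:if test="normalize-space(.)">
      <text>
        <!-- One test per formatting element, looked up on the ancestor axis -->
        <xsl:if test="ancestor::bold">
          <xsl:attribute name="bold">true</xsl:attribute>
        </xsl:if>
        <xsl:if test="ancestor::italic">
          <xsl:attribute name="italic">true</xsl:attribute>
        </xsl:if>
        <!-- Hypothetical third element, added for illustration only -->
        <xsl:if test="ancestor::underline">
          <xsl:attribute name="underline">true</xsl:attribute>
        </xsl:if>
        <xsl:value-of select="normalize-space(.)"/>
      </text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Because the state is read from the ancestor axis rather than carried along, the nesting order of the formatting elements does not matter here.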

Ben

--
Ben Edgington
Mail to the address above is discarded.
Mail to ben at that address might be read.
http://www.edginet.org/
Jul 20 '05 #3
Ben Edgington <us****@edginet.org> writes:
> (Martin's script will grow as the factorial of the
> number of options, I think, whereas this is only linear).

Sorry, not factorial, but exponential: there will be 2^n-1
cases for n elements.
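
To make the count concrete for n = 2 (bold and italic, nested in either order): besides the plain mode, a mode-based stylesheet needs one mode per non-empty combination, i.e. 2^2-1 = 3 of them. The following is only a sketch of how Martin's second stylesheet might be extended to cover italic outside bold; the mode name flattenItalic is made up here and does not come from his post.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" encoding="UTF-8" indent="yes" />

  <!-- Identity template: copy everything not handled below -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()" />
    </xsl:copy>
  </xsl:template>

  <xsl:template match="text">
    <xsl:apply-templates select="node()" mode="flatten" />
  </xsl:template>

  <!-- Plain mode -->
  <xsl:template match="text()" mode="flatten">
    <xsl:if test="normalize-space(.)">
      <text><xsl:value-of select="." /></text>
    </xsl:if>
  </xsl:template>
  <xsl:template match="bold" mode="flatten">
    <xsl:apply-templates select="node()" mode="flattenBold" />
  </xsl:template>
  <xsl:template match="italic" mode="flatten">
    <xsl:apply-templates select="node()" mode="flattenItalic" />
  </xsl:template>

  <!-- Mode 1 of 3: bold only -->
  <xsl:template match="text()" mode="flattenBold">
    <xsl:if test="normalize-space(.)">
      <text bold="true"><xsl:value-of select="." /></text>
    </xsl:if>
  </xsl:template>
  <xsl:template match="italic" mode="flattenBold">
    <xsl:apply-templates select="node()" mode="flattenBoldItalic" />
  </xsl:template>

  <!-- Mode 2 of 3: italic only (hypothetical mode name) -->
  <xsl:template match="text()" mode="flattenItalic">
    <xsl:if test="normalize-space(.)">
      <text italic="true"><xsl:value-of select="." /></text>
    </xsl:if>
  </xsl:template>
  <xsl:template match="bold" mode="flattenItalic">
    <xsl:apply-templates select="node()" mode="flattenBoldItalic" />
  </xsl:template>

  <!-- Mode 3 of 3: bold and italic, reached from either nesting order -->
  <xsl:template match="text()" mode="flattenBoldItalic">
    <xsl:if test="normalize-space(.)">
      <text bold="true" italic="true"><xsl:value-of select="." /></text>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Each further formatting element would double the number of modes again, which is where the exponential growth comes from.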

--
Ben Edgington
Mail to the address above is discarded.
Mail to ben at that address might be read.
http://www.edginet.org/
Jul 20 '05 #4
Thanks to both you and Martin for these two solutions.

Although there are "only a few" formatting elements, I certainly don't fancy
having to enumerate every possible combination of them.

Fortunately, performance will not be an issue, so I can use your idea.

Andy

Jul 20 '05 #5
"Andy Fish" <aj****@blueyonder.co.uk> writes:
> although there are "only a few" formatting elements, I certainly don't fancy
> having to enumerate every possible combination of them.
>
> Fortunately performance will not be an issue so I can use your idea.

You don't need to choose!

Whilst watching television with my two-year-old this morning I
realised that, of course, recursion is the proper way to maintain
state-information in XSLT. (The challenge presented by the Fimbles is
limited, you understand.)

This solution combines the best of Martin's and mine: it maintains
state information without recalculating it from scratch every time,
and its size is only linear in the number of options considered.

This XSLT,

<xsl:stylesheet
version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">


<xsl:output indent="yes"/>

<xsl:template match="/text">
<doc>
<xsl:apply-templates select="text()|node()">
<xsl:with-param name="italic" select="0"/>
<xsl:with-param name="bold" select="0"/>
</xsl:apply-templates>
</doc>
</xsl:template>

<xsl:template match="text()">
<xsl:param name="italic"/>
<xsl:param name="bold"/>
<xsl:if test="normalize-space(.)">
<text>
<xsl:if test="$bold">
<xsl:attribute name="bold">true</xsl:attribute>
</xsl:if>
<xsl:if test="$italic">
<xsl:attribute name="italic">true</xsl:attribute>
</xsl:if>
<xsl:value-of select="normalize-space(.)"/>
</text>
</xsl:if>
</xsl:template>

<xsl:template match="bold">
<xsl:param name="italic"/>
<xsl:apply-templates select="text()|node()">
<xsl:with-param name="italic" select="$italic"/>
<xsl:with-param name="bold" select="1"/>
</xsl:apply-templates>
</xsl:template>

<xsl:template match="italic">
<xsl:param name="bold"/>
<xsl:apply-templates select="text()|node()">
<xsl:with-param name="italic" select="1"/>
<xsl:with-param name="bold" select="$bold"/>
</xsl:apply-templates>
</xsl:template>

</xsl:stylesheet>
with this XML
<text>
this is plain
<bold>
this is bold
<italic>
this is bold-italic
</italic>
</bold>
this is plain
<italic>this is italic
<bold>this is italic-bold</bold>
</italic>
</text>
gives this output
<?xml version="1.0"?>
<doc>
<text>this is plain</text>
<text bold="true">this is bold</text>
<text bold="true" italic="true">this is bold-italic</text>
<text>this is plain</text>
<text italic="true">this is italic</text>
<text bold="true" italic="true">this is italic-bold</text>
</doc>
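
One thing to watch with parameter passing in XSLT 1.0: the built-in template rules do not forward parameters, so an element without its own template sitting between a formatting tag and its text would silently reset the state. A sketch of one way to guard against that is a catch-all element template that passes the state through unchanged. The wrapper element mentioned in the comment is hypothetical, and the rest is the stylesheet above with select="node()" in place of select="text()|node()".

```xml
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output indent="yes"/>

  <xsl:template match="/text">
    <doc>
      <xsl:apply-templates select="node()">
        <xsl:with-param name="italic" select="0"/>
        <xsl:with-param name="bold" select="0"/>
      </xsl:apply-templates>
    </doc>
  </xsl:template>

  <xsl:template match="text()">
    <xsl:param name="italic"/>
    <xsl:param name="bold"/>
    <xsl:if test="normalize-space(.)">
      <text>
        <xsl:if test="$bold">
          <xsl:attribute name="bold">true</xsl:attribute>
        </xsl:if>
        <xsl:if test="$italic">
          <xsl:attribute name="italic">true</xsl:attribute>
        </xsl:if>
        <xsl:value-of select="normalize-space(.)"/>
      </text>
    </xsl:if>
  </xsl:template>

  <!-- bold switches its own flag on and forwards the other one -->
  <xsl:template match="bold">
    <xsl:param name="italic"/>
    <xsl:apply-templates select="node()">
      <xsl:with-param name="italic" select="$italic"/>
      <xsl:with-param name="bold" select="1"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- italic does the same for the italic flag -->
  <xsl:template match="italic">
    <xsl:param name="bold"/>
    <xsl:apply-templates select="node()">
      <xsl:with-param name="italic" select="1"/>
      <xsl:with-param name="bold" select="$bold"/>
    </xsl:apply-templates>
  </xsl:template>

  <!-- Catch-all for any other element (say, a hypothetical <span>):
       forward both flags unchanged. Its default priority is lower
       than that of the named templates, so it never overrides them. -->
  <xsl:template match="*">
    <xsl:param name="italic"/>
    <xsl:param name="bold"/>
    <xsl:apply-templates select="node()">
      <xsl:with-param name="italic" select="$italic"/>
      <xsl:with-param name="bold" select="$bold"/>
    </xsl:apply-templates>
  </xsl:template>

</xsl:stylesheet>
```

If the input can only ever contain the known formatting elements, the catch-all is unnecessary and the shorter stylesheet above is enough.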
Hope you enjoy it.

Ben

--
Ben Edgington
Mail to the address above is discarded.
Mail to ben at that address might be read.
http://www.edginet.org/
Jul 20 '05 #6
"Ben Edgington" <us****@edginet.org> wrote in message
news:87************@edginet.org...

> Whilst watching television with my two-year-old this morning I
> realised that, of course, recursion is the proper way to maintain
> state-information in XSLT. (The challenge presented by the Fimbles is
> limited, you understand.)
>
> This solution combines the best of Martin's and mine: it maintains
> state information without recalculating it from scratch every time,
> and its size is only linear in the number of options considered.

Ah, of course - I'd forgotten about apply-templates...with-param.

It's very interesting how different these three approaches are. I had
already thought about something like Martin's idea but I figured it would
get out of hand so I didn't take it all the way to completion. Your original
was the lateral thinking solution I was hoping someone would come up with.
But this last one is the one I'm kicking myself for not thinking of - the
one I didn't think existed.

Andy

Jul 20 '05 #7
"Andy Fish" <aj****@blueyonder.co.uk> writes:
> ah of course - I'd forgotten about apply-templates...with-param
>
> It's very interesting how different these three approaches are. I had
> already thought about something like Martin's idea but I figured it would
> get out of hand so I didn't take it all the way to completion. Your original
> was the lateral thinking solution I was hoping someone would come up with.
> But this last one is the one I'm kicking myself for not thinking of - the
> one I didn't think existed.


Exactly - I had a feeling when I submitted the original version that
there was a better way, which is why it kept bugging me. It's an
example of one of those things which are *obvious* when you spend a
couple of days thinking about them 8^)

Ben

--
Ben Edgington
Mail to the address above is discarded.
Mail to ben at that address might be read.
http://www.edginet.org/
Jul 20 '05 #8
