
This XSLT problem makes no sense to me

Context:

I'm trying to compare XML tree fragments. I do this by writing out the
attributes of each element in a subtree as a string and then
normalizing that string. I then call contains() to test whether the
current item's string occurs in the string built from
following-sibling::*, which tells me whether we have a duplicate. If
there is a duplicate, we move on to the next item; if there is none,
we output the small tree.

I'm hitting a completely ridiculous problem that I can't wrap my head
around. I easily spent 2 hours yesterday trying to understand why this
happens, and I'm about to punch the computer. When I output a string
that's not directly tied to the logic (or at least apparently not tied
to the logic), the logic works. When I remove the string output, the
logic stops working. Argh!

Here's the XSLT snippet

<xsl:template match="item-wrapper" mode="string">

  <xsl:variable name="current">
    <xsl:apply-templates mode="string" />
  </xsl:variable>

  <xsl:variable name="rest">
    <xsl:apply-templates select="following-sibling::*" mode="string" />
  </xsl:variable>

  <xsl:variable name="current-normalized">
    <xsl:value-of select="normalize-space($current)" />
  </xsl:variable>

  <xsl:variable name="rest-normalized">
    <xsl:value-of select="normalize-space($rest)" />
  </xsl:variable>

  <!-- <fix/> What the hell? -->
  <!-- <xsl:value-of select="$current-normalized"/> -->

  <xsl:choose>
    <xsl:when test="contains($rest-normalized, $current-normalized)">
      <duplicate/>
    </xsl:when>
    <xsl:otherwise>
      <noproblem/>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

When <xsl:value-of select="$current-normalized"/> is not commented
out, I get the string representation of the XML tree in normalized
form in the output for each item, and the correct <duplicate/> and
<noproblem/> elements are output in the correct location after the
normalized strings. When I comment out the <xsl:value-of
select="$current-normalized"/>, the normalized current string, as
expected, is no longer output, but all my outputs are <noproblem/>.
What the hell? What am I missing here?

Regards
Jean-Francois Michaud
Mar 11 '08 #1
11 Replies


Hard to be certain without seeing a full runnable copy, but this sounds
more like a bug in your XSLT processor than an expected behavior of XSLT.

--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Mar 11 '08 #2

On Mar 11, 10:24 am, Joseph Kesselman <keshlam-nos...@comcast.net>
wrote:
> Hard to be certain without seeing a full runnable copy, but this sounds
> more like a bug in your XSLT processor than an expected behavior of XSLT.

Thanks much. I thought I was going insane ;-). I tried with SaxonB9
and I get the same behavior. I'd have to try with Xalan, but I'd have
to modify the code a bit because Xalan doesn't behave exactly the same
way Saxon8/B9 does.

Regards
Jean-Francois Michaud
Mar 11 '08 #3

Without seeing the full input and code it's hard to really tell, but the
behaviour seems expected to me.

With the code as it is (with the value-of commented out), the template
never puts any text into the result tree, just empty elements
<duplicate/> or <noproblem/>. So if this template is typical, then

<xsl:variable name="rest">
  <xsl:apply-templates select="following-sibling::*" mode="string"/>
</xsl:variable>

will just produce a sequence of empty elements, so the string value will
be "", and so

<xsl:when test="contains($rest-normalized, $current-normalized)">

is testing whether "" contains, as a substring, the normalised string
value of the current node. That will presumably test as false, so you get
<noproblem/> every time.

As you are using Saxon 9 (thus XSLT 2) you probably want to be looking at
xsl:for-each-group to do duplicate removal, which is likely to be much
more efficient.
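
A rough sketch of what the xsl:for-each-group approach could look like, for illustration only. The `items` parent element and the grouping key (every attribute name and value at or below each item-wrapper, joined into one string) are assumptions, since the real input and stylesheet were not posted.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- "items" is a hypothetical parent of the item-wrapper elements -->
  <xsl:template match="items">
    <xsl:copy>
      <!-- Key: all attribute names/values at or below this item-wrapper -->
      <xsl:for-each-group select="item-wrapper"
          group-by="string-join(
                      for $a in .//@* return concat(name($a), '=', $a),
                      '|')">
        <!-- Emit one representative per distinct key -->
        <xsl:copy-of select="current-group()[1]"/>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats on this sketch: it keeps the first item of each group, whereas comparing against following-sibling::* keeps the last occurrence, and the key ignores element names and text content. The order of attributes within one element is also implementation-dependent, so a sturdier key would sort the attributes by name first.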

David

--
http://dpcarlisle.blogspot.com
Mar 11 '08 #4

In article <f7**********************************@e23g2000prf.googlegroups.com>,
Jean-François Michaud <co*****@comcast.net> wrote:
> When I output a string
> that's not directly tied to the logic (or at least apparently not tied
> to the logic), the logic works. When I remove the string output, the
> logic stops working. Argh!

Look carefully:

> <xsl:template match="item-wrapper" mode="string">
>
>   <xsl:variable name="current">
>     <xsl:apply-templates mode="string" />
>   </xsl:variable>

$current is set to the result of applying the string-mode templates to
the children of the current node. And this *is* the string-mode template.
So when you comment out this:

> <!-- <xsl:value-of select="$current-normalized"/>-->

you are changing $current!

-- Richard
--
:wq
Mar 11 '08 #5

On Mar 11, 3:01 pm, David Carlisle <david-n...@dcarlisle.demon.co.uk>
wrote:
> without seeing full input and code it's hard to really tell

Understandable, but I can't really produce a subset at this time; it
would be too time consuming, so I opted to post just the problem area.

> , but the
> behaviour seems expected to me.

Hmmm.

> with the code as it is (with the value-of commented out) the template
> never puts any text into the result tree, just empty elements
> <duplicate/> or <noproblem/>

That's what I had in mind, and it is the behavior I expected: I wanted
to test the logic and see whether duplicates vs. non-duplicates were
recognized correctly.

> so if this template is typical, then
> <xsl:variable name="rest">
> <xsl:apply-templates select="following-sibling::*" mode="string"/>
> </xsl:variable>
>
> will just produce a sequence of empty elements, so the string value will
> be "" so

Nope! The string templates actually take the attribute names and
values of each element found and output them, so we end up
with a long string of concatenated attribute names and attribute
values. The string is used to effectively "condense" an XML tree into
a string form for comparison purposes. The idea is to test whether
two XML trees are identical or not.

The rest variable then contains a similar soup of attribute names and
values (for ALL the following siblings). The content is then
normalized to avoid potential whitespace inconsistencies (same with
the current node), and then I simply verify that the current node in
string form isn't found anywhere in the following siblings' string
form, effectively determining that the current node and all its
children aren't found in any of its siblings.

> <xsl:when test="contains($rest-normalized, $current-normalized)">
>
> is testing if "" contains as a substring the normalized string value of
> the current node, which will presumably test as false so you get
> <noproblem/> every time.

This seems to be a reasonable explanation for <noproblem/> being
output everywhere.

The magic bit, though, is that when you uncomment the <xsl:value-of
select="$current-normalized"/>, which, as I understand it, has nothing to
do with the evaluation of $rest against the following siblings in
string form, or with $rest-normalized, or with the logic laid down to
output <noproblem/> or <duplicate/>, the logic actually outputs
<noproblem/> where I expect and <duplicate/> where I expect ;-).

If your explanation were right, with $rest being empty and, by extension,
$rest-normalized being empty, it would imply that outputting value-of
$current-normalized somehow made $rest get populated correctly. That
is not what I would call expected behavior. Or maybe I'm missing
something else?

> As you are using saxon9 (thus xslt2) you probably want to be looking at
> xsl:for-each-group to do duplicate removal, which is likely to be much
> more efficient.

Actually we're using Saxon8, and we don't really have the option of
hopping to SaxonB9 at this point in time ;-).

Regards
Jean-Francois Michaud
Mar 12 '08 #6

Jean-François Michaud wrote:
> Understandably but I can't really produce a subset at this time, it
> would be too time consuming so I opted for the problem area.

Unfortunately, without adequate context it takes a lot of *our* time to
try to diagnose what's going on. If it isn't worth your time, is it
worth ours?

For example: You're producing current by recursion in string mode. The
actual change may be happening when an inner item-wrapper is processed;
it now contains something that wasn't expected and may be changing
behavior in another template you haven't shown us, thus changing
current's value.

Without seeing the rest of the stylesheet, we can't do more than guess
at that kind of interaction.

Spend the time to reduce your problem to a runnable minimal testcase. In
the course of doing so you may solve it yourself. If not, you'll at
least have something other people can help you with.
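
As a rough idea of what such a reduced testcase might look like: the sketch below wraps the posted item-wrapper template in a complete stylesheet. The input document, the `items` and `item` element names, the `result` wrapper, and the attribute-dumping template for `item` are all made up for illustration, because the real stylesheet was not posted.

```xml
<!-- Hypothetical input (element names "items" and "item" are made up):
     <items>
       <item-wrapper><item id="a"/></item-wrapper>
       <item-wrapper><item id="b"/></item-wrapper>
       <item-wrapper><item id="a"/></item-wrapper>
     </items>
-->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="/items">
    <result>
      <xsl:apply-templates select="item-wrapper" mode="string"/>
    </result>
  </xsl:template>

  <!-- Assumed stand-in for the unposted templates: dump name=value pairs -->
  <xsl:template match="item" mode="string">
    <xsl:for-each select="@*">
      <xsl:value-of select="concat(name(), '=', ., ' ')"/>
    </xsl:for-each>
  </xsl:template>

  <!-- The template as posted, with the value-of still commented out -->
  <xsl:template match="item-wrapper" mode="string">
    <xsl:variable name="current">
      <xsl:apply-templates mode="string"/>
    </xsl:variable>
    <xsl:variable name="rest">
      <!-- The following siblings are item-wrapper elements too, so this
           calls the very template we are in -->
      <xsl:apply-templates select="following-sibling::*" mode="string"/>
    </xsl:variable>
    <xsl:variable name="current-normalized">
      <xsl:value-of select="normalize-space($current)"/>
    </xsl:variable>
    <xsl:variable name="rest-normalized">
      <xsl:value-of select="normalize-space($rest)"/>
    </xsl:variable>

    <!-- <xsl:value-of select="$current-normalized"/> -->

    <xsl:choose>
      <xsl:when test="contains($rest-normalized, $current-normalized)">
        <duplicate/>
      </xsl:when>
      <xsl:otherwise>
        <noproblem/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

With this input the stylesheet should produce three <noproblem/> elements. Uncommenting the value-of should turn the first one into <duplicate/>, which matches the behaviour described in the original post.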
--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Mar 12 '08 #7

On Mar 11, 3:52 pm, rich...@cogsci.ed.ac.uk (Richard Tobin) wrote:
> In article <f7b1c26b-f0c3-466b-987a-a28e77521...@e23g2000prf.googlegroups.com>,
> Jean-François Michaud <come...@comcast.net> wrote:
>> When I output a string
>> that's not directly tied to the logic (or at least apparently not tied
>> to the logic), the logic works. When I remove the string output, the
>> logic stops working. Argh!
>
> Look carefully:
>> <xsl:template match="item-wrapper" mode="string">
>> <xsl:variable name="current">
>> <xsl:apply-templates mode="string" />
>> </xsl:variable>
>
> $current is set to the result of applying the string-mode templates to
> the children of the current node. And this *is* the string-mode template.
> So when you comment out this:
>> <!-- <xsl:value-of select="$current-normalized"/>-->
>
> you are changing $current!

An interesting idea, but as far as I understand it (and I realize the
whole stylesheet isn't available to you), this is not the case. It is an
interesting twist, though, and something might be happening there that
I'm not perceiving correctly. item-wrapper contains other elements, and
templates exist (mode="string") for those elements also. As mentioned in
the answer to David Carlisle, those templates only output the name and
value of each of their attributes, effectively creating a long string
that uniquely identifies (hopefully) the XML tree under item-wrapper.

Also, because of the hierarchical relationship of nodes, I would
expect $current to be isolated from the $current of other calls. If
it weren't, we'd certainly be in trouble ;-).

Regards
Jean-Francois Michaud
Mar 12 '08 #8

On Mar 12, 10:43 am, Joseph Kesselman <keshlam-nos...@comcast.net>
wrote:
> Jean-François Michaud wrote:
>> Understandably but I can't really produce a subset at this time, it
>> would be too time consuming so I opted for the problem area.
>
> Unfortunately, without adequate context it takes a lot of *our* time to
> try to diagnose what's going on. If it isn't worth your time, is it
> worth ours?

I understand; I just thought I'd get some thoughts flowing that could
help me pinpoint the problem. I'll spend some time and try to extract
a subset. That would further my understanding of XSLT in any case if
there is something wrong with my code, or would help us pinpoint the
Saxon bug if that is the real underlying problem.

I implemented another solution from scratch through string reduction
prior to right hand XML tree creation and hit a silly node-set/for-each
scope limitation. I had to settle for a less graceful solution
(a variant on the node-set/for-each solution) that now seems to be
working. I still want to understand what happened on this one though.

Regards
Jean-Francois Michaud
Mar 12 '08 #9

Jean-François Michaud wrote:

>> with the code as it is (with the value-of commented out) the template
>> never puts any text into the result tree, just empty elements
>> <duplicate/> or <noproblem/>
>
> That's what I had in mind and is the behavior that I expected to test
> out the logic and see if duplicates vs non-duplicates were recognized
> correctly.
>
>> so if this template is typical, then
>> <xsl:variable name="rest">
>> <xsl:apply-templates select="following-sibling::*" mode="string"/>
>> </xsl:variable>
>>
>> will just produce a sequence of empty elements, so the string value will
>> be "" so
>
> Nope! The string templates actually take the attribute names and
> values of each element found and outputs it's content

The template you posted outputs nothing, which is the cause of your
problem. When you remove the comment, it outputs something, and your
problem goes away.

> so we end up
> with a long string of concatenated attribute names and attribute
> values. The string is used to effectively "condense" an XML tree into
> a string form for comparison purposes. The idea is to test and verify
> that two XML trees are identical or not.

Why not use deep-equal?

> The rest variable then contains a similar soup of long string of
> attribute names and values (of ALL the siblings). The content is then
> normalized to avoid potential whitespace inconsistencies (same with
> the current node) and then I simply verify that the current node in a
> string form isn't found anywhere in the following siblings string
> form, effectively determining that the current node and all it's
> children isn't found in any of it's siblings.
>
>> <xsl:when test="contains($rest-normalized, $current-normalized)">
>>
>> is testing if "" contains as a substring the normalized string value of
>> the current node, which will presumably test as false so you get
>> <noproblem/> every time.
>
> This seems to be a reasonable explanation for <noproblem/> being
> outputted everywhere.
>
> The magic bit though is that when you uncomment out the <xsl:value-of
> select="$current-normalized">, which, as I understand, has nothing to
> do with the evaluation of $rest against the following siblings in a

Yes, it has everything to do with that, as I tried to explain. The
siblings are all evaluate elements, and so if you leave the comment in,
$rest is evaluated using this template and will have an empty string value.

> string form or $rest-normalized or the logic layed down to output
> <noproblem/> or <duplicate/>, the logic actually outputs <noproblem>
> where I expect and <duplicates> where I expect ;-).
>
> If you're explanation was right, $rest being empty and by extension,
> $rest-normalized being empty, it would imply that outputting value-of
> $current-normalized somehow made $rest get populated correctly. That
> is not what I would call expectable behavior. Or maybe I'm missing
> something else?

Sorry, I'm not sure how else I can explain it, but just look carefully at
your template for evaluate; it generates no output.

>> As you are using saxon9 (thus xslt2) you probably want to be looking at
>> xsl:for-each-group to do duplicate removal, which is likely to be much
>> more efficient.
>
> Actually using Saxon8 and we dont' really have the option of hopping
> to SaxonB9 at this point in time ;-).

Saxon 8 is the same as 9 for this purpose: they both implement the same
version of XSLT (2.0). Saxon 6 was the XSLT 1 processor.
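
To make the deep-equal suggestion concrete, here is a minimal XSLT 2.0 sketch. It assumes a hypothetical `items` parent element holding the item-wrapper elements (the real document structure was not posted) and keeps a subtree only when no later sibling is deep-equal to it.

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- "items" is a hypothetical parent of the item-wrapper elements -->
  <xsl:template match="items">
    <xsl:copy>
      <xsl:for-each select="item-wrapper">
        <xsl:variable name="this" select="."/>
        <!-- Keep this subtree only if no later sibling is deep-equal to it -->
        <xsl:if test="not(following-sibling::item-wrapper[deep-equal(., $this)])">
          <xsl:copy-of select="."/>
        </xsl:if>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

deep-equal compares element names, attributes (regardless of their order), and text content, so whitespace-only text nodes count as differences unless they are stripped first, for example with xsl:strip-space.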

--
http://dpcarlisle.blogspot.com
Mar 12 '08 #10

The only thing that needs 2.0 is the redirection -- which is
irrelevant. Commented it out for debugging under 1.0.

Fixing the various line wraps was a pain. Generally it's easier for
everyone if you put the files on a website somewhere so they can be
downloaded intact, rather than risking the newsgroup tools mucking
with them.

You say the failure's in the template for item-wrapper in string mode.
The appropriate test is to see what the output of that template is.
Quickest approach is changing the root to just apply-templates
against //item-wrapper in that mode.

Sure enough, that template produces no output; your change forces some
non-empty output into it. David is absolutely right.
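
For reference, the kind of temporary root template meant here could look roughly like the fragment below. It goes inside the existing xsl:stylesheet in place of the real root template; the `debug` wrapper element is just an illustrative name.

```xml
<!-- Temporary replacement for the stylesheet's root template -->
<xsl:template match="/">
  <debug>
    <!-- Show exactly what the string-mode template emits for each item-wrapper -->
    <xsl:apply-templates select="//item-wrapper" mode="string"/>
  </debug>
</xsl:template>
```

If the only things that appear inside the wrapper are empty <duplicate/> and <noproblem/> elements, then any variable built from this template's output has an empty string value.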

So the thing to investigate is why you expected this template to
produce non-empty results, not why forcing non-empty results has an
effect. You asked the wrong question. Go back and ask the right one.

Divide-and-conquer works wonders.

xsl:message can also be a lifesaver, for displaying where you've
gotten to and what some of the values are. Standard pointer to my
DeveloperWorks articles on "styling stylesheets" for a way to
automatically insert some stylesheet tracing information.
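
A small illustration of the xsl:message idea, meant as a fragment to paste into the item-wrapper template after the variable declarations. The message text itself is arbitrary; the variable names are the ones from the posted template.

```xml
<!-- Trace the two strings being compared without touching the result tree -->
<xsl:message>
  <xsl:text>item-wrapper: current=[</xsl:text>
  <xsl:value-of select="$current-normalized"/>
  <xsl:text>] rest=[</xsl:text>
  <xsl:value-of select="$rest-normalized"/>
  <xsl:text>]</xsl:text>
</xsl:message>
```

Unlike an xsl:value-of placed directly in the template body, xsl:message does not add anything to the template's output, so it should not change what $rest contains in the callers.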

Mar 13 '08 #11

Alright, I'll give it some further thought. Thanks a bunch to
everybody for taking the time to review the code. I really appreciate
the help ;-).

Regards
Jean-Francois Michaud
Mar 19 '08 #12
