Histogram in XSLT 1.0

I should like to count the frequency of strings embedded in a longer
string, space separated. Specifically, I have:
<phiModule>
5 5 5 5 6 6 6 6 7 7 7
7 8 8 8 8 8 5 5 5 6 6
6 7 7 7 7 7 7 7 7 8 8
8 8 8 8 8 8 8 9 9 9 9
6 7 7 7 8 8 8 8 9 9 9
9 9 9 9 9 10 10 10 10 10 10
11 11 11 11 11 9 9 9 9 9 9
9 10 10 10 10 10 10 11 11 11 11
11 11 11 11 11 11 11 12 12 13 13
13 13 13 13 13 13
</phiModule>

And I should like to count the number of each phi value, eventually
outputting a text like:

Phi module 6 was hit 4 times.
(and so on for all the other phi values)

The phi values are limited to the range 0-51, but I don't know which phi
values will appear in a given file.

Has anyone tackled something like this? I have to use XSLT 1.0, so
tokenizing, grouping, etc. become a bit tedious...

cheers

shaun
Nov 27 '06 #1
> I have to use xslt 1.0

Why?

> tokenize, grouping etc becomes a bit tedious...

Sure does. Kinda sounds like an arbitrary academic
homework assignment, in which case an arbitrary
academic solution should suffice.

I guess I would look into transforming the numeric
list into a set of XML nodes (e.g. "<Foo Phi='xx'/>")
and then for-each of the possible values of phi, just
emit a count of the nodes having the attribute of
the corresponding value.

Good luck,
Ron Burk
www.xmlator.com

Nov 27 '06 #2
xm*****@gmail.com wrote:
>> I have to use xslt 1.0
> Why?

Presumably because a 2.0 processor isn't available in the target
environment (not very surprising).

Since XSLT's string (as opposed to structural) manipulation capabilities
are relatively limited, I agree that the two-stage approach (convert it
into something XSLT can count easily, then count) may be simplest. That
second pass can be done in 1.0, without a separate styling pass, with a
bit of help from the EXSLT node-set() extension function; this isn't
actually part of 1.0, but it is widely supported for exactly this sort of
two-pass solution.
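
A minimal sketch of that two-pass idea, assuming the processor supports the EXSLT node-set() function. The template names (tokenize, report) and the hit element with its phi attribute are made up for illustration. The first pass builds one element per value; the second loops over the possible values 0-51 and prints a line for each one that actually occurs.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">

  <xsl:output method="text"/>

  <xsl:template match="/phiModule">
    <!-- Pass 1: build a result tree fragment of <hit phi="..."/> elements.
         normalize-space() collapses newlines and runs of spaces; the
         trailing space terminates the last token. -->
    <xsl:variable name="hitsRtf">
      <xsl:call-template name="tokenize">
        <xsl:with-param name="text" select="concat(normalize-space(.), ' ')"/>
      </xsl:call-template>
    </xsl:variable>
    <!-- Pass 2: turn the fragment into a real node-set and count -->
    <xsl:call-template name="report">
      <xsl:with-param name="hits" select="exsl:node-set($hitsRtf)/hit"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Hypothetical helper: emits one <hit> per space-terminated token -->
  <xsl:template name="tokenize">
    <xsl:param name="text"/>
    <xsl:if test="contains($text, ' ')">
      <xsl:variable name="token" select="substring-before($text, ' ')"/>
      <xsl:if test="$token != ''">
        <hit phi="{$token}"/>
      </xsl:if>
      <xsl:call-template name="tokenize">
        <xsl:with-param name="text" select="substring-after($text, ' ')"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Hypothetical helper: loops phi from 0 to 51, reporting non-zero counts -->
  <xsl:template name="report">
    <xsl:param name="hits"/>
    <xsl:param name="phi" select="0"/>
    <xsl:if test="$phi &lt;= 51">
      <xsl:variable name="n" select="count($hits[@phi = $phi])"/>
      <xsl:if test="$n &gt; 0">
        <xsl:value-of select="concat('Phi module ', $phi, ' was hit ', $n, ' times.&#10;')"/>
      </xsl:if>
      <xsl:call-template name="report">
        <xsl:with-param name="hits" select="$hits"/>
        <xsl:with-param name="phi" select="$phi + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

</xsl:stylesheet>
```

Looping over the fixed range keeps the output in numeric order without needing xsl:sort or a key, at the cost of 52 count() scans over the temporary tree.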

Of course you're going to have to do a recursive parse pass to pull the
individual integers out of that text string and convert them. So another
approach would be to write the recursion to count them directly and
generate a report when it runs out of input. You know there's a limited
range, so you can have an explicit parameter for each value to carry the
count (so far) down through the recursion. Since this would be
tail-recursion, a good XSLT processor would be able to optimize it into
a tolerably efficient loop.
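
A rough sketch of counting directly from the string, with no extension functions. Note that this is a simpler variant, not the one-parameter-per-value scheme described above: it rescans the list once for each candidate value 0-51 and carries a single running count down a tail-recursive template. The template and parameter names are invented, and it assumes the values are written without leading zeros.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/phiModule">
    <xsl:call-template name="report">
      <!-- Pad with spaces so that every value, including the first and
           the last, appears in the list as ' value ' -->
      <xsl:with-param name="list" select="concat(' ', normalize-space(.), ' ')"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Loops phi from 0 to 51 and reports each value that occurs -->
  <xsl:template name="report">
    <xsl:param name="list"/>
    <xsl:param name="phi" select="0"/>
    <xsl:if test="$phi &lt;= 51">
      <xsl:variable name="n">
        <xsl:call-template name="count-token">
          <xsl:with-param name="list" select="$list"/>
          <xsl:with-param name="token" select="concat(' ', $phi, ' ')"/>
        </xsl:call-template>
      </xsl:variable>
      <xsl:if test="$n &gt; 0">
        <xsl:value-of select="concat('Phi module ', $phi, ' was hit ', $n, ' times.&#10;')"/>
      </xsl:if>
      <xsl:call-template name="report">
        <xsl:with-param name="list" select="$list"/>
        <xsl:with-param name="phi" select="$phi + 1"/>
      </xsl:call-template>
    </xsl:if>
  </xsl:template>

  <!-- Counts occurrences of $token in $list, carrying the count so far -->
  <xsl:template name="count-token">
    <xsl:param name="list"/>
    <xsl:param name="token"/>
    <xsl:param name="count" select="0"/>
    <xsl:choose>
      <xsl:when test="contains($list, $token)">
        <xsl:call-template name="count-token">
          <!-- Put back the space consumed by the match, so that the next
               value still has its leading delimiter -->
          <xsl:with-param name="list"
              select="concat(' ', substring-after($list, $token))"/>
          <xsl:with-param name="token" select="$token"/>
          <xsl:with-param name="count" select="$count + 1"/>
        </xsl:call-template>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="$count"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

Matching on ' value ' rather than the bare value is what stops 1 from being counted inside 10 or 11.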

Another fix, of course, would be to change whatever is generating this
data to produce it in a more XML/XSLT-friendly format in the first
place, avoiding the need for conversion or tokenizing.
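
For illustration, suppose the data arrived as one element per hit. The hit element and phi attribute below are purely hypothetical, not something the generating program is known to produce. Plain XSLT 1.0 could then group and count with a key, with no tokenizing and no extension functions:

```xml
<!-- Hypothetical input, one element per hit:
     <phiModule><hit phi="5"/><hit phi="5"/><hit phi="6"/></phiModule> -->
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <!-- Index every hit by its phi value -->
  <xsl:key name="hitsByPhi" match="hit" use="@phi"/>

  <xsl:template match="/">
    <!-- Muenchian grouping: visit only the first hit of each phi value -->
    <xsl:for-each
        select="//hit[generate-id() = generate-id(key('hitsByPhi', @phi)[1])]">
      <xsl:sort select="@phi" data-type="number"/>
      <xsl:value-of select="concat('Phi module ', @phi, ' was hit ',
                                   count(key('hitsByPhi', @phi)), ' times.&#10;')"/>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```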


--
Joe Kesselman / Beware the fury of a patient man. -- John Dryden
Nov 27 '06 #3
Using FXSL 1 this is straightforward:

When this transformation:

<xsl:stylesheet version="1.0"
 xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
 xmlns:ext="http://exslt.org/common"
 exclude-result-prefixes="ext"
>
  <!-- FXSL template that splits a string into word elements -->
  <xsl:import href="strSplit-to-Words.xsl"/>

  <xsl:output indent="yes" omit-xml-declaration="yes"/>

  <!-- Index the word elements by string value, for Muenchian grouping -->
  <xsl:key name="kWordByVal" match="word" use="."/>

  <xsl:template match="/">
    <xsl:variable name="vrtfwordNodes">
      <words>
        <xsl:call-template name="str-split-to-words">
          <xsl:with-param name="pStr" select="/"/>
          <!-- Delimiters: comma, space and newline (the input spans several lines) -->
          <xsl:with-param name="pDelimiters"
                          select="', &#10;'"/>
        </xsl:call-template>
      </words>
    </xsl:variable>

    <xsl:variable name="vwordNodes"
                  select="ext:node-set($vrtfwordNodes)"/>

    <!-- Switch the context to the temporary tree so that key() searches it -->
    <xsl:for-each select="$vwordNodes">
      <!-- Keep only the first word of each distinct value -->
      <xsl:for-each select="$vwordNodes/*/*[.]
                              [generate-id()
                              =
                               generate-id(key('kWordByVal',.)[1])
                              ]">
        <xsl:sort data-type="number"/>

        <xsl:value-of select=
          "concat('Phi module ', .,
                  ' was hit ',
                  count(key('kWordByVal',.)),
                  ' times&#xA;'
                  )"
        />
      </xsl:for-each>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>

is applied on your input document:

<phiModule>
5 5 5 5 6 6 6 6 7 7 7
7 8 8 8 8 8 5 5 5 6 6
6 7 7 7 7 7 7 7 7 8 8
8 8 8 8 8 8 8 9 9 9 9
6 7 7 7 8 8 8 8 9 9 9
9 9 9 9 9 10 10 10 10 10 10
11 11 11 11 11 9 9 9 9 9 9
9 10 10 10 10 10 10 11 11 11 11
11 11 11 11 11 11 11 12 12 13 13
13 13 13 13 13 13
</phiModule>

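the wanted result is produced (the first line, with no phi value, comes
from an empty word in the split result; replacing the [.] predicate with
[normalize-space()] should filter it out):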

Phi module was hit 1 times
Phi module 5 was hit 7 times
Phi module 6 was hit 8 times
Phi module 7 was hit 15 times
Phi module 8 was hit 18 times
Phi module 9 was hit 19 times
Phi module 10 was hit 12 times
Phi module 11 was hit 16 times
Phi module 12 was hit 2 times
Phi module 13 was hit 8 times

Cheers,
Dimitre Novatchev
Nov 28 '06 #4
Thanks for the ingenious suggestions and solutions. I should explain the
context; maybe you will find it interesting. I am working on the Silicon
Tracker for the ATLAS experiment at CERN. I am seriously considering
proposing XSLT (dare I say Ajax?) as a remote monitoring solution for
the experiment, the idea being that only the Firefox web browser would
be needed to see the results. An example of a cosmic ray event is here:

http://sroe.home.cern.ch/sroe/svg/combined.svg

(generated by XSLT)

Thus I am restricted to what might be achievable in Firefox, with or
without some scripting incorporated. The file I showed is what generates
the kind of event display I linked to, so it is not *really* ideal for
histogramming, but it seems (at present) to be the only XML result file
available via an RPC request from our analysis program. I'm trying to
discover how much I can do with it. In particular, where SVG production
is not an option, I want a text summary of the event.

I can see I should make an effort to get the more amenable result files
available on request via a web service, but your suggestions have given
me some ideas for working with what I've got...

cheers

shaun
Nov 28 '06 #5
