Need help remake xsl transformation

Hi everyone,
One of our systems produces an XML file. I need to validate this file
before we send it to an external system for a very lengthy process. I
cannot change the XML file layout.
The solution I have today is very slow, and I need help finding another
one.

Here is the XML file. It consists of a list of position ids (ESTOXX50
INDEX_BM_E and FTSE INDEX_BM_E), and below that a list of tags that
reference position ids. I want to check that every position id outside
the <groupCustomBucketList> list has an entry under each tag type (the
<customDimensionName> of the <groupCustomBucket> elements below). And
vice versa: that every position id listed under a tag exists in the
list of <equity> elements. See the XSL transformation below.

<?xml version="1.0" encoding="utf-8"?>
<?xml-stylesheet type="text/xsl" href="t.xsl"?>
<positions>
<equity>
<positionId>ESTOXX50 INDEX_BM_E</positionId>
</equity>
<equity>
<positionId>FTSE INDEX_BM_E</positionId>
</equity>

<groupCustomBucketList>
<groupCustomBucket>
<customDimensionName>Branch</customDimensionName>
<customBucketValue>BENCHMARK</customBucketValue>
<positionIdList>
<positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
<positionId>BMK ZENIT FTSE INDEX_BM_E</positionId>
</positionIdList>
</groupCustomBucket>
<groupCustomBucket>
<customDimensionName>Folder</customDimensionName>
<customBucketValue>BZ_ESTOX50</customBucketValue>
<positionIdList>
<positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
</positionIdList>
</groupCustomBucket>
<groupCustomBucket>
<customDimensionName>Folder</customDimensionName>
<customBucketValue>BZ_FTSE</customBucketValue>
<positionIdList>
<positionId>BMK ZENIT FTSE INDEX_BM_E</positionId>
</positionIdList>
</groupCustomBucket>
<groupCustomBucket>
<customDimensionName>Portfolio</customDimensionName>
<customBucketValue>BMK_ZENIT</customBucketValue>
<positionIdList>
<positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
<positionId>BMK ZENIT FTSE INDEX_BM_E</positionId>
</positionIdList>
</groupCustomBucket>
<groupCustomBucket>
<customDimensionName>CurrencyRegion</customDimensionName>
<customBucketValue>EUR</customBucketValue>
<positionIdList>
<positionId>BMK ZENIT ESTOXX50 INDEX_BM_E</positionId>
</positionIdList>
</groupCustomBucket>
</groupCustomBucketList>
</positions>

-----------------
Here is the XSL file. I use lots of call-template invocations, which I
guess is the performance issue. The code below works, but it's really
messy. And slow.
I have two "functions", loop_position and loop_tag, that validate each
tag type against the position ids.
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:variable
name="tagstoscan">Branch,Portfolio,Folder,Currency Region,</xsl:variable>

<xsl:template match="/">
<xsl:element name="positions">
<xsl:attribute name="nbofcolumns">
<xsl:call-template name="count_nb_of_tags">
<xsl:with-param name="tags"><xsl:value-of select="$tagstoscan"
/></xsl:with-param>
<xsl:with-param name="count">0</xsl:with-param>
</xsl:call-template>
</xsl:attribute>
<!-- Find tags that are illegal -->
<xsl:call-template name="loop">
<xsl:with-param name="tags"><xsl:value-of select="$tagstoscan"
/></xsl:with-param>
</xsl:call-template>
</xsl:element>
</xsl:template>

<!-- Count the number of tags we are processing -->
<xsl:template name="count_nb_of_tags">
<xsl:param name="tags" />
<xsl:param name="tag" select="substring-before($tags, ',')" />
<xsl:param name="count" />

<xsl:if test="string-length($tag) = 0"><xsl:value-of select="$count"
/> </xsl:if>

<xsl:if test="string-length($tags) > 0">
<xsl:call-template name="count_nb_of_tags">
<xsl:with-param name="tags" select="substring-after($tags, ',')" />
<xsl:with-param name="count" select="$count + 1" />
</xsl:call-template>
</xsl:if>
</xsl:template>

<!-- Loop all tags we are processing, parsing the xml. Check two
directions: positions to tags, and reverse -->
<xsl:template name="loop">
<xsl:param name="tags" />
<xsl:param name="tag" select="substring-before($tags, ',')" />

<xsl:if test="string-length($tag) > 0">
<xsl:element name="position">
<xsl:attribute name="positionId"></xsl:attribute>
<xsl:call-template name="loop_position">
<xsl:with-param name="tags" select="$tag" />
</xsl:call-template>
<xsl:call-template name="loop_tag">
<xsl:with-param name="tags" select="$tag" />
</xsl:call-template>
</xsl:element>
</xsl:if>

<xsl:if test="string-length($tags) > 0">
<xsl:call-template name="loop">
<xsl:with-param name="tags" select="substring-after($tags, ',')" />
</xsl:call-template>
</xsl:if>
</xsl:template>

<!-- Tag parsing -->
<xsl:template name="loop_tag">
<xsl:param name="tags" />
<xsl:for-each select="positions/*/positionId">
<xsl:call-template name="find_id_in_taglist">
<xsl:with-param name="id" select="." />
<xsl:with-param name="tag" select="$tags" />
</xsl:call-template>
</xsl:for-each>
</xsl:template>

<xsl:template name="find_id_in_taglist">
<xsl:param name="id" />
<xsl:param name="tag" />
<xsl:if
test="string-length(/positions/groupCustomBucketList/groupCustomBucket/customDimensionName[.
= $tag]/../positionIdList/positionId[. = $id]) = 0">

<xsl:attribute name="positionId"><xsl:value-of select="$id"
/></xsl:attribute>
<xsl:variable name="fixedid"><xsl:call-template
name="remove_space"><xsl:with-param name="string" select="$tag"
/></xsl:call-template></xsl:variable>
<xsl:attribute name="{$fixedid}">1</xsl:attribute>
</xsl:if>
</xsl:template>

<!-- Position parsing -->
<xsl:template name="loop_position">
<xsl:param name="tags" />
<xsl:for-each
select="/positions/groupCustomBucketList/groupCustomBucket/customDimensionName[.
= $tags]/../positionIdList/positionId">
<xsl:call-template name="find_id_in_positionlist">
<xsl:with-param name="id" select="." />
<xsl:with-param name="tag" select="$tags" />
</xsl:call-template>
</xsl:for-each>
</xsl:template>

<xsl:template name="find_id_in_positionlist">
<xsl:param name="id" />
<xsl:param name="tag" />
<xsl:if test="string-length(/positions/*/positionId[. = $id]) = 0">
<xsl:attribute name="positionId"><xsl:value-of select="$id"
/></xsl:attribute>
<xsl:variable name="fixedid"><xsl:call-template
name="remove_space"><xsl:with-param name="string" select="$tag"
/></xsl:call-template></xsl:variable>
<xsl:attribute name="{$fixedid}">1</xsl:attribute>
</xsl:if>
</xsl:template>

<!-- Remove spaces -->
<xsl:template name="remove_space">
<xsl:param name="string" />
<xsl:choose>
<xsl:when test="contains($string, ' ')">
<xsl:call-template name="remove_space">
<xsl:with-param name="string">
<xsl:value-of select="substring-before($string, ' ')"
/><xsl:value-of select="substring-after($string, ' ')" />
</xsl:with-param>
</xsl:call-template>
</xsl:when>
<xsl:otherwise>
<xsl:value-of select="$string" />
</xsl:otherwise>
</xsl:choose>
</xsl:template>

<!-- Override default template rules -->

<xsl:template match="*|/" mode="m">
<!-- Do nothing. Override default rule -->
</xsl:template>

<xsl:template match="processing-instruction()|comment()" >
<!-- Do nothing. Override default rule -->
</xsl:template>

<xsl:template match="text() | @*">
<!-- Do nothing. Override default rule -->
</xsl:template>

</xsl:stylesheet>
Regards,
/Johan

Jun 19 '06 #1
1 Reply


Convolving sets against each other is expensive. Try recasting the problem.

For example: your second constraint is that the union of the two index
lists is precisely equal to the list of entries, after duplicates are
eliminated. That can be computed by collecting the sets, sorting them,
ensuring no dupes exist, and then doing a comparison of the result. That
may be faster (especially if you know a priori that some of these
subsets are already sorted.)
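
As a rough illustration of that sorted-unique comparison, here is a minimal Python sketch. The function names (sorted_unique, union_matches) are made up for this example, and it assumes the ids have already been pulled out of the XML into plain lists of strings.

```python
from itertools import chain


def sorted_unique(values):
    """Return the distinct values in sorted order."""
    return sorted(set(values))


def union_matches(entries, *index_lists):
    """True if the de-duplicated union of the index lists equals the
    de-duplicated list of entries (hypothetical helper)."""
    union = sorted_unique(chain.from_iterable(index_lists))
    return union == sorted_unique(entries)
```

Each list is sorted once and compared once, instead of searching one list for every member of the other.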

Establishing that the intersection of the two index sets is empty,
similarly, might be run faster if you test it by establishing that the
length of the sorted-unique union of the two is equal to the sum of the
sorted-unique lengths of each index set.
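
The length test could look something like the sketch below, again in Python and again assuming the two index sets are already available as lists of strings. The name are_disjoint is only illustrative.

```python
def sorted_unique(values):
    """Return the distinct values in sorted order."""
    return sorted(set(values))


def are_disjoint(first, second):
    """True if the two id lists share no member, judged only by
    comparing lengths of sorted-unique lists (hypothetical helper)."""
    a = sorted_unique(first)
    b = sorted_unique(second)
    # If any id occurs in both, the union is shorter than the sum.
    return len(sorted_unique(a + b)) == len(a) + len(b)
```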

But I suspect the fastest way to do this particular set of tests would
be to drop down to a lower level and handle it in SAX or DOM, building
hashtables or similar content-addressable retrieval mechanisms. The fact
that XSLT is a complete programming language for manipulating XML
doesn't necessarily mean it's the optimal one for all tasks.
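
To make the hashtable idea concrete, here is a small sketch using Python's standard xml.etree.ElementTree in place of a SAX or DOM parser. The element names come from the XML posted above; the function name validate, the dimensions argument, and the report keys are assumptions made for this example, not anything from the original stylesheet.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def validate(xml_text, dimensions):
    """Check position ids against the buckets of each dimension.

    Hypothetical helper: returns, per dimension name, the position ids
    that have no bucket entry and the bucket ids that are not positions.
    """
    root = ET.fromstring(xml_text)

    # Ids of every position directly under <positions>, e.g. <equity>.
    position_ids = {
        el.text.strip() for el in root.findall("*/positionId") if el.text
    }

    # Hash table: dimension name -> set of ids listed under it.
    bucket_ids = defaultdict(set)
    for bucket in root.findall("groupCustomBucketList/groupCustomBucket"):
        name = (bucket.findtext("customDimensionName") or "").strip()
        for el in bucket.findall("positionIdList/positionId"):
            if el.text:
                bucket_ids[name].add(el.text.strip())

    report = {}
    for name in dimensions:
        ids = bucket_ids.get(name, set())
        report[name] = {
            "missing_from_buckets": sorted(position_ids - ids),
            "unknown_positions": sorted(ids - position_ids),
        }
    return report
```

Calling validate(xml_text, ["Branch", "Portfolio", "Folder"]) would return one entry per dimension. The file is walked once to build the sets, and each check is then a set difference, not a repeated scan of the document.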

--
() ASCII Ribbon Campaign | Joe Kesselman
/\ Stamp out HTML e-mail! | System architexture and kinetic poetry
Jun 22 '06 #2
