Hello,
I have a Real-World XSLT Problem(tm).
The situation: an XML document of about 75,000 elements contains
<person> elements at various depths of the tree. Each person consists of
a <name>, a <vorname> (first name) and a <titel> (title). There are about 8000 persons in this tree.
Many of them occur more than once, i.e. there are
<person> elements with identical content.
The tree consists of a few dozen subtrees, each of which is identified by
one special element at the root of the subtree. The name of this element
varies between subtrees.
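To make the structure concrete, here is a minimal sketch of such an input. The namespace and the foberon:meta children (strukturnummer, nummer, jahr) are taken from the stylesheet further down; the container names (bericht, fakultaet, professur, projekt) and all values are invented for illustration only.

```xml
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- Hypothetical sample input: container element names and values are made up -->
<foberon:bericht xmlns:foberon="http://rnvs.informatik.tu-chemnitz.de/foberon">
  <!-- subtree identified by meta/strukturnummer -->
  <foberon:fakultaet>
    <foberon:meta>
      <foberon:strukturnummer>1</foberon:strukturnummer>
    </foberon:meta>
    <foberon:professur>
      <foberon:person>
        <foberon:name>Muster</foberon:name>
        <foberon:vorname>Erika</foberon:vorname>
        <foberon:titel>Prof. Dr.</foberon:titel>
      </foberon:person>
    </foberon:professur>
  </foberon:fakultaet>
  <!-- subtree identified by meta/nummer; the same person occurs again here -->
  <foberon:projekt>
    <foberon:meta>
      <foberon:nummer>42</foberon:nummer>
    </foberon:meta>
    <foberon:person>
      <foberon:name>Muster</foberon:name>
      <foberon:vorname>Erika</foberon:vorname>
      <foberon:titel>Prof. Dr.</foberon:titel>
    </foberon:person>
  </foberon:projekt>
</foberon:bericht>
```

In this sketch the person "Muster, Erika" would get one index entry with two page references, one per subtree.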
I need an alphabetically sorted list of persons. For each entry I have to
use the first occurrence of a person, because it serves as the anchor
for a page number in a FOP-generated PDF. The <titel> element
does not matter for this list.
I apply the Muenchian method three times. The problem that arises is both
memory usage and CPU time: my P4 2.6 takes 10 minutes and 1.4 GB RAM.
Both are a "little bit" too much. All other transformations finish in a
fraction of a second.
I am using libxslt 1.1.5.
Here is my xslt-template (sorry for the long lines):
#v+
<?xml version="1.0" encoding="iso-8859-1"?>
<!-- common namespaces for all template.fo files -->
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:foberon="http://rnvs.informatik.tu-chemnitz.de/foberon"
  xmlns:date="http://exslt.org/dates-and-times"
  xmlns:str="http://exslt.org/strings"
  xmlns:dynamic="http://exslt.org/dynamic"
  xmlns:exsl="http://exslt.org/common"
  xmlns:func="http://exslt.org/functions"
  xmlns:fo="http://www.w3.org/1999/XSL/Format"
  xmlns:fox="http://xml.apache.org/fop/extensions"
  exclude-result-prefixes="xsl foberon date str dynamic exsl">
<xsl:output method="xml" indent="yes"/>
<!-- get the identifier for the subtree recursively towards the root -->
<func:function name="foberon:groupid">
  <xsl:param name="node"/>
  <xsl:variable name="res">
    <xsl:choose>
      <xsl:when test="$node/foberon:meta/foberon:strukturnummer">
        <xsl:value-of select="$node/foberon:meta/foberon:strukturnummer"/>
      </xsl:when>
      <xsl:when test="$node/foberon:meta/foberon:nummer">
        <xsl:value-of select="$node/foberon:meta/foberon:nummer"/>
      </xsl:when>
      <xsl:when test="$node/foberon:meta/foberon:jahr">
        <xsl:value-of select="$node/foberon:meta/foberon:jahr"/>
      </xsl:when>
      <xsl:otherwise>
        <xsl:value-of select="foberon:groupid($node/..)"/>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:variable>
  <func:result select="$res"/>
</func:function>
<xsl:key name="kDistinctName" match="foberon:person" use="foberon:name"/>
<xsl:key name="kDistinctNameAndVorname" match="foberon:person" use="concat(foberon:name,'||',foberon:vorname)"/>
<xsl:key name="kDistinctNameAndVornameAndSNR" match="foberon:person" use="concat(foberon:name,'||',foberon:vorname,'||',foberon:groupid(.))"/>
<xsl:template name="foberon:personenverzeichnis">
  <fo:block xsl:use-attribute-sets="profstart">Personenverzeichnis</fo:block>
  <xsl:for-each select="//foberon:person[generate-id() = generate-id(key('kDistinctName', foberon:name))]">
    <!-- sort by name -->
    <xsl:sort select="foberon:name"/>
    <xsl:variable name="name" select="foberon:name"/>
    <!-- find all different name-vorname pairs -->
    <xsl:for-each select="//foberon:person[generate-id() = generate-id(key('kDistinctNameAndVorname',concat($name,'||',foberon:vorname)))]">
      <!-- sort by vorname -->
      <xsl:sort select="foberon:vorname"/>
      <xsl:variable name="vorname" select="foberon:vorname"/>
      <fo:block>
        <xsl:value-of select="foberon:name"/>, <xsl:value-of select="substring(foberon:vorname,0,2)"/><xsl:text>. </xsl:text>
        <!-- find first occurrence of name-vorname in a subtree -->
        <xsl:for-each select="//foberon:person[generate-id() = generate-id(key('kDistinctNameAndVornameAndSNR', concat($name,'||',$vorname,'||',foberon:groupid(.))))]">
          <fo:basic-link internal-destination="{generate-id(.)}">
            <xsl:choose>
              <xsl:when test="not(position()=last())">
                <fo:page-number-citation ref-id="{generate-id(.)}"/><xsl:text>, </xsl:text>
              </xsl:when>
              <xsl:otherwise>
                <fo:page-number-citation ref-id="{generate-id(.)}"/><xsl:text>. </xsl:text>
              </xsl:otherwise>
            </xsl:choose>
          </fo:basic-link>
        </xsl:for-each>
      </fo:block>
    </xsl:for-each>
  </xsl:for-each>
</xsl:template>
</xsl:stylesheet>
#v-
I tried the following:
1) Instead of using //foberon:person three times, I created a variable
$persons=//foberon:person and applied the for-each loops to this
variable (for-each select=$persons[....]) -> no success
2) Not using a variable in the function groupid, but writing
<func:result><xsl:choose>...</xsl:choose></func:result> instead -> no success
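For reference, a reduced sketch of what these two variants look like when combined. It is not the full stylesheet: only the outer loop is kept, and it writes plain text instead of FO, which is a simplification made here for illustration. The predicate on $persons is assumed to be the same one as in the stylesheet above, and the extension-element-prefixes attribute follows the usual EXSLT declaration rather than the posted code.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:foberon="http://rnvs.informatik.tu-chemnitz.de/foberon"
  xmlns:func="http://exslt.org/functions"
  extension-element-prefixes="func">
  <xsl:output method="text"/>

  <!-- Variant 2: func:result wraps the xsl:choose directly, no $res variable -->
  <func:function name="foberon:groupid">
    <xsl:param name="node"/>
    <func:result>
      <xsl:choose>
        <xsl:when test="$node/foberon:meta/foberon:strukturnummer">
          <xsl:value-of select="$node/foberon:meta/foberon:strukturnummer"/>
        </xsl:when>
        <xsl:when test="$node/foberon:meta/foberon:nummer">
          <xsl:value-of select="$node/foberon:meta/foberon:nummer"/>
        </xsl:when>
        <xsl:when test="$node/foberon:meta/foberon:jahr">
          <xsl:value-of select="$node/foberon:meta/foberon:jahr"/>
        </xsl:when>
        <xsl:otherwise>
          <xsl:value-of select="foberon:groupid($node/..)"/>
        </xsl:otherwise>
      </xsl:choose>
    </func:result>
  </func:function>

  <xsl:key name="kDistinctName" match="foberon:person" use="foberon:name"/>

  <!-- Variant 1: select all persons once and filter the variable in each loop -->
  <xsl:variable name="persons" select="//foberon:person"/>

  <xsl:template match="/">
    <xsl:for-each select="$persons[generate-id() = generate-id(key('kDistinctName', foberon:name))]">
      <xsl:sort select="foberon:name"/>
      <!-- simplified output: name plus the subtree identifier of its first occurrence -->
      <xsl:value-of select="foberon:name"/>
      <xsl:text> (</xsl:text>
      <xsl:value-of select="foberon:groupid(.)"/>
      <xsl:text>)&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```

Neither change removes the repeated filtering of all persons inside the nested loops, which may be why they made no measurable difference.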
There seems to be a memory leak in libxslt.
Can you give me a hint or an idea on how to improve the performance of this
XSLT?
Thank you in advance.
Chris
--
Chris Huebsch
www.hübsch-gemacht.de | TU Chemnitz, Informatik, RNVS
GPG-Encrypted mail welcome! ID:7F2B4DBA | Str. d. Nationen 62, B204
Chemnitzer Linux-Tage 2005, 5.-6.März | D-09107 Chemnitz
http://www.tu-chemnitz.de/linux/tag/ | +49 371 531-1377, Fax -1803