Here is a solution to my earlier post. I used the Saxon 8.7b
processor. I don't know whether the solution relies on any XSLT 2.0
capabilities; I still need to test it with an XSLT 1.0 compliant
processor.
The setup is as follows: a parent "container" element holds a
number of child elements with the same tag name. You want to
make it easy for a program to randomly select a child element with
a frequency that varies for each child. In the 'XML input file'
example below, the first parent "container" element is named
'people' and there are three children with the tag name 'person'.
The weights for the three children are '80', '10' and '40', so I
want to select the first 'person' element 80/(80+10+40) of the
time, i.e. about 61.5%. Likewise, within the first 'person'
element, I want to select the first 'given' element 35/(35+25+10)
of the time, i.e. 50%.
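A minimal sketch of that arithmetic, assuming Python purely for
illustration (the helper name cumulative_bounds is made up and is not
part of the stylesheet). It turns a list of weights into half-open
ranges [lower, upper) and prints each child's selection probability:

```python
from itertools import accumulate


def cumulative_bounds(weights):
    """Return one (lower, upper) half-open range per weight, in order."""
    uppers = list(accumulate(weights))   # running totals: 80, 90, 130
    lowers = [0] + uppers[:-1]           # each range starts where the last ended
    return list(zip(lowers, uppers))


# Weights of the three 'person' elements in the example input.
person_weights = [80, 10, 40]
total = sum(person_weights)

for weight, (lower, upper) in zip(person_weights, cumulative_bounds(person_weights)):
    print(f"weight={weight} lower={lower} upper={upper} "
          f"probability={weight / total:.3f}")
```

The width of each range equals the child's weight, so an integer drawn
uniformly from 0 to weightSum-1 falls into a child's range with
probability weight/weightSum.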
Notes:
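- The solution seems to work on nested weightSum-weight
combinations. Only parents that carry a 'weightSum' attribute are
processed; the value given in the input file is a placeholder
that gets overwritten.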
- For reasons I don't understand, simply applying the
transformation to the XML input file results in extra blank
lines. I use awk in a shell script to get rid of the blank
lines.
- Referring to the 'XML output file', a program would randomly
select (say) a 'person' by
1- reading the value of the 'weightSum' attribute for the
parent element 'people'
2- randomly drawing an integer between 0 and weightSum-1
3- locating the 'person' element s.t. the random number is
>= the 'lower' attribute value and < the 'upper'
attribute value.
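The three steps above can be sketched as follows, assuming Python and
its standard xml.etree.ElementTree module. The function name
pick_child and the trimmed copy of the output file are made up for
illustration; any language with an XML parser would do.

```python
import random
import xml.etree.ElementTree as ET

# Trimmed copy of the 'XML output file' (illustration only).
OUTPUT_XML = """<people weightSum="130">
  <person weight="80" lower="0" upper="80"><givens><given>Alfred</given></givens></person>
  <person weight="10" lower="80" upper="90"><givens><given>Leslie</given></givens></person>
  <person weight="40" lower="90" upper="130"><givens><given>Maria</given></givens></person>
</people>"""


def pick_child(parent, draw=None):
    """Pick a child of `parent` with probability proportional to its weight.

    `draw` can be passed explicitly to make the choice reproducible;
    otherwise an integer is drawn uniformly from 0 .. weightSum-1.
    """
    # Step 1: read the parent's weightSum attribute.
    weight_sum = int(parent.get("weightSum"))
    # Step 2: draw an integer in 0 .. weightSum-1.
    if draw is None:
        draw = random.randrange(weight_sum)
    # Step 3: find the child with lower <= draw < upper.
    for child in parent:
        lower, upper = child.get("lower"), child.get("upper")
        if lower is None or upper is None:
            continue  # unweighted child, not a candidate
        if int(lower) <= draw < int(upper):
            return child
    raise ValueError(f"no child covers the draw {draw}")


root = ET.fromstring(OUTPUT_XML)
chosen = pick_child(root)
print(chosen.findtext("givens/given"))
```

The same function can be applied to a nested container such as
'givens', provided that element carries its own 'weightSum' attribute.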
%------------------- XML input file ------------------------------
<?xml version="1.0"?>
<people weightSum="100">
<person weight="80">
<givens weightSum="0">
<given weight="35">Alfred</given>
<given weight="25">Fred</given>
<given weight="10">Wilfred</given>
</givens>
<family>Newman</family>
</person>
<person weight="10">
<givens>
<given>Leslie</given>
</givens>
<family>Newman</family>
</person>
<person weight="40">
<givens>
<given>Maria</given>
</givens>
<family>Newman</family>
</person>
</people>
%------------------- XML output file -----------------------------
<?xml version="1.0" encoding="UTF-8"?>
<people weightSum="130">
<person weight="80" lower="0" upper="80">
<givens weightSum="70">
<given weight="35" lower="0" upper="35">Alfred</given>
<given weight="25" lower="35" upper="60">Fred</given>
<given weight="10" lower="60" upper="70">Wilfred</given>
</givens>
<family>Newman</family>
</person>
<person weight="10" lower="80" upper="90">
<givens>
<given>Leslie</given>
</givens>
<family>Newman</family>
</person>
<person weight="40" lower="90" upper="130">
<givens>
<given>Maria</given>
</givens>
<family>Newman</family>
</person>
</people>
%------------------- XSLT file -----------------------------------
<?xml version="1.0"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" indent="yes"/>
<!-- The xsl:choose statement is used to make this default template -->
<!-- copy everything EXCEPT elements with a 'weight' attribute. -->
<xsl:template match="@*|node()">
<xsl:choose>
<xsl:when test="@weight"></xsl:when>
<xsl:otherwise>
<xsl:copy>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:otherwise>
</xsl:choose>
</xsl:template>
<!-- Here we match the 'weightSum' attribute: we recompute its value and
emit each child of the parent element with 'weight', 'lower' and
'upper' attributes -->
<xsl:template match="attribute::weightSum">
<xsl:attribute name="weightSum">
<xsl:value-of select="sum(../child::*/attribute::weight)" />
</xsl:attribute>
<xsl:for-each select="../child::*">
<xsl:variable name="weight" select="attribute::weight" />
<xsl:variable name="from"
select="sum(./preceding-sibling::*/attribute::weight)" />
<xsl:variable name="to"
select="sum(./preceding-sibling::*/attribute::weight)+$weight" />
<xsl:copy>
<xsl:attribute name="weight" >
<xsl:value-of select="$weight" />
</xsl:attribute>
<xsl:attribute name="lower">
<xsl:value-of select="$from" />
</xsl:attribute>
<xsl:attribute name="upper">
<xsl:value-of select="$to" />
</xsl:attribute>
<xsl:apply-templates select="@*|node()"/>
</xsl:copy>
</xsl:for-each>
</xsl:template>
</xsl:stylesheet>
%------------- Script to remove extra blank lines ----------------
#!/bin/bash
# Run NormalizeWeights.xsl with Saxon and strip blank lines from the result.
argc="$#"
if [ "$argc" -lt 1 ] || [ "$argc" -gt 2 ]; then
    printf "\n\n"
    printf " Usage: NormalizeWeights.sh data.xml [output_file]"
    printf "\n\n"
    exit 1
fi
inputXmlFname=$1
if [ "$argc" -eq 1 ]; then
    # No output file given: write the cleaned result to stdout.
    # awk 'NF' prints only lines containing a non-whitespace character.
    /usr/bin/java -jar "$HOME/sbox/software/lib/saxon8.7/saxon8.jar" -t \
        "$inputXmlFname" NormalizeWeights.xsl | /usr/bin/awk 'NF'
elif [ "$argc" -eq 2 ]; then
    outputXmlFname=$2
    if [ -f "$outputXmlFname" ]; then
        backupName="$outputXmlFname.bac"
        echo "File $outputXmlFname exists, making backup named $backupName"
        /bin/cp "$outputXmlFname" "$backupName"
    fi
    /usr/bin/java -jar "$HOME/sbox/software/lib/saxon8.7/saxon8.jar" -t \
        -o "$outputXmlFname" "$inputXmlFname" NormalizeWeights.xsl
    # Strip blank lines via a temporary file, then move it into place.
    /usr/bin/awk 'NF' "$outputXmlFname" > "tmp$$"
    /bin/mv "tmp$$" "$outputXmlFname"
fi