.NET XSLT Transform - Optimization

das
Hello all,
I am using .NET XSLT to transform an XML into another XML file. All
this is fine with small files, but when tested with big files (30MB) it
is taking between 1hr-2hrs to just transform the file.

Here is the code snippet:

XPathDocument xdoc = new XPathDocument(fs);
XPathNavigator nav = xdoc.CreateNavigator();
xslt.Transform(nav, null, strWriter, null);

I read in this forum (Oleg T) that using keys will speed up the
transformation. The other suggestion was to use MSXML, which is much
faster (but which I cannot use). Below are the input XML, the XSLT and a
sample of the output XML. I truncated part of the input and output XML
for clarity. Can someone please give me suggestions on how to use keys
in the XSLT, or any other method of optimizing this transformation? Any
ideas will be really helpful.

Input XML:
---------------------------------------------------------------------------------------------------------
<?xml version="1.0" encoding="utf-8" ?>
<POC xmlns="www.poctest.com">

<Cars>
<West>
<row>
<Dept>6000</Dept>
<DeptDir>A</DeptDir>
<Adny>apart</Adny>
<Closd>10/20/2003</Closd>
<Open>2/15/1996</Open>
<CarType>0</CarType>
<Desc1>Sum 1 blew it</Desc1>
<FileNumber>12352</FileNumber>
<DevAmt>0.00</DevAmt>
<DevDate />
<StageAddr1>W. 117TH ST.</StageAddr1>
<StageCity>ALL</StageCity>
<StageCnty>County</StageCnty>
<StageCode>RRS</StageCode>
<StageSt>OH</StageSt>
<SOrderNo>656565</SOrderNo>
<SPolDate>10/4/1994</SPolDate>
<Staff>WXW</Staff>
<StBrdCompl />
<StBrdNumbr />
<BootComp>N</BootComp>
<BootIns>Y</BootIns>
</row>

<row>
<Dept>6001</Dept>
<DeptDir>A</DeptDir>
<Adny>apart</Adny>
<Closd>10/20/2003</Closd>
<Open>2/15/1996</Open>
<CarType>0</CarType>
<Desc1>Sum 1 blew it</Desc1>
<FileNumber>12352</FileNumber>
<DevAmt>0.00</DevAmt>
<DevDate />
<StageAddr1>W. 117TH ST.</StageAddr1>
<StageCity>ALL</StageCity>
<StageCnty>County</StageCnty>
<StageCode>RRS</StageCode>
<StageSt>OH</StageSt>
<SOrderNo>656565</SOrderNo>
<SPolDate>10/4/1994</SPolDate>
<Staff>WXW</Staff>
<StBrdCompl />
<StBrdNumbr />
<BootComp>N</BootComp>
<BootIns>Y</BootIns>
</row>

</West>

<South>
<row>
<Dept>7000</Dept>
<DeptDir>B</DeptDir>
<Adny>apart</Adny>
<Closd>10/20/2003</Closd>
<Open>2/15/1996</Open>
<CarType>0</CarType>
<Desc1>Sum 1 blew it</Desc1>
<FileNumber>12352</FileNumber>
<DevAmt>0.00</DevAmt>
<DevDate />
<StageAddr1>117TH ST.</StageAddr1>
<StageCity>ALL</StageCity>
<StageCnty>County</StageCnty>
<StageCode>RRS</StageCode>
<StageSt>OH</StageSt>
<SOrderNo>656ss565</SOrderNo>
<SPolDate>10/4/1994</SPolDate>
<Staff>WXW</Staff>
<StBrdCompl />
<StBrdNumbr />
<BootComp>N</BootComp>
<BootIns>Y</BootIns>
</row>
</South>
</Cars>

</POC>

---------------------------------------------------------------------------------------------------------
XSLT:

<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:poc="www.poctest.com"
                exclude-result-prefixes="poc"
                version="1.0">
  <xsl:output encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="poc:Cars">
    <Fields>
      <xsl:apply-templates/>
    </Fields>
  </xsl:template>

  <xsl:template match="poc:row">
    <Record>
      <LineNumber><xsl:number level="any" /></LineNumber>
      <AreaPath>
        <xsl:for-each select="(ancestor::*)">
          <xsl:value-of select="local-name()"/>
          <xsl:if test="not(position()=last())">\</xsl:if>
        </xsl:for-each>
      </AreaPath>

      <xsl:apply-templates select="node()|@*" />
      <xsl:if test="normalize-space(.//poc:Closd)=''">
        <Status>O</Status>
      </xsl:if>
      <xsl:if test="normalize-space(.//poc:Closd)!=''">
        <Status>C</Status>
      </xsl:if>
    </Record>
  </xsl:template>

  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <xsl:template match="poc:POC">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="poc:West">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="poc:East">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="poc:South">
    <xsl:apply-templates/>
  </xsl:template>

  <xsl:template match="poc:North">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Replace Element Adny with TermPost -->
  <xsl:template match="poc:Adny">
    <TermPost>
      <xsl:apply-templates select="@* | node()" />
    </TermPost>
  </xsl:template>

  <!-- Replace Element FileNumber with TyreNumber -->
  <xsl:template match="poc:FileNumber">
    <TyreNumber>
      <xsl:apply-templates select="@* | node()" />
    </TyreNumber>
  </xsl:template>

  <!-- Replace Element Staff with Bat -->
  <xsl:template match="poc:Staff">
    <Bat>
      <xsl:apply-templates select="@* | node()" />
    </Bat>
  </xsl:template>

  <!-- Replace Element Open with Date -->
  <xsl:template match="poc:Open">
    <Date>
      <xsl:apply-templates select="@* | node()" />
    </Date>
  </xsl:template>

  <!-- Replace Element Closd with Closed -->
  <xsl:template match="poc:Closd">
    <Closed>
      <xsl:apply-templates select="@* | node()" />
    </Closed>
  </xsl:template>

</xsl:stylesheet>

---------------------------------------------------------------------------------------------------------
Output XML

<?xml version="1.0" encoding="utf-8" ?>
<Fields>
<Record>
<LineNumber>1</LineNumber>
<AreaPath>POC\Cars\West</AreaPath>
<Dept>6000</Dept>
<DeptDir>A</DeptDir>
<TermPost>apart</TermPost>
<Closed>10/20/2003</Closed>
<Date>2/15/1996</Date>
<CarType>0</CarType>
<Desc1>Sum 1 blew it</Desc1>
<TyreNumber>12352</TyreNumber>
<DevAmt>0.00</DevAmt>
<DevDate />
<StageAddr1>W. 117TH ST.</StageAddr1>
<StageCity>ALL</StageCity>
<StageCnty>County</StageCnty>
<StageCode>RRS</StageCode>
<StageSt>OH</StageSt>
<SOrderNo>656565</SOrderNo>
<SPolDate>10/4/1994</SPolDate>
<Bat>WXW</Bat>
<StBrdCompl />
<StBrdNumbr />
<BootComp>N</BootComp>
<BootIns>Y</BootIns>
<Status>C</Status>
</Record>
</Fields>

Sep 14 '06 #1


das wrote:

> I am using .NET XSLT to transform an XML into another XML file. All
> this is fine with small files, but when tested with big files (30MB) it
> is taking between 1hr-2hrs to just transform the file.
>
> Here is the code snippet:
>
> XPathDocument xdoc = new XPathDocument(fs);
> XPathNavigator nav = xdoc.CreateNavigator();
> xslt.Transform(nav, null, strWriter, null);

What is xslt, a .NET 2.0 XslCompiledTransform or a .NET 1.x
XslTransform? If you use .NET 1.x, can you change to .NET 2.0 and
XslCompiledTransform?


--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Sep 14 '06 #2
"das" <Ad*******@gmail.com> wrote in message
news:11**********************@d34g2000cwd.googlegroups.com...
> Hello all,
> I am using .NET XSLT to transform an XML into another XML file. All
> this is fine with small files, but when tested with big files (30MB) it
> is taking between 1hr-2hrs to just transform the file.
>
> Here is the code snippet:
>
> XPathDocument xdoc = new XPathDocument(fs);
> XPathNavigator nav = xdoc.CreateNavigator();
> xslt.Transform(nav, null, strWriter, null);
>
> I read in this forum (Oleg T) that using keys will speed up the
> transformation. Apart from this the other suggestion was to use MSXML
> which is much faster (Which I cannot use). Below are the Input XML,
> XSLT and a sample of the Output XML. I truncated part of Input and
> Output XML for clarity purposes. Can someone please give me suggestions
> on how to use Keys in the XSLT or any other method of Optimizing this
> transformation? Any ideas will be really helpful.

Assuming that you can't use one of the several tools which support profiling
XSL transforms, you should do this the "old-fashioned way". Optimize it the
same way as you would a program. In particular, you'll first want to find
out where most of the time is being spent, then optimize that section.

One very quick way is to try removing chunks of code, run the transform, and
see how much it is sped up. I jokingly call this technique "binary destroy",
to contrast it with "binary search".

1) Remove half the code
2) If the slow part is in the remaining code, then go back to step 1
3) If the slow part was in the code you removed, then put it back, then
4) Remove half of the code you just put back and go to step 2

Repeat until you find the slowest part.
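
Applied to the posted stylesheet, a first cut might look like the sketch below. It is only an illustration under assumptions: which half to disable first (here the LineNumber and AreaPath parts of the poc:row template) is a guess, and the renaming templates and the Status test are left out for brevity and would stay as posted. The output is incomplete while a part is disabled; the point is only to compare run times.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:poc="www.poctest.com"
                exclude-result-prefixes="poc"
                version="1.0">
  <xsl:output encoding="UTF-8"/>
  <xsl:strip-space elements="*"/>

  <xsl:template match="poc:Cars">
    <Fields>
      <xsl:apply-templates/>
    </Fields>
  </xsl:template>

  <!-- Wrapper elements produce no output of their own, as in the posted stylesheet. -->
  <xsl:template match="poc:POC | poc:West | poc:East | poc:South | poc:North">
    <xsl:apply-templates/>
  </xsl:template>

  <!-- Trial run: numbering and the ancestor walk are commented out so that
       their cost shows up as a difference in run time. -->
  <xsl:template match="poc:row">
    <Record>
      <!--
      <LineNumber><xsl:number level="any" /></LineNumber>
      <AreaPath>
        <xsl:for-each select="ancestor::*">
          <xsl:value-of select="local-name()"/>
          <xsl:if test="not(position()=last())">\</xsl:if>
        </xsl:for-each>
      </AreaPath>
      -->
      <xsl:apply-templates select="node()|@*" />
    </Record>
  </xsl:template>

  <!-- Every other element is re-created under its own name, without the namespace. -->
  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>
</xsl:stylesheet>
```

If this version runs dramatically faster on the 30MB file, the slow part is in the commented-out block, and the next round would re-enable only one of the two disabled elements.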

John
Sep 14 '06 #3
das
Thanks for the reply,
I will try your approach; meanwhile, I am still looking for the
"keys" method in XSLT.
And yes, I cannot use .NET 2.0 for XslCompiledTransform.

Sep 14 '06 #4
das
I am also trying my hand at MSXML 4.0. Can someone tell me if this is a
good way to speed up the transformation?

MSXML2.DOMDocument40Class sourceMSXml = new MSXML2.DOMDocument40Class();
sourceMSXml.load("Feed.xml");

MSXML2.DOMDocument40Class sourceMSXsl = new MSXML2.DOMDocument40Class();
sourceMSXsl.async = false;
sourceMSXsl.load(@"C:\Temp\MYXSLT.xslt");
MSXML2.DOMDocument40Class transformMSXml = new MSXML2.DOMDocument40Class();
sourceMSXml.transformNodeToObject(sourceMSXsl, transformMSXml);
transformMSXml.save(@"C:\Temp\Output.xml");

This line, sourceMSXml.transformNodeToObject, is where I am getting
stalled. Is the transformation taking place in memory? (High CPU and
memory usage.)

Thanks for any insights. If possible, can anyone guide me in using keys
in XSL?
Sep 15 '06 #5


das wrote:

> XPathDocument xdoc = new XPathDocument(fs);
> XPathNavigator nav = xdoc.CreateNavigator();
> xslt.Transform(nav, null, strWriter, null);

Are you building the complete result in memory with a strWriter
(StringWriter)? Have you tried whether writing directly to a stream is
faster?

> I read in this forum (Oleg T) that using keys will speed up the
> transformation.

Has Oleg said that in relation to your particular stylesheet?
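
For reference, a key in XSLT 1.0 is declared with a top-level xsl:key element and queried with the key() function. The sketch below is only an illustration: the key name rows-by-dept, the Matches wrapper element, and the lookup value '6000' are made up for the example. The posted stylesheet does no lookups of this kind, so a key may well not help it.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:poc="www.poctest.com"
                exclude-result-prefixes="poc"
                version="1.0">
  <xsl:output encoding="UTF-8" indent="yes"/>

  <!-- Hypothetical key: index every poc:row by the string value of its poc:Dept child. -->
  <xsl:key name="rows-by-dept" match="poc:row" use="poc:Dept"/>

  <xsl:template match="/">
    <Matches>
      <!-- key() returns the rows whose Dept equals the given value. The
           equivalent path //poc:row[poc:Dept='6000'] typically scans the
           whole document each time it is evaluated. -->
      <xsl:for-each select="key('rows-by-dept', '6000')">
        <FileNumber><xsl:value-of select="poc:FileNumber"/></FileNumber>
      </xsl:for-each>
    </Matches>
  </xsl:template>
</xsl:stylesheet>
```

Keys tend to pay off when the same value-based lookup is repeated many times, for instance when grouping or cross-referencing records; a stylesheet that visits each row exactly once has nothing to look up.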

--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Sep 15 '06 #6
das
Sorry if I was not clear; strWriter is a StreamWriter:

StreamWriter strWriter = new StreamWriter(tempFileTemp, false);

And Oleg was not referring to my XSL; it was someone else's:

http://groups.google.com/group/micro...8839e8/?hl=en#

thanks.

Sep 15 '06 #7
"das" <Ad*******@gmail.com> wrote in message
news:11*********************@k70g2000cwa.googlegroups.com...
> I am also trying my hands on MSXML 4.0 Can someone tell me if this is a
> good way to speed up the transformation?

I want to reiterate that you should first determine what the problem is
before you try to solve it. You will otherwise risk solving the wrong
problem.

John
Sep 15 '06 #8


das wrote:

> <xsl:if test="normalize-space(.//poc:Closd)=''">
> <Status>O</Status>
> </xsl:if>
> <xsl:if test="normalize-space(.//poc:Closd)!=''">
> <Status>C</Status>
> </xsl:if>

This could be optimized to e.g.

<Status>
  <xsl:choose>
    <xsl:when test="normalize-space(.//poc:Closd)=''">
      <xsl:text>O</xsl:text>
    </xsl:when>
    <xsl:otherwise>
      <xsl:text>C</xsl:text>
    </xsl:otherwise>
  </xsl:choose>
</Status>

The main optimization is to evaluate
normalize-space(.//poc:Closd)
only once and not twice as your stylesheet does.

For the example XML input shown it should also suffice to use e.g.
<xsl:when test="normalize-space(poc:Closd)=''">
as in that example Closd only occurs as a direct child of the row
elements, so there is no need to search all descendants.

--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Sep 16 '06 #9
das
Thanks Martin, I will try your idea and John's idea to isolate the
issue. I will post my results soon.

regards.

Sep 18 '06 #10


das wrote:

> Can someone please give me suggestions
> on how to use Keys in the XSLT or any other method of Optimizing this
> tranformation?

I think what is expensive in your original stylesheet is xsl:number
level="any" and walking the ancestors.

Here is an attempt to simply process the elements you seem to be
interested in (poc:row), allowing the use of position() instead of
xsl:number, and trying to avoid walking ancestors by passing the names
of elements down as parameters. As your sample input did not have any
attributes on elements but your stylesheet often did xsl:apply-templates
select="@*", I have also removed any use of that:

<xsl:stylesheet
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  xmlns:poc="www.poctest.com"
  exclude-result-prefixes="poc"
  version="1.0">

  <xsl:output encoding="UTF-8" indent="yes"/>

  <xsl:strip-space elements="*"/>

  <xsl:template match="poc:POC">
    <xsl:apply-templates select="poc:Cars">
      <xsl:with-param name="path" select="local-name()"/>
    </xsl:apply-templates>
  </xsl:template>

  <xsl:template match="poc:Cars">
    <xsl:param name="path"/>
    <Fields>
      <xsl:apply-templates select="*/poc:*">
        <xsl:with-param name="path" select="concat($path, '\', local-name())"/>
      </xsl:apply-templates>
    </Fields>
  </xsl:template>

  <xsl:template match="poc:row">
    <xsl:param name="path"/>
    <Record>
      <LineNumber><xsl:value-of select="position()"/></LineNumber>
      <AreaPath><xsl:value-of select="concat($path, '\', local-name(..))"/></AreaPath>

      <xsl:apply-templates select="*" />

      <Status>
        <xsl:choose>
          <xsl:when test="normalize-space(poc:Closd[1])=''">
            <xsl:text>O</xsl:text>
          </xsl:when>
          <xsl:otherwise>
            <xsl:text>C</xsl:text>
          </xsl:otherwise>
        </xsl:choose>
      </Status>
    </Record>
  </xsl:template>

  <xsl:template match="*">
    <xsl:element name="{name()}">
      <xsl:apply-templates/>
    </xsl:element>
  </xsl:template>

  <!-- Replace Element Adny with TermPost -->
  <xsl:template match="poc:Adny">
    <TermPost>
      <xsl:apply-templates/>
    </TermPost>
  </xsl:template>

  <!-- Replace Element FileNumber with TyreNumber -->
  <xsl:template match="poc:FileNumber">
    <TyreNumber>
      <xsl:apply-templates/>
    </TyreNumber>
  </xsl:template>

  <!-- Replace Element Staff with Bat -->
  <xsl:template match="poc:Staff">
    <Bat>
      <xsl:apply-templates/>
    </Bat>
  </xsl:template>

  <!-- Replace Element Open with Date -->
  <xsl:template match="poc:Open">
    <Date>
      <xsl:apply-templates/>
    </Date>
  </xsl:template>

  <!-- Replace Element Closd with Closed -->
  <xsl:template match="poc:Closd">
    <Closed>
      <xsl:apply-templates/>
    </Closed>
  </xsl:template>

</xsl:stylesheet>

Not sure that helps but if you are testing what causes the long
processing then I would at least try to test whether taking out
xsl:number level="any" improves things clearly. If so then try the above
approach to avoid using xsl:number but still be able to fill the
LineNumber output element.
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Sep 18 '06 #11
das
> I think what is expensive in your original stylesheet is xsl:number
> level="any" and walking the ancestors.

Thanks Martin. The reason I was using xsl:number with level="any" is to
generate a sequential number for each of the <Record> elements. For
example, my output XML will look like this, with multiple <Record>
elements in it:

<?xml version="1.0" encoding="utf-8" ?>
<Fields>
<Record>
<LineNumber>1</LineNumber>
<AreaPath>POC\Cars\West</AreaPath>
<Dept>6000</Dept>
<DeptDir>A</DeptDir>
<TermPost>apart</TermPost>
<Closed>10/20/2003</Closed>
<Date>2/15/1996</Date>
<CarType>0</CarType>
<Desc1>Sum 1 blew it</Desc1>
<TyreNumber>12352</TyreNumber>
<DevAmt>0.00</DevAmt>
<DevDate />
<StageAddr1>W. 117TH ST.</StageAddr1>
<StageCity>ALL</StageCity>
<StageCnty>County</StageCnty>
<StageCode>RRS</StageCode>
<StageSt>OH</StageSt>
<SOrderNo>656565</SOrderNo>
<SPolDate>10/4/1994</SPolDate>
<Bat>WXW</Bat>
<StBrdCompl />
<StBrdNumbr />
<BootComp>N</BootComp>
<BootIns>Y</BootIns>
<Status>C</Status>
</Record>
<Record>
<LineNumber>2</LineNumber>
<AreaPath>POC\Cars\West</AreaPath>
<Dept>6001</Dept>
<DeptDir>A</DeptDir>
<TermPost>apart</TermPost>
<Closed>10/20/2003</Closed>
<Date>2/15/1996</Date>
<CarType>0</CarType>
<Desc1>Sum 1 blew it</Desc1>
<TyreNumber>12352</TyreNumber>
<DevAmt>0.00</DevAmt>
<DevDate />
<StageAddr1>W. 117TH ST.</StageAddr1>
<StageCity>ALL</StageCity>
<StageCnty>County</StageCnty>
<StageCode>RRS</StageCode>
<StageSt>OH</StageSt>
<SOrderNo>656565</SOrderNo>
<SPolDate>10/4/1994</SPolDate>
<Bat>WXW</Bat>
<StBrdCompl />
<StBrdNumbr />
<BootComp>N</BootComp>
<BootIns>Y</BootIns>
<Status>C</Status>
</Record>
<Record>
<LineNumber>3</LineNumber>
<AreaPath>POC\Cars\South</AreaPath>
<Dept>7000</Dept>
<DeptDir>B</DeptDir>
<TermPost>apart</TermPost>
<Closed>10/20/2003</Closed>
<Date>2/15/1996</Date>
<CarType>0</CarType>
<Desc1>Sum 1 blew it</Desc1>
<TyreNumber>12352</TyreNumber>
<DevAmt>0.00</DevAmt>
<DevDate />
<StageAddr1>117TH ST.</StageAddr1>
<StageCity>ALL</StageCity>
<StageCnty>County</StageCnty>
<StageCode>RRS</StageCode>
<StageSt>OH</StageSt>
<SOrderNo>656ss565</SOrderNo>
<SPolDate>10/4/1994</SPolDate>
<Bat>WXW</Bat>
<StBrdCompl />
<StBrdNumbr />
<BootComp>N</BootComp>
<BootIns>Y</BootIns>
<Status>C</Status>
</Record>
</Fields>

-----------------------------------------------------------------------------------------------------------

My input XML file has several branches under the second node (<Cars>);
for example, <West> has <row> children and <East> has its own. To get
one universal sequential number I had to use
<LineNumber><xsl:number level="any" /></LineNumber>

<Cars>
<West>
<row>
.....
</row>
<row>
........
</row>
</West>
<East>
<row>
.....
</row>
<row>
........
</row>
</East>
</Cars>

I will try your XSL with a few changes and will let you know. Thanks a
lot for your effort and help.
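
On the worry about one universal sequence number: position() counts within the node list selected by the xsl:apply-templates that invoked the template, not among siblings. So when the rows of all regions are selected in one go, as select="*/poc:*" does in Martin's stylesheet, the numbering should run across West, East and so on without xsl:number. A stripped-down sketch, assuming the input layout shown in this thread (the Region element is made up for illustration):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:poc="www.poctest.com"
                exclude-result-prefixes="poc"
                version="1.0">
  <xsl:output encoding="UTF-8" indent="yes"/>

  <xsl:template match="/">
    <Fields>
      <!-- One node list holding the rows of every region, in document order. -->
      <xsl:apply-templates select="poc:POC/poc:Cars/*/poc:row"/>
    </Fields>
  </xsl:template>

  <xsl:template match="poc:row">
    <Record>
      <!-- position() is the index within that list, so it keeps counting
           when the rows move from one region to the next. -->
      <LineNumber><xsl:value-of select="position()"/></LineNumber>
      <!-- Hypothetical element, only here to show which region a row came from. -->
      <Region><xsl:value-of select="local-name(..)"/></Region>
      <Dept><xsl:value-of select="poc:Dept"/></Dept>
    </Record>
  </xsl:template>
</xsl:stylesheet>
```

For the sample input at the top of the thread, this numbers the two West rows 1 and 2 and the South row 3. The numbering would restart only if each region issued its own xsl:apply-templates for its rows.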

Sep 20 '06 #12
Is this being done in 1.1 or 2.0? 2.0 has greatly increased transform speed
(although in 1.1 I was able to transform 10MB files in under a minute).
I was just glancing at your transform, though, and it does look (on the
surface; again, I was just glancing at it) to be overly complex for what
your output looks like, which could contribute to the time. The best speed
optimizations are high-level, algorithmic ones, whereas moving to 2.0 is a
low-level optimization in addition to being one that you have no control
over.
