
Garbage Collection Problems: Performance and Optimization for WebService XmlDocument XPath query


I wrote a webservice to output a report file. The fields of the report
are formatted based on information in an in-memory XmlDocument. As the
rows of a SqlDataReader are looped through, a lookup is done for each
row and the format information is retrieved. A minimal sketch of the
pattern is below.
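
Something like this, schematically (the connection string, query, file
names, and XPath here are illustrative stand-ins, not the actual code):

using System;
using System.Data.SqlClient;
using System.IO;
using System.Xml;

public class ReportSketch
{
    public static void Main()
    {
        // The format rules live in an in-memory XmlDocument.
        XmlDocument formatDoc = new XmlDocument();
        formatDoc.Load("formats.xml");

        SqlConnection conn = new SqlConnection(
            "server=(local);database=Reports;integrated security=SSPI");
        conn.Open();
        SqlCommand cmd = new SqlCommand(
            "SELECT FieldName, FieldValue FROM ReportRows", conn);
        SqlDataReader rdr = cmd.ExecuteReader();
        StreamWriter outFile = new StreamWriter("report.txt");

        while (rdr.Read())
        {
            // One XPath lookup per row against the format document --
            // this is the suspected hot spot.
            XmlNode fmt = formatDoc.SelectSingleNode(
                "//field[@name='" + rdr.GetString(0) + "']");
            outFile.WriteLine(rdr.GetString(0) + ": " + fmt.InnerText);
        }

        outFile.Close();
        rdr.Close();
        conn.Close();
    }
}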

The performance was extremely poor -- producing about 1000 rows per minute.

However, when I used tracing/logging, my results were inconclusive.
First of all, based on the size of the data and the size of the
XmlDocument, I would have expected the whole process per record to
take < 1 ms.

I put a statement to record the time, to the millisecond, before each
call to the XmlDocument, and in the routine, before and after each XPath
query. Then I put a statement after each line was written to the text
stream.

What was odd was that I could see milliseconds being chewed up in the
code, contributing to the poor performance, but *where* the time was
chewed up was random! Sometimes the XmlDocument lookup took 0 ms,
sometimes 20-30 ms. Sometimes the clock would add milliseconds in the
loop that retrieved the record from the dataset.

Another thing that puzzled me is that as the program ran, performance
*degraded* -- the whole loop and all the individual processes ran slower
and slower!

To me, this indicates severe problems with MS .NET garbage collection
and memory management.
--
http://www.texeme.com/
Nov 22 '05 #1
You shouldn't be using an in-memory XML document and XPath when you care
about performance. XML, by its very nature, is slow. You should load the
information from the XML file into a normal class and use a hash map for
the lookup.
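
A minimal sketch of what I mean (the element and attribute names and
the file handling here are just placeholders):

using System;
using System.Collections;
using System.Xml;

// Parse the XML once up front into a Hashtable so that each per-row
// lookup is a constant-time hash probe instead of an XPath query.
public class FormatCache
{
    private Hashtable lookup = new Hashtable();

    public FormatCache(string xmlPath)
    {
        XmlDocument doc = new XmlDocument();
        doc.Load(xmlPath);
        foreach (XmlNode book in doc.SelectNodes("//book"))
        {
            lookup[book.Attributes["id"].Value] = book.InnerText;
        }
    }

    public string Get(string id)
    {
        return (string) lookup[id];
    }
}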

Jonathan

"John Bailo" <ja*****@earthlink.net> wrote in message
news:2q*************@uni-berlin.de...

[quoted original post snipped]

Nov 22 '05 #2
Jonathan Allen wrote:
You shouldn't be using an in-memory XML document and XPath when you care
about performance. XML, by its very nature, is slow. You should load the
information from the XML file into a normal class and use a hash map for
the lookup.
I used a string array.

But what do you mean by "by its very nature"? That is meaningless. An
XmlDocument object should essentially be a b-tree in code, and hence
fast. And my tracing showed that it would sometimes be fast -- 0 ms --
and sometimes slow -- 20-30 ms. Why would it be random, unless the
.NET memory model is severely flawed?

Why? Also, the performance of the code did not change when I moved it
from a single-proc machine with 0.5 GB of memory and hyperthreading to
a dual-proc machine with hyperthreading and 2 GB of memory.

The performance was /exactly/ the same! How can that be? Does .NET
have inherent limitations in terms of accessing system resources?!

Jonathan

"John Bailo" <ja*****@earthlink.net> wrote in message
news:2q*************@uni-berlin.de...
I wrote a webservice to output a report file. The fields of the report
are formatted based on information in an in-memory XmlDocument. As
each row of a SqlDataReader are looped through, a lookup is done, and
format information retrieved.

The performance was extremely poor -- producing about 1000 rows per


minute.
However, when I used tracing/logging, my results were inconclusive.
First of all, based on the size of the data and the size of the
XmlDocument, I would have expected the whole process per record to be <


1ms.
I put a statement to record the time, to the millesecond, before each
call to the XmlDocument, and in the routine, before and after each XPath
query. Then I put a statement after each line was written to the text
stream.

What was odd, was that I could see milleseconds being chewed up in the
code, that contributed to the poor performance, the time where it was
chewed up was random! Sometimes the XmlDocument was 0 ms, sometimes
20-30s per lookup. Sometimes, the clock would add ms in the loop that
retrieved the record from the dataset.

Another thing that puzzled me is that as the program ran, performance
*degraded* -- the whole loop and all the individual processes ran slower
and slower!

To me, this indicates severe problems with Ms .NET garbage collection
and memory management.
--
http://www.texeme.com/


--
http://www.texeme.com
Nov 22 '05 #3
Jonathan Allen wrote:
You shouldn't be using an in-memory XML document and XPath when you care
about performance. XML, by its very nature, is slow. You should load the
information from the XML file into a normal class and use a hash map for
the lookup.


Here's some sample code that shows exactly what I mean.

I've compiled this code and run it against the attached XML file.

My results, from a run on a P4 workstation, are below.

I have compiled this code both with .NET's compiler and with the mono
compiler for Windows ( www.go-mono.com ). The results are exactly the
same.

What you see is that the same query, executed over and over again,
sometimes takes 0 ms and then randomly 16 ms.

Why would such a thing happen?

16
This is book9999
0
This is book9999
0
This is book9999
0
This is book9999
0
This is book9999
0
This is book9999
0
This is book9999
0
This is book9999
0
This is book9999
16
This is book9999
[remaining iterations snipped -- the pattern continues: 0 for most
lookups, with an isolated 15 or 16 roughly every five to thirty
iterations]

--
http://www.texeme.com/

using System;
using System.Xml;
using System.Xml.XPath;

namespace XMLSamps
{
    public class readwrite
    {
        static void Main(string[] args)
        {
            // Load the XML document named on the command line.
            XmlDocument mydoc = new XmlDocument();
            mydoc.Load(args[0]);

            // Use SelectSingleNode to get the book node where id='9999'
            // and write out the elapsed time and the node's text. The
            // descendant query //book[@id='9999'] walks the document on
            // every call. (Note: Second*1000 + Millisecond wraps at each
            // minute boundary, but that only skews an occasional reading.)
            for (int i = 0; i < 1000; i++)
            {
                int beg = DateTime.Now.Second * 1000 + DateTime.Now.Millisecond;
                XmlNode xmn = mydoc.SelectSingleNode("//book[@id='9999']");
                Console.WriteLine(
                    (DateTime.Now.Second * 1000 + DateTime.Now.Millisecond) - beg);
                Console.WriteLine(xmn.InnerText);
            }
        }

        // Unused helper: returns the directory of the executing assembly.
        static string getPath()
        {
            return System.IO.Path.GetDirectoryName(
                System.Reflection.Assembly.GetExecutingAssembly().GetName().CodeBase);
        }
    }
}
Nov 22 '05 #4
Thanks for the sample. One note: you might want to try using "Ticks"
instead of seconds and milliseconds. (It doesn't change the result; I
just find it helpful.)
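
For example, inside the loop of your sample (a sketch; a tick is 100 ns,
and TimeSpan.TicksPerMillisecond converts back to milliseconds):

// Replaces the Second*1000 + Millisecond arithmetic, and never wraps.
long beg = DateTime.Now.Ticks;
XmlNode xmn = mydoc.SelectSingleNode("//book[@id='9999']");
Console.WriteLine((DateTime.Now.Ticks - beg) / TimeSpan.TicksPerMillisecond);
Console.WriteLine(xmn.InnerText);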

Jonathan
Nov 22 '05 #5
> But what do you mean by "by its very nature"? That is meaningless.

This is the best explanation I've read about why you should avoid using
XML as much as possible.

http://www.joelonsoftware.com/articl...000000319.html

That said, I would like to pass along the lesson I keep forgetting:
"Don't worry about performance until it becomes an issue." If using XML
internally is "fast enough", then don't go off and start building your
own classes. Concentrate on areas where making improvements will
actually be noticeable to the user.

And my tracing showed that it would sometimes be fast -- 0 ms -- and
sometimes slow -- 20-30 ms. Why would it be random, unless the .NET
memory model is severely flawed?

I think it is the granularity of the system clock. Windows only updates
its tick count about every 16 ms -- roughly the interval at which it
checks whether any other programs want to run -- and DateTime.Now only
advances at that resolution, so a fast operation reads as either 0 ms
or about 16 ms.
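
You can see that granularity directly with a small loop (a sketch -- it
just prints the size of each step of DateTime.Now):

using System;

public class ClockStep
{
    public static void Main()
    {
        // Spin until DateTime.Now advances and print the size of each
        // step. On a typical Windows box the steps are about 10-16 ms,
        // which matches the 15s and 16s in the output above.
        long last = DateTime.Now.Ticks;
        for (int steps = 0; steps < 50; )
        {
            long now = DateTime.Now.Ticks;
            if (now != last)
            {
                Console.WriteLine((now - last) / TimeSpan.TicksPerMillisecond);
                last = now;
                steps++;
            }
        }
    }
}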

Jonathan

"John Bailo" <ja*****@earthlink.net> wrote in message
news:mA****************@newsread3.news.pas.earthli nk.net... Jonathan Allen wrote:
You shouldn't be using an in-memory XML document and XPath when you care
about performance. XML, by its very nature, is slow. You should be loading the information from the XML file into a normal class and use a hash-map for the lookup.


I used a string array.

But what do you mean -- "by it's very nature" -- that is meaningless. An
XmlDocument object should be a b-tree -- in code essentially -- and
hence fast. And my tracing showed that it would sometimes be fast --
0ms and sometimes slow - 20-30ms. Why would it be random -- unless the
.Net memory model is severely flawed.

Why? Also, the performance of the code did not change when I moved it
from a single proc with .5G memory with hyperthreading to a dual proc
with hyperthreading and 2G memory.

The performance was /exactly/ the same! How can that be ? Does .NET
have inherent limitations in terms of accessing system resources ?!

Jonathan

"John Bailo" <ja*****@earthlink.net> wrote in message
news:2q*************@uni-berlin.de...
I wrote a webservice to output a report file. The fields of the report
are formatted based on information in an in-memory XmlDocument. As
each row of a SqlDataReader are looped through, a lookup is done, and
format information retrieved.

The performance was extremely poor -- producing about 1000 rows per


minute.
However, when I used tracing/logging, my results were inconclusive.
First of all, based on the size of the data and the size of the
XmlDocument, I would have expected the whole process per record to be <


1ms.
I put a statement to record the time, to the millesecond, before each
call to the XmlDocument, and in the routine, before and after each XPath
query. Then I put a statement after each line was written to the text
stream.

What was odd, was that I could see milleseconds being chewed up in the
code, that contributed to the poor performance, the time where it was
chewed up was random! Sometimes the XmlDocument was 0 ms, sometimes
20-30s per lookup. Sometimes, the clock would add ms in the loop that
retrieved the record from the dataset.

Another thing that puzzled me is that as the program ran, performance
*degraded* -- the whole loop and all the individual processes ran slower
and slower!

To me, this indicates severe problems with Ms .NET garbage collection
and memory management.
--
http://www.texeme.com/


--
http://www.texeme.com

Nov 22 '05 #6
