I wrote a webservice to output a report file. The fields of the report
are formatted based on information in an in-memory XmlDocument. As
each row of a SqlDataReader is looped through, a lookup is done, and
the format information is retrieved.
The performance was extremely poor -- producing about 1000 rows per minute.
However, when I used tracing/logging, my results were inconclusive.
First of all, based on the size of the data and the size of the
XmlDocument, I would have expected the whole process per record to be < 1ms.
I put a statement to record the time, to the millisecond, before each
call to the XmlDocument, and in the routine, before and after each XPath
query. Then I put a statement after each line was written to the text
stream.
What was odd was that I could see milliseconds being chewed up in the
code, contributing to the poor performance, but where the time was
chewed up was random! Sometimes the XmlDocument lookup took 0 ms,
sometimes 20-30 ms. Sometimes, the clock would add milliseconds in the
loop that retrieved the record from the dataset.
Another thing that puzzled me is that as the program ran, performance
*degraded* -- the whole loop and all the individual processes ran slower
and slower!
To me, this indicates severe problems with Microsoft .NET garbage
collection and memory management.
-- http://www.texeme.com/
You shouldn't be using an in-memory XML document and XPath when you care
about performance. XML, by its very nature, is slow. You should load the
information from the XML file into a normal class and use a hash map for
the lookup.
Jonathan
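To make that concrete, here is a minimal sketch (the `FormatCache` name and the `<book id='...'>` shape are assumptions, borrowed from the sample file posted later in this thread): walk the document once into a Dictionary, then do constant-time lookups per row.

```csharp
using System;
using System.Collections.Generic;
using System.Xml;

// Sketch only: assumes a hypothetical XML file of <book id="...">text</book>
// elements, like the sample attached later in this thread. The document is
// walked once up front; every per-row lookup afterwards is a hash lookup
// instead of an XPath scan.
public class FormatCache
{
    public static Dictionary<string, string> Build(XmlDocument doc)
    {
        Dictionary<string, string> map = new Dictionary<string, string>();
        foreach (XmlNode node in doc.SelectNodes("//book"))
        {
            XmlAttribute id = node.Attributes["id"];
            if (id != null)
                map[id.Value] = node.InnerText;
        }
        return map;
    }
}
```

Per-row lookups then become `map["9999"]` instead of `SelectSingleNode(...)`, so the XPath engine is not re-invoked for every record.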
"John Bailo" <ja*****@earthlink.net> wrote in message
news:2q*************@uni-berlin.de... <snip>
Jonathan Allen wrote:
> You shouldn't be using an in-memory XML document and XPath when you
> care about performance. XML, by its very nature, is slow. You should
> be loading the information from the XML file into a normal class and
> use a hash-map for the lookup.
I used a string array.
But what do you mean -- "by its very nature"? That is meaningless. An
XmlDocument object should be a b-tree -- in code, essentially -- and
hence fast. And my tracing showed that it would sometimes be fast --
0 ms -- and sometimes slow -- 20-30 ms. Why would it be random -- unless
the .NET memory model is severely flawed?
Why? Also, the performance of the code did not change when I moved it
from a single proc with 0.5 GB of memory with hyperthreading to a dual
proc with hyperthreading and 2 GB of memory.
The performance was /exactly/ the same! How can that be? Does .NET
have inherent limitations in terms of accessing system resources?!
-- http://www.texeme.com
Jonathan Allen wrote: <snip>
Here's some sample code that shows exactly what I mean.
I've compiled this code and run it against the attached xml file.
My results, from running on a P4 workstation, are as below.
I have compiled this code both using .NET's compiler and the mono
compiler for Windows ( www.go-mono.com ). The results are exactly the same.
What you see is that the same query, executed over and over again,
sometimes takes 0 ms and then randomly 16 ms.
Why would such a thing happen?
16
This is book9999
0
This is book9999
0
This is book9999
0
This is book9999
[output trimmed: the remaining lookups all report 0 ms, apart from occasional spikes of 15-16 ms]
-- http://www.texeme.com/
using System;
using System.Xml;
using System.Xml.XPath;

namespace XMLSamps
{
    public class readwrite
    {
        static void Main(string[] args)
        {
            // Load the XML document named on the command line
            XmlDocument mydoc = new XmlDocument();
            mydoc.Load(args[0]);

            int beg = 0;

            // Use SelectSingleNode to get the book node where id='9999'
            // and write out the elapsed time and the node's text
            for (int i = 0; i < 1000; i++)
            {
                beg = DateTime.Now.Second * 1000 + DateTime.Now.Millisecond;
                XmlNode xmn = mydoc.SelectSingleNode("//book[@id='9999']");
                Console.WriteLine((DateTime.Now.Second * 1000 + DateTime.Now.Millisecond) - beg);
                Console.WriteLine(xmn.InnerText);
            }
        }

        static string getPath()
        {
            string path;
            path = System.IO.Path.GetDirectoryName(
                System.Reflection.Assembly.GetExecutingAssembly().GetName().CodeBase);
            return path;
        }
    }
}
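One variation worth trying against the same file (a sketch, not the original code): compile the XPath expression once with `XPathExpression.Compile` and evaluate it through an `XPathNavigator`, so the expression text is not re-parsed on every iteration. The document scan itself is still linear, so this only removes the per-pass compilation cost.

```csharp
using System;
using System.Xml;
using System.Xml.XPath;

public class CompiledLookup
{
    // Sketch: evaluate a pre-compiled XPath expression through an
    // XPathNavigator. Compiling once (outside the timing loop) avoids
    // re-parsing the expression text on each of the 1000 passes.
    public static string FindBook(XmlDocument doc, XPathExpression expr)
    {
        XPathNavigator nav = doc.CreateNavigator();
        XPathNodeIterator it = nav.Select(expr);
        return it.MoveNext() ? it.Current.Value : null;
    }
}
```

In the loop above, the expression would be built once before the `for` statement: `XPathExpression expr = XPathExpression.Compile("//book[@id='9999']");`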
Thanks for the sample. One note: you might want to try using "Ticks" instead
of seconds and milliseconds. (It doesn't change the result; I just find it
helpful.)
Jonathan
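For what it's worth, a hedged sketch of timing with ticks via the `Stopwatch` class (a .NET 2.0 addition, so an assumption relative to the code in this thread). It also sidesteps the minute-boundary wraparound in `Second * 1000 + Millisecond`:

```csharp
using System;
using System.Diagnostics;
using System.Xml;

public class TickTiming
{
    // Sketch: time one lookup with Stopwatch, which uses the
    // high-resolution performance counter when one exists, rather than
    // the coarse system clock behind DateTime.Now.
    public static long TimeLookupTicks(XmlDocument doc, string xpath)
    {
        Stopwatch sw = Stopwatch.StartNew();
        doc.SelectSingleNode(xpath);
        sw.Stop();
        return sw.ElapsedTicks;
    }
}
```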
> But what do you mean -- "by it's very nature" -- that is meaningless.

This is the best explanation I've read about why you should avoid using XML
as much as possible. http://www.joelonsoftware.com/articl...000000319.html
That said, I would like to tell you the lesson I keep forgetting: "Don't
worry about performance until it becomes an issue". If using XML internally
is "fast enough", then don't go off and start building your own classes.
Concentrate on areas where making improvements will actually be noticeable
to the user.

> And my tracing showed that it would sometimes be fast -- 0ms and
> sometimes slow - 20-30ms. Why would it be random -- unless the .Net
> memory model is severely flawed.
I think it is because you are running multiple applications. That 16 ms
could be the amount of time it takes Windows to check to see if any other
programs want to run.
Jonathan
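That scheduling guess lines up with another detail: `DateTime.Now` itself advances in coarse steps (historically about 15.6 ms on Windows, the system clock tick), so any operation shorter than one tick measures as 0 ms and one that happens to straddle a tick boundary measures as ~15-16 ms -- exactly the 0/16 pattern in the output above. A small sketch (hypothetical helper, not from the thread) that surfaces the step size:

```csharp
using System;

public class ClockGranularity
{
    // Spin until DateTime.Now changes and report the observed step.
    // On the Windows versions of that era the clock advanced in ~15.6 ms
    // ticks; on other platforms the step can be much finer, so treat the
    // exact value as illustrative rather than guaranteed.
    public static double ObserveStepMs()
    {
        DateTime start = DateTime.Now;
        DateTime next = start;
        while (next == start)
            next = DateTime.Now;
        return (next - start).TotalMilliseconds;
    }
}
```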
"John Bailo" <ja*****@earthlink.net> wrote in message
news:mA****************@newsread3.news.pas.earthlink.net... <snip>
-- http://www.texeme.com