A call to XmlNode.GetElementsByTagName returns an XmlNodeList that stays
in sync with the XmlDocument thanks to events fired by the XmlDocument.
Once this list is created, there is no way to remove its event handlers
from the document. Calling GetElementsByTagName a second time for the
same tag name creates a new list and adds more event handlers.
As a result, these handlers accumulate over time and reach a pretty high
number (millions). Every modification made to the DOM fires an event, and
XmlDocument calls all of these handlers. This significantly slows down
all modifications to the DOM.
To me it looks like a bug. Did I overlook something? Any feedback will
be appreciated.
Thank you
Dima
Dima wrote: A call to XmlNode.GetElementsByTagName returns an XmlNodeList that stays in sync with the XmlDocument thanks to events fired by the XmlDocument. Once this list is created, there is no way to remove its event handlers from the document. Calling GetElementsByTagName a second time for the same tag name creates a new list and adds more event handlers.
If you already know that DOM collections returned by
GetElementsByTagName are "live collections" kept in sync with the
document, why do you then call the method with the same tag name again
in your code? Can't you simply store the result of the first call and
use that collection returned in the rest of your code?
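The suggestion above can be sketched roughly as follows. This is a minimal illustration, not code from the thread: the class name LiveListDemo, the method name and the sample XML are made up for the example. It calls GetElementsByTagName once, keeps the returned list, and reads its Count again after the document changes.

```csharp
using System;
using System.Xml;

public static class LiveListDemo
{
    // Hypothetical helper: returns the stored list's Count before and
    // after a matching element is appended to the document.
    public static int[] CountsBeforeAndAfterAppend()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><item/><item/></root>");

        // Call GetElementsByTagName once and keep the reference.
        XmlNodeList items = doc.GetElementsByTagName("item");
        int before = items.Count;

        // Modify the document, then read the same list object again
        // instead of calling GetElementsByTagName a second time.
        doc.DocumentElement.AppendChild(doc.CreateElement("item"));
        int after = items.Count;

        return new int[] { before, after };
    }

    public static void Main()
    {
        int[] counts = CountsBeforeAndAfterAppend();
        Console.WriteLine("before: {0}, after: {1}", counts[0], counts[1]);
    }
}
```

Because the list is live, the second Count should reflect the appended element without another call to GetElementsByTagName.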
Dima wrote: As a result, these handlers accumulate over time and reach a pretty high number (millions).
Have you run tests that show that even for collections gone out of scope
(e.g. local variables created in a method and not returned by the
method) those event handlers are still fired?
As long as your code keeps using a collection an implementation needs to
keep it in sync.
Dima wrote: To me it looks like a bug. Did I overlook something? Any feedback will be appreciated.
It is not clear whether you have a test case where you observe the
performance loss or whether you are just speculating whether there might
be a performance loss due to the need to keep collections in sync.
Do you have code where you experience performance problems?
--
Martin Honnen --- MVP XML http://JavaScript.FAQTs.com/
Dima,
You might want to read Erik Saltwell's article on GetElementsByTagName: http://blogs.msdn.com/eriksalt/archi...ByTagName.aspx
Erik is a dev lead for the system.xml team.
--
Stan Kitsis
Program Manager, XML Technologies
Microsoft Corporation
This posting is provided "AS IS" with no warranties, and confers no rights.
Use of included script samples are subject to the terms specified at http://www.microsoft.com/info/cpyright.htm
Dima wrote: A call to XmlNode.GetElementsByTagName returns an XmlNodeList that stays in sync with the XmlDocument thanks to events fired by the XmlDocument. Once this list is created, there is no way to remove its event handlers from the document. Calling GetElementsByTagName a second time for the same tag name creates a new list and adds more event handlers.
Martin Honnen wrote: If you already know that DOM collections returned by GetElementsByTagName are "live collections" kept in sync with the document, why do you then call the method with the same tag name again in your code? Can't you simply store the result of the first call and use that collection returned in the rest of your code?
Martin, thank you for your response!
I use it because it is faster than using XPath. I fixed the problem by
switching to XPath. I could store the result, but I believe the DOM
implementation is in a much better position to store this result: if
the collection is live and, once created, cannot be easily disposed of,
why does a second call to GetElementsByTagName return a new collection?
Dima wrote: As a result, these handlers accumulate over time and reach a pretty high number (millions).
Martin Honnen wrote: Have you run tests that show that even for collections gone out of scope (e.g. local variables created in a method and not returned by the method) those event handlers are still fired?
All collections I used in my code were local variables. I use C#, so
going out of scope will not free anything. I tried setting the
collection to null and, predictably, it did not unregister the handlers.
XmlNodeList is not IDisposable. What else can I do? I could probably
cast it to XmlElementList (undocumented), get its OnListChanged handler
(undocumented) and unregister it, but so far I am trying to use only
documented features of .NET 1.1.
Martin Honnen wrote: As long as your code keeps using a collection an implementation needs to keep it in sync.
I agree, but I use the collection once and would like to dispose of it,
and I do not see a way to do that.
Dima wrote: To me it looks like a bug. Did I overlook something? Any feedback will be appreciated.
Martin Honnen wrote: It is not clear whether you have a test case where you observe the performance loss or whether you are just speculating whether there might be a performance loss due to the need to keep collections in sync. Do you have code where you experience performance problems?
Yes, I profiled my app with the ANTS profiler, and all AppendChild calls
are slow because of the call to XmlDocument.AfterEvent, which invokes
~1.2 million handlers. Interestingly, performance does not depend much
on the size of the document, but rather on how many times
GetElementsByTagName was called.
I would not spend my time on speculation. I hope MS will fix it.
Again, I partially solved my problem by switching to XPath (which is
slower than GetElementsByTagName, according to the ANTS profiler), but
to me the behaviour of GetElementsByTagName seems simply dangerous. The
only functions that cannot be called twice are ctors and dtors.
GetElementsByTagName does not fit into this category, yet it behaves
that way.
Stan, thank you for the response!
The posting is very interesting; however, I think the performance
problem is not rooted in conformance to the standard. I think the
real problem with GetElementsByTagName is not the live collection and
event handlers themselves, but the inability to remove/dispose of the
collection when it is no longer needed. XmlNodeList is not IDisposable,
so once it is created and its handlers are registered, it's forever (for
the document lifetime). Maybe the GC will clean it up, but by then it
might be too late.
The second issue is that a second call to GetElementsByTagName for the
same name returns a new live collection and registers more event
handlers. In my experience these 2 factors hurt performance the most,
not the live property of the collection alone.
If these issues are addressed, performance will improve and the
standard will not be violated.
Dima wrote: I use it because it is faster than using XPath. I fixed the problem by switching to XPath.
How do you use XPath, simply with SelectNodes instead of
GetElementsByTagName called on a node in an XmlDocument? Doesn't that
give an XmlNodeList too?
Or have you switched to XPathDocument?
Dima wrote: I could store the result, but I believe the DOM implementation is in a much better position to store this result: if the collection is live and, once created, cannot be easily disposed of, why does a second call to GetElementsByTagName return a new collection?
Well DOM with live collections has been around before .NET and is also a
W3C standard, I am not sure it would fit in with other implementations
or the standard if each call to the method on a certain node with the
same argument would return the same cached object.
For instance the W3C DOM Level 2 Core specification
<http://www.w3.org/TR/DOM-Level-2-Core/core.html#i-Document>
says about getElementsByTagName:
Return Value NodeList
A new NodeList object containing all the matched Elements.
so returning the same object is not what that standard suggests.
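A quick way to check how a given .NET version behaves on this point is to compare the two returned references. The sketch below is illustrative only: the class name DistinctListsDemo, the helper ReturnsDistinctLists and the sample XML are hypothetical, and it says nothing about how event handlers are registered internally.

```csharp
using System;
using System.Xml;

public static class DistinctListsDemo
{
    // Hypothetical helper: true if two calls with the same tag name
    // hand back two different XmlNodeList objects.
    public static bool ReturnsDistinctLists(XmlDocument doc, string tagName)
    {
        XmlNodeList first = doc.GetElementsByTagName(tagName);
        XmlNodeList second = doc.GetElementsByTagName(tagName);
        return !object.ReferenceEquals(first, second);
    }

    public static void Main()
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml("<root><item/><item/></root>");
        Console.WriteLine(ReturnsDistinctLists(doc, "item"));
    }
}
```

If the implementation follows the "new NodeList object" wording quoted above, the helper should return true.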
Dima wrote: All collections I used in my code were local variables. I use C#, so going out of scope will not free anything. I tried setting the collection to null and, predictably, it did not unregister the handlers. XmlNodeList is not IDisposable. What else can I do? I could probably cast it to XmlElementList (undocumented), get its OnListChanged handler (undocumented) and unregister it, but so far I am trying to use only documented features of .NET 1.1.
Dima wrote: Yes, I profiled my app with the ANTS profiler, and all AppendChild calls are slow because of the call to XmlDocument.AfterEvent, which invokes ~1.2 million handlers. Interestingly, performance does not depend much on the size of the document, but rather on how many times GetElementsByTagName was called.
Good that we know details about what you have tested. I will try to look
into this tomorrow.
--
Martin Honnen --- MVP XML http://JavaScript.FAQTs.com/
Dima wrote: I use it because it is faster than using XPath. I fixed the problem by switching to XPath.
Martin Honnen wrote: How do you use XPath, simply with SelectNodes instead of GetElementsByTagName called on a node in an XmlDocument? Doesn't that give an XmlNodeList too? Or have you switched to XPathDocument?
In my tests I used SelectNodes, which returns an XmlNodeList, but it
does not register event handlers, so I guess this list is not a live
collection, whatever the standard says it should be. In the real app I
use an XPathNavigator created on an XmlNode (why is another story; I now
suspect the root cause is the same), so this approach does not use
XmlNodeList at all.
Dima wrote: I could store the result, but I believe the DOM implementation is in a much better position to store this result: if the collection is live and, once created, cannot be easily disposed of, why does a second call to GetElementsByTagName return a new collection?
Martin Honnen wrote: Well DOM with live collections has been around before .NET and is also a W3C standard, I am not sure it would fit in with other implementations or the standard if each call to the method on a certain node with the same argument would return the same cached object.
For instance the W3C DOM Level 2 Core specification
<http://www.w3.org/TR/DOM-Level-2-Core/core.html#i-Document>
says about getElementsByTagName:
Return Value NodeList
A new NodeList object containing all the matched Elements.
so returning the same object is not what that standard suggests.
Well, if there are 2 collections and they are live, and thus contain
exactly the same objects, what makes them different? The only new
aspect of the second collection is newly wasted memory. But I will not
go deep into the standards.
Not all implementations conform to the standard (see Stan Kitsis's post
in this thread), for example Sun Java, and probably for a good reason!
Dima wrote: All collections I used in my code were local variables. I use C#, so going out of scope will not free anything. I tried setting the collection to null and, predictably, it did not unregister the handlers. XmlNodeList is not IDisposable. What else can I do? I could probably cast it to XmlElementList (undocumented), get its OnListChanged handler (undocumented) and unregister it, but so far I am trying to use only documented features of .NET 1.1.
Dima wrote: Yes, I profiled my app with the ANTS profiler, and all AppendChild calls are slow because of the call to XmlDocument.AfterEvent, which invokes ~1.2 million handlers. Interestingly, performance does not depend much on the size of the document, but rather on how many times GetElementsByTagName was called.
Martin Honnen wrote: Good that we know details about what you have tested. I will try to look into this tomorrow.
Dima wrote: Not all implementations conform to the standard (see Stan Kitsis post in this thread), for example Sun Java,
A bit off topic, but as far as I know and have tested, the DOM
implementations in Sun's Java 1.4 (org.apache.crimson.tree.XmlDocument)
and in Sun's Java 1.5 (com.sun.org.apache.xerces.internal.dom.DocumentImpl)
both give live NodeLists on getElementsByTagName calls.
--
Martin Honnen --- MVP XML http://JavaScript.FAQTs.com/