Extending XmlDocument and associated classes to provide character positions.

OK, here is what I wish to do. I have an XML file that I want to read
into an XmlDocument. I then want to be able to interrogate the
XmlNodes to find both their start AND end character positions within
the original file.

So e.g.

<tagA><tagB>sometext</tagB></tagA>
^     ^     ^      ^      ^      ^
0     6     12     19     26     33

tagA: start=0, end=33
tagB: start=6, end=26
sometext: start=12, end=19

I have seen the LineInfo example within the .NET docs, see:

"Extending the DOM"
ms-help://MS.VSCC/MS.MSDNVS/cpguide/html/cpconextendingdom.htm

and

www.gotdotnet.com/userfiles/XMLDom/extendDOM.zip

This goes some way to doing what I want, but it only stores the start
position of each XML node, not the end. Also this information is in
line/column number format (via System.Xml.IXmlLineInfo). I could work
out the character index from the line/column, but preferably I would
like to store the positions as the XML is being read.
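
The line/column to character index conversion mentioned above can be sketched as follows. This is a language-neutral illustration in Python rather than .NET code; the helper names line_starts and to_offset are hypothetical, and it assumes 1-based line and column numbers and lines separated by "\n".

```python
def line_starts(text):
    """Return the character index at which each line of text begins."""
    starts = [0]
    for index, char in enumerate(text):
        if char == "\n":
            starts.append(index + 1)
    return starts


def to_offset(starts, line, column):
    """Convert a 1-based (line, column) pair to a 0-based character index."""
    return starts[line - 1] + (column - 1)


# Hypothetical three-line input, used only for illustration.
xml = "<tagA>\n  <tagB>sometext</tagB>\n</tagA>"
starts = line_starts(xml)
offset = to_offset(starts, 2, 3)  # line 2, column 3
```

Here offset indexes the "<" of <tagB>. Whether a particular reader reports the column of the "<" or of the tag name itself is worth checking before relying on this.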

My first thought was to extend System.IO.StringReader (StreamReaderEx)
to keep track of its current position by overriding the two Read()
methods. I can then extend XmlReader to somehow provide me with the
character position, perhaps by keeping a reference to the
StreamReaderEx. This is a bit messy but should work (I think!). It
also limits me to loading an XmlDocument via a StreamReaderEx.
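
A position-tracking reader of this kind can be sketched in a few lines. The following is a language-neutral illustration in Python, not .NET code; CountingReader is a hypothetical name, and the 16-character read only stands in for a parser that fills a buffer before it tokenizes.

```python
import io


class CountingReader:
    """Wraps a text stream and counts the characters handed out so far."""

    def __init__(self, stream):
        self._stream = stream
        self.position = 0

    def read(self, size=-1):
        chunk = self._stream.read(size)
        self.position += len(chunk)  # advance by what was actually returned
        return chunk


reader = CountingReader(io.StringIO("<tagA><tagB>sometext</tagB></tagA>"))
first_chunk = reader.read(16)  # a buffering consumer asks for a whole block
```

After this single read, reader.position is already 16, although the first tag ends at index 5. If the XML parser reads ahead in blocks like this, the reader's position says where the buffer ends, not where the current node is, so it may not be usable as a node position on its own.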

The remaining problem I have is that I can store the start character
position in the overridden CreateElement()/CreateAttribute() methods,
but where should I plug into the XmlDocument to store the end
positions?

Perhaps I am going about this the wrong way? Surely this position info
is already there somewhere, and I just need to extend the node classes
to store it?

As background, I have recently been using a JavaCC/JJTree generated
(JavaScript) parser. The parse tree generated gives me a tree of nodes;
each node then has a reference to its first and last tokens (that
make up that node). Each token knows its start & end position within
the original input stream (because I extended the code to store this
info when the token was created). Using this approach gives me all the
info I want. I want to avoid using JavaCC for my XML as it is a
non-standard way of handling XML. Future maintainers of the code will
wonder what the heck I was doing!

Thanks for reading this far,

Colin
Nov 11 '05 #1
Colin Green wrote:
> OK, here is what I wish to do. I have an XML file that I want to read
> into an XmlDocument. I then want to be able to interrogate the
> XmlNodes to find both their start AND end character positions within
> the original file.

Maybe it's easier to calculate the end position based on
start position + length
or
next-node-start-position - 1
?
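
Both suggestions are simple arithmetic once a start position is known. A minimal sketch in Python (language-neutral, not .NET code), using the inclusive end positions from the original example; the function names are hypothetical, and it assumes the node's text length matches the source exactly, which a re-serialized form such as OuterXml does not guarantee.

```python
def end_from_length(start, source_text):
    """Inclusive end index of a node that begins at start."""
    return start + len(source_text) - 1


def end_from_next(next_start):
    """Inclusive end index of a node whose successor begins at next_start."""
    return next_start - 1


doc = "<tagA><tagB>sometext</tagB></tagA>"

tag_b_start = doc.find("<tagB>")
tag_b_end = end_from_length(tag_b_start, "<tagB>sometext</tagB>")

text_start = doc.find("sometext")
text_end = end_from_next(doc.find("</tagB>"))

tag_a_end = end_from_length(0, doc)
```

With the example document this reproduces the figures from the question: tagB ends at 26, the text at 19, and tagA at 33.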
--
Oleg Tkachenko
http://www.tkachenko.com/blog
Multiconn Technologies, Israel

Nov 11 '05 #2
