Extending XmlDocument and associated classes to provide character positions.

Colin Green

OK, here is what I wish to do. I have an XML file that I want to read
into an XmlDocument. I then want to be able to interrogate the
XmlNodes to find both their start AND end character positions within
the original file.

So e.g.

<tagA><tagB>sometext</tagB></tagA>
^     ^     ^      ^      ^      ^
0     6     12     19     26     33

tagA: start=0, end=33
tagB: start=6, end=26
sometext: start=12, end=19

I have seen the LineInfo example within the .NET docs, see:

"Extending the DOM"
ms-help://MS.VSCC/MS.MSDNVS/cpguide/html/cpconextendingdom.htm

and

www.gotdotnet.com/userfiles/XMLDom/extendDOM.zip

This goes some way to doing what I want, but it only stores the start
position of each XML node, not the end. Also, this information is in
line/column number format (via System.Xml.IXmlLineInfo). I could work
out the character index from the line/column, but preferably I would
like to store the positions as the XML is being read.
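
The line/column conversion mentioned above is plain arithmetic once the offset of each line start is known. A minimal sketch, written in Python only to illustrate the arithmetic rather than as .NET code: the names `line_starts` and `to_char_index` are made up for this example, and it assumes 1-based line and column values and lines ending in "\n" or "\r\n".

```python
def line_starts(text):
    """Return the character offset at which each line of `text` begins."""
    starts = [0]
    for i, ch in enumerate(text):
        if ch == "\n":
            # The next line begins right after the newline character.
            starts.append(i + 1)
    return starts


def to_char_index(starts, line, column):
    """Convert a 1-based (line, column) pair to a 0-based character offset."""
    return starts[line - 1] + (column - 1)


text = "<tagA>\n  <tagB>sometext</tagB>\n</tagA>"
starts = line_starts(text)
index = to_char_index(starts, 2, 3)  # line 2, column 3
print(index, text[index:index + 6])
```

Whether the reported column points at the "<" or at the first character of the element name depends on the reader, so the result may need adjusting by one.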

My first thought was to extend System.IO.StringReader (StreamReaderEx)
to keep track of its current position by overriding the two Read()
methods. I can then extend XmlReader to somehow provide me with the
character position, perhaps by keeping a reference to the
StreamReaderEx. This is a bit messy but should work (I think!). It
also limits me to loading an XmlDocument via a StreamReaderEx.

The remaining problem I have is that I can store the start character
position in the overridden CreateElement()/CreateAttribute() methods,
but where should I plug into the XmlDocument to store the end
positions?

Perhaps I am going about this the wrong way? Surely this position info
is already there somewhere, and I just need to extend the node classes
to store it?

As background, I have recently been using a JavaCC/JJTree generated
(JavaScript) parser. The parse tree generated gives me a tree of nodes,
and each node has a reference to its first and last tokens (the ones
that make up that node). Each token knows its start and end position
within the original input stream (because I extended the code to store
this info when the token was created). Using this approach gives me all
the info I want. I want to avoid using JavaCC for my XML as it is a
non-standard way of handling XML. Future maintainers of the code will
wonder what the heck I was doing!

Thanks for reading this far,

Colin

Nov 11 '05 #1

Oleg Tkachenko

Colin Green wrote:

OK, here is what I wish to do. I have an XML file that I want to read
into an XmlDocument. I then want to be able to interrogate the
XmlNodes to find both their start AND end character positions within
the original file.

Maybe it's easier to calculate the end position from
start position + length
or
next-node-start-position - 1
?
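
Applied to the example from the question, both formulas can be checked with a short sketch. It is written in Python only to show the arithmetic, and the function names are made up for this illustration. Since the end positions in the question are inclusive, the length-based form needs a "- 1" as well.

```python
def end_from_length(start, length):
    """Inclusive end offset of a node that starts at `start` and spans `length` characters."""
    return start + length - 1


def end_from_next_start(next_start):
    """Inclusive end offset of a node, given where the following markup begins."""
    return next_start - 1


source = "<tagA><tagB>sometext</tagB></tagA>"
tag_b = "<tagB>sometext</tagB>"

# tagB starts at 6 and is 21 characters long.
print(end_from_length(source.index(tag_b), len(tag_b)))
# The text node ends just before "</tagB>" begins.
print(end_from_next_start(source.index("</tagB>")))
```

The length-based form only holds if the length is measured on the original text. A re-serialized node can differ from the source in whitespace inside tags, quoting of attributes or entity references.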
--
Oleg Tkachenko
http://www.tkachenko.com/blog
Multiconn Technologies, Israel

Nov 11 '05 #2
