XmlTextReader, parsing, space as data

I just ran across this.
#1 <DBColumn> 1 </DBColumn>
#2 <DBColumn> </DBColumn>
The data for #1 will be parsed and returned as " 1 ". I get a sequence of
Element/Text/EndElement.
The data for #2 will not be returned. I get a sequence of
Element/Whitespace/EndElement.

Why is the data (which happens to be spaces) between my start and end tags
being misinterpreted?
TIA

Nov 12 '05 #1
Hi Kenneth,

First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to preserve whitespace in your
Xml document. If there is any misunderstanding, please feel free to let me
know.

The XmlTextReader only preserves white space that occurs within an
xml:space="preserve" context, so you need to add that attribute to the
parent node. If you're using an XmlDocument, you can simply set the
PreserveWhitespace property to true before calling the Load or LoadXml method.
You can check the following link for more information.

http://msdn.microsoft.com/library/de...us/cpguide/html/cpconHandlingWhiteSpaceWithXmlTextReader.asp
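
For the XmlDocument route mentioned above, a minimal sketch might look like the following. The class and method names (PreserveWhitespaceDemo, ColumnText) and the sample XML are made up for illustration; only XmlDocument, PreserveWhitespace, LoadXml, SelectSingleNode and InnerText are actual framework members.

```csharp
using System;
using System.Xml;

// Hypothetical helper class, not part of the original thread.
public static class PreserveWhitespaceDemo
{
    // Loads the XML into a DOM and returns the InnerText of the first DBColumn element.
    public static string ColumnText(string xml, bool preserveWhitespace)
    {
        var doc = new XmlDocument();
        // The property only affects loading, so set it before LoadXml.
        doc.PreserveWhitespace = preserveWhitespace;
        doc.LoadXml(xml);
        XmlNode column = doc.SelectSingleNode("//DBColumn");
        return column.InnerText;
    }

    public static void Main()
    {
        // Assumed sample: a column whose only content is one space.
        const string xml = "<DBRow><DBColumn> </DBColumn></DBRow>";
        Console.WriteLine("PreserveWhitespace=true:  \"" + ColumnText(xml, true) + "\"");
        Console.WriteLine("PreserveWhitespace=false: \"" + ColumnText(xml, false) + "\"");
    }
}
```

With PreserveWhitespace left at its default of false, the whitespace-only content is dropped while loading, so the column comes back empty. This only applies when loading into a DOM, not to a bare XmlTextReader loop.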

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #2
Kevin,

I only want the space (" ") within a data element, not between elements.
When my data has a non-space character in it, the space data is retrieved.
If I just have one or more spaces it gets written to XML but not pulled out.

#1 <DBColumn> 1 </DBColumn> <--- spaces retrieved
I get a sequence of Element/Text/EndElement.

#2 <DBColumn> </DBColumn> <--- just spaces, no data retrieved
I get a sequence of Element/Whitespace/EndElement.

Nov 12 '05 #3
Hi Ken,

I'm afraid this is by design for the XmlTextReader. If an element contains
nothing but whitespace, that whitespace is discarded unless the
xml:space="preserve" attribute is set on the element. Please check the MSDN
document I linked in my last post for more information. Thanks!

Kevin Yu

Nov 12 '05 #4
ke*****@nospam.nospam wrote:
Why is the data (which happens to be spaces) between my start and end tags
being misinterpreted?


Because that gets treated as insignificant whitespace.
Set the WhitespaceHandling property of the XmlTextReader to
WhitespaceHandling.All to avoid this behaviour.
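
To see what the reader actually reports under a given WhitespaceHandling setting, a small sketch like this one can list the node types in order. NodeTypeDemo and NodeTypes are hypothetical names, and the code uses current C# syntax rather than what was available when the thread was written. With WhitespaceHandling.All it should reproduce the two sequences from the question: the whitespace-only element is reported, but as a Whitespace node rather than a Text node.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original thread.
public static class NodeTypeDemo
{
    // Returns the node types the reader reports for the given XML, in document order.
    public static List<XmlNodeType> NodeTypes(string xml, WhitespaceHandling handling)
    {
        var types = new List<XmlNodeType>();
        using (var reader = new XmlTextReader(new StringReader(xml)))
        {
            reader.WhitespaceHandling = handling;
            while (reader.Read())
            {
                types.Add(reader.NodeType);
            }
        }
        return types;
    }

    public static void Main()
    {
        // The two samples from the original question.
        string[] samples = { "<DBColumn> 1 </DBColumn>", "<DBColumn> </DBColumn>" };
        foreach (string xml in samples)
        {
            Console.WriteLine(xml + " -> " +
                string.Join("/", NodeTypes(xml, WhitespaceHandling.All)));
        }
    }
}
```

Passing WhitespaceHandling.None instead drops the Whitespace node entirely, which is the case where the data really is not returned at all.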

--
Oleg Tkachenko [XML MVP, MCAD]
http://blog.tkachenko.com
Nov 12 '05 #5
That property was set to WhitespaceHandling.All

Nov 12 '05 #6
hi,

Looking at the referenced document:
<test>•
••••<item>•
••••••••<item xml:space="preserve">º
ºººººººººººº<item/>º
ºººººººº</item>•
••••</item>•
••••<book>º
ºººººººº<b>This<b>º
ºººººººº<i>is</i>º
ºººººººº<b>a test</b>º
ºººº</book>•
</test>•
The white space shown as (•) is insignificant white space. The white
space shown as (º) is significant white space.
Note: The scope of the xml:space attribute changes what would normally
be considered insignificant white space to be significant white space.
Notice that <b>a test</b> is not shown to be affected by the
xml:space="preserve" attribute.

I did try adding this attribute to my XML (two ways), but it had no effect on
the XmlTextReader.
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn xml:space="preserve"> </DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn> </DBColumn>
</DBRow>
Let me try this from a different direction:
- I am not using validating readers, XSL, DOM, etc.
- I want to output text data and read it back in, and the data can contain
spaces or even conceivably be all spaces.
- How do I make sure a space out becomes a space in?
- Note: it seems that having non-space characters causes space characters to
be read in. Is this always true?

Thanks

Nov 12 '05 #7
Hi Kenneth,

I tried using XmlTextReader to parse your Xml document. I set the reader's
WhitespaceHandling property to WhitespaceHandling.All and all the white
spaces are preserved. If that doesn't work for you, could you please post a
code snippet and a part of the Xml document here, so that I can reproduce
it? Thanks!

Kevin Yu

Nov 12 '05 #8
In what manner are they preserved for you?
I am looking to get an XmlNodeType.Text node returned with the element's text
data. Is that what you get, or are you getting whitespace returned?
Here are two data snippet examples I have tried:
<DataBase>
<Index>1</Index>
<Enabled>True</Enabled>
<Series>2</Series>
<DBValues>
<dummy />
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn>1</DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn>1</DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn xml:space="preserve"> </DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn />
</DBRow>
</DBValues>
</DataBase>


<DataBase>
<Index>1</Index>
<Enabled>True</Enabled>
<Series>2</Series>
<DBValues>
<dummy />
<DBRow>
<DBColumn />
<DBColumn>1.1</DBColumn>
<DBColumn>-2</DBColumn>
<DBColumn>3</DBColumn>
<DBColumn>rrr</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>-1.1</DBColumn>
<DBColumn>2.0</DBColumn>
<DBColumn>-3</DBColumn>
<DBColumn>sss</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>10</DBColumn>
<DBColumn> </DBColumn>
<DBColumn>1</DBColumn>
<DBColumn>ttt</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>one</DBColumn>
<DBColumn>two</DBColumn>
<DBColumn>three</DBColumn>
<DBColumn>vvv</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>11</DBColumn>
<DBColumn> </DBColumn>
<DBColumn />
<DBColumn />
</DBRow>
</DBValues>
</DataBase>

//here is a sample code snippet,
//all the storing of data and error checking is gone,
//as well as getting data from a string instead of a file
using System;
using System.Collections;
using System.Xml;

namespace ConsoleApplication1
{
    public class MyClass
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            GetData();
            RL();
        }

        private static void RL()
        {
            Console.ReadLine();
        }

        private static void GetData()
        {
            string xmlFrag = " <DataBase>\n"+
                " <Index>1</Index>\n"+
                " <Enabled>True</Enabled>\n"+
                " <Series>2</Series>\n"+
                " <DBValues>\n"+
                " <dummy />\n"+
                " <DBRow>\n"+
                " <DBColumn />\n"+
                " <DBColumn>1.1</DBColumn>\n"+
                " <DBColumn> </DBColumn>\n"+
                " <DBColumn>3</DBColumn>\n"+
                " <DBColumn>rrr</DBColumn>\n"+
                " </DBRow>\n"+
                " <DBRow>\n"+
                " <DBColumn />\n"+
                " <DBColumn>-1.1</DBColumn>\n"+
                " <DBColumn>2.0</DBColumn>\n"+
                " <DBColumn>-3</DBColumn>\n"+
                " <DBColumn>sss</DBColumn>\n"+
                " </DBRow>\n"+
                " <DBRow>\n"+
                " <DBColumn />\n"+
                " <DBColumn>10</DBColumn>\n"+
                " <DBColumn> </DBColumn>\n"+
                " <DBColumn>1</DBColumn>\n"+
                " <DBColumn>ttt</DBColumn>\n"+
                " </DBRow>\n"+
                " <DBRow>\n"+
                " <DBColumn />\n"+
                " <DBColumn>one</DBColumn>\n"+
                " <DBColumn>two</DBColumn>\n"+
                " <DBColumn>three</DBColumn>\n"+
                " <DBColumn>vvv</DBColumn>\n"+
                " </DBRow>\n"+
                " <DBRow>\n"+
                " <DBColumn />\n"+
                " <DBColumn>11</DBColumn>\n"+
                " <DBColumn> </DBColumn>\n"+
                " <DBColumn />\n"+
                " <DBColumn />\n"+
                " </DBRow>\n"+
                " </DBValues>\n"+
                " </DataBase>\n"
                ;

            //Create the XmlNamespaceManager.
            NameTable nt = new NameTable();
            XmlNamespaceManager nsmgr = new XmlNamespaceManager(nt);
            nsmgr.AddNamespace("bk", "urn:sample");

            //Create the XmlParserContext.
            XmlParserContext context = new XmlParserContext(null, nsmgr, null, XmlSpace.None);

            System.Xml.XmlTextReader xr =
                new System.Xml.XmlTextReader(xmlFrag, XmlNodeType.Element, context);
            xr.WhitespaceHandling = WhitespaceHandling.All;
            try
            {
                while (xr.Read())
                {
                    switch(xr.NodeType)
                    {
                        case XmlNodeType.Element:
                        {
                            switch(xr.Name)
                            {
                                case "DBValues":
                                    _ParseDBValues(xr,1);
                                    break;
                            }
                        }
                        break;
                    }
                }
            }
            catch
            {
            }
        }

        public static void _ParseDBValues(System.Xml.XmlTextReader xr, int _nDBInst)
        {
            try
            {
                while (xr.Read())
                {
                    switch(xr.NodeType)
                    {
                        case XmlNodeType.Element:
                        {
                            switch(xr.Name)
                            {
                                case "DBRow":
                                    Console.WriteLine("ROW");

                                    _ParseDBColObject(xr, _nDBInst);
                                    break;
                            }
                        }
                        break;

                        case XmlNodeType.Text:
                            break;

                        case XmlNodeType.EndElement:
                            if(xr.Name.Equals("DBValues"))
                            {
                                return;
                            }
                            break;
                        default:
                            break;
                    }
                }
            }
            catch
            {
                throw new Exception("Unexpected element in DBValues...");
            }
        }

        public static void _ParseDBColObject(System.Xml.XmlTextReader xr, int _nDBInst)
        {
            // Get data values & range check

            string element = "";
            int _nColIndex = -1;
            // Parse input stream
            try
            {
                while (xr.Read())
                {
                    switch(xr.NodeType)
                    {
                        case XmlNodeType.Element:
                        {
                            element = xr.Name;
                            if(element.Equals("DBColumn"))
                            {
                                _nColIndex++; // track which column we are on
                            }
                        }
                        break;

                        case XmlNodeType.Text:
                        {
                            switch(element)
                            {
                                case "DBColumn":
                                    Console.WriteLine("DBColumn = \"" + xr.Value + "\"");
                                    break;
                            }
                        }
                        break;

                        case XmlNodeType.EndElement:
                            if(xr.Name.Equals("DBRow"))
                            {
                                Console.WriteLine("EndRow");
                                return ;
                            }
                            break;
                        default:
                            break;
                    }
                }
            }
            catch
            {
                throw new Exception("Unexpected element in DBValues...");
            }
            return ;
        }
    }
}

Nov 12 '05 #9
When you say the whitespace is preserved, how exactly do you mean that?
Is the data coming in as XmlNodeType.Text or as whitespace?
// the data was written with code like the following
xw.WriteStartElement("DBValues");
xw.WriteStartElement("dummy");
xw.WriteEndElement();
xw.WriteStartElement("DBRow");
string _strColVal = something;
xw.WriteElementString("DBColumn", _strColVal);
xw.WriteEndElement();
xw.WriteEndElement();
xw.WriteEndElement();
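
A round trip along these lines ("a space out becomes a space in") can be sketched as below. This is only one possible approach, not something proposed in the thread: it reads the column with ReadElementString, which concatenates Text, Whitespace and SignificantWhitespace nodes, and it relies on WhitespaceHandling.All so that the whitespace nodes are reported in the first place. RoundTripDemo, Write and Read are hypothetical names, and the DBRow/DBColumn layout is simplified from the writer code above.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original thread.
public static class RoundTripDemo
{
    // Writes a single DBColumn value inside a DBRow and returns the XML text.
    public static string Write(string value)
    {
        var sw = new StringWriter();
        var xw = new XmlTextWriter(sw);
        xw.WriteStartElement("DBRow");
        xw.WriteElementString("DBColumn", value);
        xw.WriteEndElement();
        xw.Close();
        return sw.ToString();
    }

    // Reads the DBColumn value back, keeping whitespace-only content.
    public static string Read(string xml)
    {
        using (var xr = new XmlTextReader(new StringReader(xml)))
        {
            // Without All, a whitespace-only value would never be reported.
            xr.WhitespaceHandling = WhitespaceHandling.All;
            xr.ReadStartElement("DBRow");
            return xr.ReadElementString("DBColumn");
        }
    }

    public static void Main()
    {
        foreach (string value in new[] { " 1 ", "  ", "" })
        {
            string back = Read(Write(value));
            Console.WriteLine("\"" + value + "\" -> \"" + back + "\"");
        }
    }
}
```

Note that this sketch writes no indentation. If the writer is set to produce indented output, whitespace between elements is reported as well, and the reading code has to tell it apart from column data.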

Nov 12 '05 #10
Thanks for the further response, Kenneth.

We'll investigate the code further and update you soon.

Regards,

Steven Cheng
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)

Nov 12 '05 #11
Hi Kenneth,

The NodeType I got is SignificantWhitespace. Whitespace between two nodes
is insignificant and is reported as Whitespace. Adding xml:space="preserve"
makes sure that the value of the node is preserved, but its NodeType is
SignificantWhitespace, not Text. You can look for SignificantWhitespace nodes
to get the blank node values and ignore the Whitespace nodes. Some of the
SignificantWhitespace nodes you see might belong to the parent node.

This is by design.
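
Put into code, a reader loop that does not rely on XmlNodeType.Text alone might look like this sketch. It is a variation on the advice above rather than a literal implementation of it: instead of requiring xml:space="preserve" in the data, it accepts Text, CDATA, Whitespace and SignificantWhitespace, but only while the reader is inside a DBColumn element, so the whitespace between elements is still ignored. ColumnReader and ReadColumns are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

// Hypothetical helper class, not part of the original thread.
public static class ColumnReader
{
    // Returns the text of every DBColumn element, including whitespace-only values.
    public static List<string> ReadColumns(string xml)
    {
        var values = new List<string>();
        using (var xr = new XmlTextReader(new StringReader(xml)))
        {
            xr.WhitespaceHandling = WhitespaceHandling.All;
            bool inColumn = false;   // true between <DBColumn> and </DBColumn>
            string current = "";
            while (xr.Read())
            {
                switch (xr.NodeType)
                {
                    case XmlNodeType.Element:
                        if (xr.Name == "DBColumn")
                        {
                            if (xr.IsEmptyElement)
                            {
                                values.Add("");   // <DBColumn /> has no end tag
                            }
                            else
                            {
                                inColumn = true;
                                current = "";
                            }
                        }
                        break;

                    // Any textual node inside a column counts as data.
                    case XmlNodeType.Text:
                    case XmlNodeType.CDATA:
                    case XmlNodeType.Whitespace:
                    case XmlNodeType.SignificantWhitespace:
                        if (inColumn)
                        {
                            current += xr.Value;
                        }
                        break;

                    case XmlNodeType.EndElement:
                        if (inColumn && xr.Name == "DBColumn")
                        {
                            values.Add(current);
                            inColumn = false;
                        }
                        break;
                }
            }
        }
        return values;
    }

    public static void Main()
    {
        // Assumed sample, modelled on the DBRow data posted earlier in the thread.
        string xml =
            "<DBRow>\n" +
            " <DBColumn />\n" +
            " <DBColumn>11</DBColumn>\n" +
            " <DBColumn> </DBColumn>\n" +
            " <DBColumn xml:space=\"preserve\"> </DBColumn>\n" +
            "</DBRow>";
        foreach (string value in ReadColumns(xml))
        {
            Console.WriteLine("DBColumn = \"" + value + "\"");
        }
    }
}
```

Tracking whether the reader is inside a column, rather than remembering the last element name as the posted code does, keeps the line breaks after each </DBColumn> from being mistaken for column data.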

Kevin Yu

Nov 12 '05 #12
