XmlTextReader, parsing, space as data

I just ran across this.
#1 <DBColumn> 1 </DBColumn>
#2 <DBColumn> </DBColumn>
The data for #1 will be parsed and returned as " 1 ". I get a sequence of
Element/Text/EndElement.
The data for #2 will not be returned. I get a sequence of
Element/Whitespace/EndElement.

Why is the data (which happens to be spaces) between my start and end tags
being misinterpreted?
TIA

Nov 12 '05 #1
Hi Kenneth,

First of all, I would like to confirm my understanding of your issue. From
your description, I understand that you need to preserve whitespace in your
Xml document. If there is any misunderstanding, please feel free to let me
know.

The XmlTextReader only preserves white space that occurs within an
xml:space="preserve" context, so you need to add that attribute to the
parent node. If you're using an XmlDocument, you can simply set the
PreserveWhitespace property to true before calling the Load or LoadXml
method. You can check the following link for more information.

http://msdn.microsoft.com/library/de...us/cpguide/html/cpconHandlingWhiteSpaceWithXmlTextReader.asp
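
For the XmlDocument route, here is a minimal sketch of the difference the PreserveWhitespace property makes. The names PreserveWhitespaceDemo and ColumnText are made up for illustration, and the XML is a cut-down version of the one-column case from the question.

```csharp
using System;
using System.Xml;

public static class PreserveWhitespaceDemo
{
    // Hypothetical helper: loads the XML into an XmlDocument and returns
    // the text of the first DBColumn element.
    public static string ColumnText(string xml, bool preserve)
    {
        XmlDocument doc = new XmlDocument();
        // Set before LoadXml: it decides whether whitespace-only nodes
        // are kept in the tree or dropped while loading.
        doc.PreserveWhitespace = preserve;
        doc.LoadXml(xml);
        XmlNode column = doc.SelectSingleNode("//DBColumn");
        return column.InnerText;
    }

    public static void Main()
    {
        string xml = "<DBRow><DBColumn> </DBColumn></DBRow>";
        // Brackets make an empty or blank value visible on the console.
        Console.WriteLine("[" + ColumnText(xml, false) + "]");
        Console.WriteLine("[" + ColumnText(xml, true) + "]");
    }
}
```

With the property left at false the blank column should come back empty; with true the single space should survive.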

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #2
Kevin,

I only want the space (" ") within a data element, not between elements.
When my data has a non-space character in it, the space data is retrieved.
If I just have one or more spaces it gets written to XML but not pulled out.

#1 <DBColumn> 1 </DBColumn> <--- spaces retrieved
I get a sequence of Element/Text/EndElement.

#2 <DBColumn> </DBColumn> <--- just spaces, no data retrieved
I get a sequence of Element/Whitespace/EndElement.

Nov 12 '05 #3
Hi Ken,

I'm afraid this is by design for the XmlTextReader. If an element has no
non-whitespace content, its whitespace is treated as insignificant unless
xml:space="preserve" is in effect for that element. Please check the MSDN
document I linked in my last post for more information. Thanks!

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #4
ke*****@nospam.nospam wrote:
I just ran across this.
#1 <DBColumn> 1 </DBColumn>
#2 <DBColumn> </DBColumn>
The data for #1 will be parsed and returned as " 1 ". I get a sequence of
Element/Text/EndElement.
The data for #2 will not be returned. I get a sequence of
Element/Whitespace/EndElement.

Why is the data (which happens to be spaces) between my start and end tags
being misinterpreted?


Because that gets treated as insignificant whitespace.
Set the WhitespaceHandling property of the XmlTextReader to
WhitespaceHandling.All to avoid this behaviour.
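
One way to see what this setting actually reports is to print the node types for the samples above. This is only a small sketch; NodeTypeDemo and NodeTypes are illustrative names, and the third sample with xml:space="preserve" is added for comparison.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Xml;

public static class NodeTypeDemo
{
    // Hypothetical helper: returns the node types the reader reports, in order.
    public static List<XmlNodeType> NodeTypes(string xml)
    {
        List<XmlNodeType> types = new List<XmlNodeType>();
        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        // Report every whitespace node instead of dropping any of them.
        reader.WhitespaceHandling = WhitespaceHandling.All;
        while (reader.Read())
        {
            types.Add(reader.NodeType);
        }
        reader.Close();
        return types;
    }

    public static void Main()
    {
        string[] samples =
        {
            "<DBColumn> 1 </DBColumn>",
            "<DBColumn> </DBColumn>",
            "<DBColumn xml:space=\"preserve\"> </DBColumn>"
        };
        foreach (string xml in samples)
        {
            Console.WriteLine(xml + " -> " + string.Join("/", NodeTypes(xml)));
        }
    }
}
```

Note that the setting controls whether whitespace nodes are returned at all, not which node type they are returned as, so a blank value should still not show up as Text.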

--
Oleg Tkachenko [XML MVP, MCAD]
http://blog.tkachenko.com
Nov 12 '05 #5
That property was set to WhitespaceHandling.All.

Nov 12 '05 #6
hi,

Looking at the referenced document:
<test>•
••••<item>•
••••••••<item xml:space="preserve">º
ºººººººººººº<item/>º
ºººººººº</item>•
••••</item>•
••••<book>º
ºººººººº<b>This</b>º
ºººººººº<i>is</i>º
ºººººººº<b>a test</b>º
ºººº</book>•
</test>•
The white space shown as (•) is insignificant white space. The white
space shown as (º) is significant white space.
Note: The scope of the xml:space attribute changes what would normally
be considered insignificant white space to be significant white space.
Notice that <b>a test</b> is not shown to be affected by the space=preserve
command.

I did try adding this to my xml (two ways), but it had no effect on the
xmltextreader.
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn xml:space="preserve"> </DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn> </DBColumn>
</DBRow>
Let me try this from a different direction:
- I am not using validating readers, xsl, dom, etc.
- I want to output text data and read it back in, and the data can contain
spaces or even conceivably be all spaces
- How do I make sure a space out becomes a space in?
- Note: it seems that having nonspace characters causes space characters to
be read in. Is this always true?

Thanks

Nov 12 '05 #7
Hi Kenneth,

I tried using XmlTextReader to parse your Xml document. I set the reader's
WhitespaceHandling property to WhitespaceHandling.All and all the white
spaces are preserved. If that doesn't work for you, could you please post a
code snippet and a part of the Xml document here, so that I can reproduce
it? Thanks!

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #8
In what manner are they preserved for you?
I am looking to get an XmlNodeType.Text node returned with the element's
text data. Is that what you get, or are you getting Whitespace returned?
Here are two data snippet examples I have tried:
<DataBase>
<Index>1</Index>
<Enabled>True</Enabled>
<Series>2</Series>
<DBValues>
<dummy />
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn>1</DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn>1</DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn xml:space="preserve"> </DBColumn>
</DBRow>
<DBRow xml:space="preserve">
<DBColumn />
<DBColumn />
</DBRow>
</DBValues>
</DataBase>


<DataBase>
<Index>1</Index>
<Enabled>True</Enabled>
<Series>2</Series>
<DBValues>
<dummy />
<DBRow>
<DBColumn />
<DBColumn>1.1</DBColumn>
<DBColumn>-2</DBColumn>
<DBColumn>3</DBColumn>
<DBColumn>rrr</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>-1.1</DBColumn>
<DBColumn>2.0</DBColumn>
<DBColumn>-3</DBColumn>
<DBColumn>sss</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>10</DBColumn>
<DBColumn> </DBColumn>
<DBColumn>1</DBColumn>
<DBColumn>ttt</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>one</DBColumn>
<DBColumn>two</DBColumn>
<DBColumn>three</DBColumn>
<DBColumn>vvv</DBColumn>
</DBRow>
<DBRow>
<DBColumn />
<DBColumn>11</DBColumn>
<DBColumn> </DBColumn>
<DBColumn />
<DBColumn />
</DBRow>
</DBValues>
</DataBase>

//here is a sample code snippet,
//all the storing of data and error checking is gone,
//as well as getting data from a string instead of a file
using System;
using System.Collections;
using System.Xml;

namespace ConsoleApplication1
{
    public class MyClass
    {
        /// <summary>
        /// The main entry point for the application.
        /// </summary>
        [STAThread]
        static void Main(string[] args)
        {
            GetData();
            RL();
        }

        private static void RL()
        {
            Console.ReadLine();
        }

        private static void GetData()
        {
            string xmlFrag =
                " <DataBase>\n" +
                " <Index>1</Index>\n" +
                " <Enabled>True</Enabled>\n" +
                " <Series>2</Series>\n" +
                " <DBValues>\n" +
                " <dummy />\n" +
                " <DBRow>\n" +
                " <DBColumn />\n" +
                " <DBColumn>1.1</DBColumn>\n" +
                " <DBColumn> </DBColumn>\n" +
                " <DBColumn>3</DBColumn>\n" +
                " <DBColumn>rrr</DBColumn>\n" +
                " </DBRow>\n" +
                " <DBRow>\n" +
                " <DBColumn />\n" +
                " <DBColumn>-1.1</DBColumn>\n" +
                " <DBColumn>2.0</DBColumn>\n" +
                " <DBColumn>-3</DBColumn>\n" +
                " <DBColumn>sss</DBColumn>\n" +
                " </DBRow>\n" +
                " <DBRow>\n" +
                " <DBColumn />\n" +
                " <DBColumn>10</DBColumn>\n" +
                " <DBColumn> </DBColumn>\n" +
                " <DBColumn>1</DBColumn>\n" +
                " <DBColumn>ttt</DBColumn>\n" +
                " </DBRow>\n" +
                " <DBRow>\n" +
                " <DBColumn />\n" +
                " <DBColumn>one</DBColumn>\n" +
                " <DBColumn>two</DBColumn>\n" +
                " <DBColumn>three</DBColumn>\n" +
                " <DBColumn>vvv</DBColumn>\n" +
                " </DBRow>\n" +
                " <DBRow>\n" +
                " <DBColumn />\n" +
                " <DBColumn>11</DBColumn>\n" +
                " <DBColumn> </DBColumn>\n" +
                " <DBColumn />\n" +
                " <DBColumn />\n" +
                " </DBRow>\n" +
                " </DBValues>\n" +
                " </DataBase>\n";

            // Create the XmlNamespaceManager.
            NameTable nt = new NameTable();
            XmlNamespaceManager nsmgr = new XmlNamespaceManager(nt);
            nsmgr.AddNamespace("bk", "urn:sample");

            // Create the XmlParserContext.
            XmlParserContext context = new XmlParserContext(null, nsmgr,
                null, XmlSpace.None);

            System.Xml.XmlTextReader xr = new
                System.Xml.XmlTextReader(xmlFrag, XmlNodeType.Element, context);
            xr.WhitespaceHandling = WhitespaceHandling.All;
            try
            {
                while (xr.Read())
                {
                    switch (xr.NodeType)
                    {
                        case XmlNodeType.Element:
                        {
                            switch (xr.Name)
                            {
                                case "DBValues":
                                    _ParseDBValues(xr, 1);
                                    break;
                            }
                        }
                        break;
                    }
                }
            }
            catch
            {
            }
        }

        public static void _ParseDBValues(System.Xml.XmlTextReader xr, int _nDBInst)
        {
            try
            {
                while (xr.Read())
                {
                    switch (xr.NodeType)
                    {
                        case XmlNodeType.Element:
                        {
                            switch (xr.Name)
                            {
                                case "DBRow":
                                    Console.WriteLine("ROW");

                                    _ParseDBColObject(xr, _nDBInst);
                                    break;
                            }
                        }
                        break;

                        case XmlNodeType.Text:
                            break;

                        case XmlNodeType.EndElement:
                            if (xr.Name.Equals("DBValues"))
                            {
                                return;
                            }
                            break;

                        default:
                            break;
                    }
                }
            }
            catch
            {
                throw new Exception("Unexpected element in DBValues...");
            }
        }

        public static void _ParseDBColObject(System.Xml.XmlTextReader xr, int _nDBInst)
        {
            // Get data values & range check

            string element = "";
            int _nColIndex = -1;
            // Parse input stream
            try
            {
                while (xr.Read())
                {
                    switch (xr.NodeType)
                    {
                        case XmlNodeType.Element:
                        {
                            element = xr.Name;
                            if (element.Equals("DBColumn"))
                            {
                                _nColIndex++; // track which column we are on
                            }
                        }
                        break;

                        case XmlNodeType.Text:
                        {
                            switch (element)
                            {
                                case "DBColumn":
                                    Console.WriteLine("DBColumn = \"" + xr.Value + "\"");
                                    break;
                            }
                        }
                        break;

                        case XmlNodeType.EndElement:
                            if (xr.Name.Equals("DBRow"))
                            {
                                Console.WriteLine("EndRow");
                                return;
                            }
                            break;

                        default:
                            break;
                    }
                }
            }
            catch
            {
                throw new Exception("Unexpected element in DBValues...");
            }
            return;
        }
    }
}

Nov 12 '05 #9
When you say the whitespace is preserved, how exactly do you mean that?
Is the data coming in as XmlNodeType.Text or as whitespace?
// the data was written with code like the following
xw.WriteStartElement("DBValues");
xw.WriteStartElement("dummy");
xw.WriteEndElement();
xw.WriteStartElement("DBRow");
string _strColVal = something;
xw.WriteElementString("DBColumn", _strColVal);
xw.WriteEndElement();
xw.WriteEndElement();
xw.WriteEndElement();

Nov 12 '05 #10
Thanks for the further response Kenneth,

We'll investigate the code further and will update you soon.

Regards,

Steven Cheng
Microsoft Online Support

Get Secure! www.microsoft.com/security
(This posting is provided "AS IS", with no warranties, and confers no
rights.)

Nov 12 '05 #11
Hi Kenneth,

The NodeType I get is SignificantWhitespace. Whitespace between two nodes,
known as insignificant whitespace, is reported as Whitespace. Adding
xml:space="preserve" makes sure that the value of the node is preserved,
but the NodeType is still not Text. You can look for SignificantWhitespace
nodes to get the blank values and ignore the Whitespace nodes. Some of the
SignificantWhitespace nodes you see might belong to the parent node.

This is by design.
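
As a rough sketch of how the reading side could cope with this, the loop below keeps any Text, CDATA, Whitespace or SignificantWhitespace node that occurs inside a DBColumn element and ignores whitespace everywhere else. Note this is a slightly different approach from checking for SignificantWhitespace alone: it tracks the element scope, so it does not depend on xml:space="preserve" being present in the data. ColumnReader and ReadColumns are hypothetical names, not part of the code posted earlier.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using System.Xml;

public static class ColumnReader
{
    // Hypothetical helper: collects the value of every DBColumn element,
    // counting whitespace-only content inside a column as data.
    public static List<string> ReadColumns(string xml)
    {
        List<string> values = new List<string>();
        StringBuilder current = null; // non-null only while inside a DBColumn

        XmlTextReader reader = new XmlTextReader(new StringReader(xml));
        reader.WhitespaceHandling = WhitespaceHandling.All;
        while (reader.Read())
        {
            switch (reader.NodeType)
            {
                case XmlNodeType.Element:
                    if (reader.Name == "DBColumn")
                    {
                        if (reader.IsEmptyElement)
                        {
                            // <DBColumn /> produces no EndElement node.
                            values.Add("");
                        }
                        else
                        {
                            current = new StringBuilder();
                        }
                    }
                    break;

                case XmlNodeType.Text:
                case XmlNodeType.CDATA:
                case XmlNodeType.Whitespace:
                case XmlNodeType.SignificantWhitespace:
                    // Whitespace between elements arrives while current is
                    // null and is skipped; inside a column it is kept.
                    if (current != null)
                    {
                        current.Append(reader.Value);
                    }
                    break;

                case XmlNodeType.EndElement:
                    if (reader.Name == "DBColumn" && current != null)
                    {
                        values.Add(current.ToString());
                        current = null;
                    }
                    break;
            }
        }
        reader.Close();
        return values;
    }

    public static void Main()
    {
        string xml = "<DBRow>\n <DBColumn />\n <DBColumn>10</DBColumn>\n" +
                     " <DBColumn> </DBColumn>\n <DBColumn> 1 </DBColumn>\n</DBRow>";
        foreach (string value in ReadColumns(xml))
        {
            Console.WriteLine("DBColumn = \"" + value + "\"");
        }
    }
}
```

The trade-off is that an empty element and a column written with an empty string both come back as "", so a truly empty value and a missing one cannot be told apart this way.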

Kevin Yu
=======
"This posting is provided "AS IS" with no warranties, and confers no
rights."

Nov 12 '05 #12
