Special Characters not resolving

When I load XML from a file into a .NET XmlDataDocument, the UTF-8 codes
are resolved but the 5 special XML entities (&amp; &lt; &gt; &apos; &quot;) are not.

How can I force those 5 special character types to be translated?

Oct 23 '06 #1
Trac Bannon wrote:
> When I load XML from a file into a dotNet XMLDataDocument, the UTF-8 codes
> are resolved but the 5 special XML entities are not.
>
> How can I force those 5 special character types to be translated?

Please show us an XML sample you have and then explain to us exactly which
properties or methods of XmlDataDocument you use.
For instance, with
<element>Kibo &amp; Xibo</element>
I am sure that the InnerText property of that element node returns the
string
'Kibo & Xibo'
so with that property the entity reference '&amp;' (& a m p ; for web
forum readers) is "translated" into the character '&'.
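
For readers who want to try this, here is a minimal C# sketch of the point above. It assumes a plain XmlDocument rather than the XmlDataDocument discussed in the thread, and the class and method names are made up for illustration.

```csharp
using System;
using System.Xml;

public static class EntityDemo
{
    // Parses a small XML fragment and returns the text of its root element.
    // (Hypothetical helper, not part of any library.)
    public static string RootInnerText(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        return doc.DocumentElement.InnerText;
    }

    public static void Main()
    {
        // The parser replaces the predefined entity references while loading,
        // so InnerText holds the literal characters.
        Console.WriteLine(RootInnerText("<element>Kibo &amp; Xibo</element>"));
        Console.WriteLine(RootInnerText("<NOTES>RR&apos;S</NOTES>"));
    }
}
```

If the entity references were still present after loading, the second call would return a longer string than the four characters RR'S, which makes this an easy first check.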

--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Oct 23 '06 #2

I am using VS 2003. I have an XML file that is generated by Oracle. There
is an excerpt of the XML under this explanation.

Here is an example of the code I am using:
myXMLDataDoc = new System.Xml.XmlDataDocument();
myXMLDataDoc.DataSet.ReadXml("myFile.XML");

I then use the DataSet.Tables["BRIDGE"] to iterate through the data and pick
out what I need. I am not using element/node lookups, only the dataset view.

I also have a dataset object with an XSD that matches my local database
schema.
I create a new row in the table to house the incoming information, something
like this:

dataTable1 = internalDataSet.Tables["BRIDGE"];
dataRow = dataTable1.NewRow();

dataTable2 = myXMLDataDoc.DataSet.Tables["BRIDGE"];

dataRow["MYKEY"] = dataTable2.Rows[0]["MYKEY"];
dataRow["NOTES"] = dataTable2.Rows[0]["NOTES"];

If I check the length of dataRow["NOTES"] it is 2031 (not 2000 which is the
length of the field if the characters are resolved).

Subsequently, when I use the dataset.update command to commit the data to
the local database, the 2031 characters exceed the 2000-character field length.

Here is an excerpt of the XML:
<?xml version="1.0" encoding="ISO8859-1"?>
<BEGIN_DATA xmlns:xs="http://www.w3.org/2001/XMLSchema">
<BRIDGE>
<MYKEY>3894</MYKEY>
<NOTES>
200208XXT PEB2 X,XXX.XX CONS 8024 XX.X 11XXXX 33 = X.XX
950824SD1 P4B1 16,643.70 SAI 1033 284.0 115851 XYYY33 = 25,386.41
960830MD2 P ID 0.00 BAKER REH M001 662.0 REHAB 33 = 0.00
970825SD3 P4B1+19,216.34 SAI+2C-B1 3033 324.0 115851 XXYY33 = 25,591.81
980821TD4 P4B2+22,996.60 SAI+2C-B1 4026 358.0 115851 HXXYY33 = 20,200.00
19990820S PEB1 22,758.58 HDR 5032 283.5 116615 33 = 23,140.00
20001103T PEB1 22,758.28 HDR 6024 283.5 116615 33 = 16,695.00
200108XXS PEB1 X,XXX.XX CONS 7032 XX.X 11XXXX 33 = X.XX
QAF-LR/MN 5/24/93,FV-LR/JJ 8/9,10/19/3,7/19/4R= 4UW, I= 2UW FV LR 8/22/95
0411STZZ000001U1927 HRS = 282/ PEB1 / 16A18 /1995 XXXXXXXXXXX33= 30,000
BRDG POSTING SHOULD BE REDUCED TO 17T 22T COMB; REF LETTER 08/26/97
004001 AMBRIDGE-ALIQUIPPA BR OV CONRAIL & P&LE(CSX) RR&apos;S
X932
BAKER WILL SEND CALCS TO SAI VIA ANDY BEAV CO BY JUNE 1, 1997.
ORIG CALC BY BAKER ON 12/87, REHAB BY BAKER IN &apos;96(SPANS 1 & 10
INCL&apos;D ?)
A H
SPAN 1 OV 0065 & SPAN 10 OV 0051 TRANSF&apos;D FR STATE TO BEAV CO ON
02/06/97.
LR 641 STA 60+80 SPAN 1 OVER SR 0065 45&apos;
APPL3955 STA 84+87 SPAN 10 OVER SR 0051 (NB) 43&apos;
SCOUR EVALUATION 11359 W06 = 4 E29-A = 4
SF SP SW DDDATE USGSFV USGSSD EP DSTAT USGSSF EF SAS FEDCAT
B H -- 11141997 MMYYYY 032001 4 ----- 082002 4 047 2A1
910606XXX NBIS 3,173.80 SAI XXXX I 02 UNTYYYY YYYYEST= 4,000
920612XXX NBIS 7,283.00 SAI XXXX R 04 UNTYYYY YYYYEST= 8,000
940830TWO A1-5 5,537 P C & S TW01 R 04 UNTYYYY YYYYEST= 6,000
961031UW1 A1-5 4,599 P C & S UW R 04 430533 XYYYEST= 8,000
M-00870144
MAP D13 D14 HSOR HSCCV SPR CK COMMENT OVER O DATE P/F
THIS LINE IS RESERVED FOR CCV DATA
STAT IR IC ACM INSP ACM QNTY # LOCATION OF ACM
B B 0 0 MMDDYYYY UNKNWN 0 ***
B H -- 11141997 MMYYY</NOTES>
</BRIDGE>
</BEGIN_DATA>
Could my problem be related to using the DataSet view of the XmlDataDocument
instead of the element/node view?

Thanks!

Oct 24 '06 #3
I need to update my response to you! To better define the issue: when I
load an XML file into an XmlDataDocument, the special characters are not
resolved.

If I write the file to disk using the XMLDATADOCUMENT.SAVE method, the
various characters are resolved.

Here is an example cut from 2 .XML files. In the first example, you will
see &apos; is never resolved. When I write the file out to disk again, the
&apos; is resolved.

I don't want to have to load large XMLDataDocuments (which can take many
many minutes!), write to disk to resolve the characters then load again.

I am missing something common and crucial. HELP!

**** AS LOADED INTO THE XML DATA DOCUMENT ****

<?xml version="1.0" encoding="ISO8859-1"?>
<BEGIN_DATA xmlns:xs="http://www.w3.org/2001/XMLSchema">
<BRIDGE>
<MYKEY>3894</MYKEY>
<NOTES>
BRDG POSTING SHOULD BE REDUCED TO 17T 22T COMB; REF LETTER 08/26/97
004001 CONRAIL & P&LE(CSX) RR&apos;S X932
REHAB BY BAKER IN &apos;96(SPANS 1 & 10 INCL&apos;D ?)
</NOTES>
</BRIDGE>
</BEGIN_DATA>
**** AS SAVED TO A FILE USING THE XMLDATADOCUMENT.SAVE method ****

<NewDataSet>
<BEGIN_DATA>
<BRIDGE>
<MYKEY>3894</MYKEY>
<NOTES>
BRDG POSTING SHOULD BE REDUCED TO 17T 22T COMB; REF LETTER 08/26/97
004001 CONRAIL & P&LE(CSX) RR'S X932
ORIG CALC BY BAKER ON 12/87, REHAB BY BAKER IN '96(SPANS 1 & 10 INCL'D ?)
</NOTES>
</BRIDGE>
</BEGIN_DATA>
</NewDataSet>
Oct 25 '06 #4
If I remember your original post, you were using dataDoc.DataSet.ReadXml.
Does dataDoc.Load work?

John
Oct 25 '06 #5
No... it actually results in an empty dataset. I've opened a trouble ticket
with Microsoft, so I will update this once I have an answer... but any and
all feedback is welcome.
Oct 25 '06 #6
"Trac Bannon" <Tr********@discussions.microsoft.com> wrote in message
news:17**********************************@microsoft.com...
> No... it actually results in an empty dataset. I've opened a trouble
> ticket with Microsoft so I will update this once I have an answer... but
> any and all feedback is welcome.

It results in an empty dataset, but what's loaded as far as XML? Are there
any elements loaded? Do they have the right characters?

John
Oct 26 '06 #7
I had the opportunity to work with Microsoft on this problem. I was
reporting an error when I tried to commit my data to the database. My
starting data at the origin in an Oracle database is 2000 characters. When I
receive it directly from a web service, load my XMLDATADOCUMENT and .SAVE to
disk, I can read the .XML file later into an XMLDATADOCUMENT and commit to a
local MSDE database with no problem... the 2000 character field resolves
properly and inserts without an issue.

When the XML was delivered to me on disk (not from a web service) and I loaded
an XMLDATADOCUMENT, my insert into an MSDE database failed! The field in
question showed as 2031 characters.

I had some of the facts but not all of them. When I compared the .XML files
(the one I wrote to disk and the one that was delivered to me on disk, never
from a web service), I could visually see the difference in the field (the
latter contained extra XML special character entities).

.......HERE IS THE NEAT PART.....

Microsoft compared the files and found that the culprit (the extra 31
characters) was not the XML special character entities failing to resolve.
It was extra carriage returns. When I compared the files using a hex
editor... lo and behold... "0D" bytes had been inserted wherever
newline/linefeed characters occurred.

I added a very slow string.Replace("\r", "") and voilà... the problem was
fixed.
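
For other readers, here is a hedged C# sketch of the check and the fix described above: count the carriage returns (0x0D) in a field value, then strip them before committing the row. The helper names are hypothetical and the sample string is invented for illustration, not taken from the real file.

```csharp
using System;

public static class CarriageReturnDemo
{
    // Counts carriage-return characters (0x0D) in the text.
    public static int CountCarriageReturns(string text)
    {
        int count = 0;
        foreach (char c in text)
        {
            if (c == '\r') count++;
        }
        return count;
    }

    // Removes every carriage return, leaving line feeds (0x0A) in place.
    public static string StripCarriageReturns(string text)
    {
        return text.Replace("\r", "");
    }

    public static void Main()
    {
        // Invented sample with Windows-style CR LF line endings.
        string notes = "LINE ONE\r\nLINE TWO\r\nLINE THREE";
        string cleaned = StripCarriageReturns(notes);

        Console.WriteLine(CountCarriageReturns(notes));   // 2
        Console.WriteLine(notes.Length - cleaned.Length); // 2
    }
}
```

Applied to the NOTES field from this thread, the same count would presumably come out at 31, the gap between the 2031 characters observed and the 2000 expected.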

Obviously I need to find a way to have the XML file provider strip
these out. (They are adding these somehow when writing to disk. They are NOT
in the original data or in the in-memory XML prior to their writing to disk.)

Thanks for working with me on the problem, though. Having other developers
in the community respond to these posts with intelligent questions has been
great.

Regards,
Trac

Oct 29 '06 #8
