Bytes IT Community

Special Characters not resolving

When I load XML from a file into a .NET XmlDataDocument, the UTF-8 codes
are resolved, but the 5 special XML entities are not.

How can I force those 5 special character types to be translated?

Oct 23 '06 #1
7 Replies


Trac Bannon wrote:
> When I load XML from a file into a .NET XmlDataDocument, the UTF-8 codes
> are resolved, but the 5 special XML entities are not.
>
> How can I force those 5 special character types to be translated?

Please show us an XML sample you have and then explain to us exactly which
properties or methods of XmlDataDocument you use.
For instance, with
<element>Kibo &amp; Xibo</element>
I am sure that the InnerText property of that element node returns the
string
'Kibo & Xibo'
so with that property the entity reference '&amp;' (& a m p ; for web
forum readers) is "translated" into the character '&'.
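
A minimal sketch of that check, assuming a plain XmlDocument rather than the XmlDataDocument from the question. The names EntityDemo and InnerTextOf are made up for illustration and do not come from the thread:

```csharp
using System;
using System.Xml;

// Hypothetical helper: parse an XML fragment and return the InnerText
// of its root element.
public static class EntityDemo
{
    public static string InnerTextOf(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);
        // InnerText returns the parsed character data, so predefined
        // entity references such as &amp; and &apos; come back as & and '.
        return doc.DocumentElement.InnerText;
    }

    public static void Main()
    {
        Console.WriteLine(InnerTextOf("<element>Kibo &amp; Xibo</element>")); // Kibo & Xibo
        Console.WriteLine(InnerTextOf("<NOTES>RR&apos;S</NOTES>"));           // RR'S
    }
}
```

If the strings printed here still contained the literal entity references, the parser would be at fault; if they do not, the extra length is probably coming from somewhere else.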

--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Oct 23 '06 #2


I am using VS 2003. I have an XML file that is generated by Oracle. There
is an excerpt of the XML under this explanation.

Here is an example of the code I am using:
myXMLDataDoc = new System.Xml.XmlDataDocument();
myXMLDataDoc.DataSet.ReadXml("myFile.XML");

I then use the DataSet.Tables["BRIDGE"] to iterate through the data and pick
out what I need. I am not using element/node lookups, only the dataset view.

I also have a dataset object with an XSD that matches my local database
schema.
I create a new row in the table to house the incoming information, something
like this:

dataTable1 = internalDataSet.Tables["BRIDGE"];
dataRow = dataTable1.NewRow();

dataTable2 = myXMLDataDoc.DataSet.Tables["BRIDGE"];

dataRow["MYKEY"] = dataTable2.Rows[0]["MYKEY"];
dataRow["NOTES"] = dataTable2.Rows[0]["NOTES"];

If I check the length of dataRow["NOTES"], it is 2031 (not 2000, which is the
length of the field if the characters are resolved).

Subsequently, when I use the dataset.update command to commit the data to
the local database, the 2031 characters exceed the 2000-character field length.
Here is an excerpt of the XML:
<?xml version="1.0" encoding="ISO8859-1"?>
<BEGIN_DATA xmlns:xs="http://www.w3.org/2001/XMLSchema">
<BRIDGE>
<MYKEY>3894</MYKEY>
<NOTES>
200208XXT PEB2 X,XXX.XX CONS 8024 XX.X 11XXXX 33 = X.XX
950824SD1 P4B1 16,643.70 SAI 1033 284.0 115851 XYYY33 = 25,386.41
960830MD2 P ID 0.00 BAKER REH M001 662.0 REHAB 33 = 0.00
970825SD3 P4B1+19,216.34 SAI+2C-B1 3033 324.0 115851 XXYY33 = 25,591.81
980821TD4 P4B2+22,996.60 SAI+2C-B1 4026 358.0 115851 HXXYY33 = 20,200.00
19990820S PEB1 22,758.58 HDR 5032 283.5 116615 33 = 23,140.00
20001103T PEB1 22,758.28 HDR 6024 283.5 116615 33 = 16,695.00
200108XXS PEB1 X,XXX.XX CONS 7032 XX.X 11XXXX 33 = X.XX
QAF-LR/MN 5/24/93,FV-LR/JJ 8/9,10/19/3,7/19/4R= 4UW, I= 2UW FV LR 8/22/95
0411STZZ000001U1927 HRS = 282/ PEB1 / 16A18 /1995 XXXXXXXXXXX33= 30,000
BRDG POSTING SHOULD BE REDUCED TO 17T 22T COMB; REF LETTER 08/26/97
004001 AMBRIDGE-ALIQUIPPA BR OV CONRAIL & P&LE(CSX) RR&apos;S
X932
BAKER WILL SEND CALCS TO SAI VIA ANDY BEAV CO BY JUNE 1, 1997.
ORIG CALC BY BAKER ON 12/87, REHAB BY BAKER IN &apos;96(SPANS 1 & 10
INCL&apos;D ?)
A H
SPAN 1 OV 0065 & SPAN 10 OV 0051 TRANSF&apos;D FR STATE TO BEAV CO ON
02/06/97.
LR 641 STA 60+80 SPAN 1 OVER SR 0065 45&apos;
APPL3955 STA 84+87 SPAN 10 OVER SR 0051 (NB) 43&apos;
SCOUR EVALUATION 11359 W06 = 4 E29-A = 4
SF SP SW DDDATE USGSFV USGSSD EP DSTAT USGSSF EF SAS FEDCAT
B H -- 11141997 MMYYYY 032001 4 ----- 082002 4 047 2A1
910606XXX NBIS 3,173.80 SAI XXXX I 02 UNTYYYY YYYYEST= 4,000
920612XXX NBIS 7,283.00 SAI XXXX R 04 UNTYYYY YYYYEST= 8,000
940830TWO A1-5 5,537 P C & S TW01 R 04 UNTYYYY YYYYEST= 6,000
961031UW1 A1-5 4,599 P C & S UW R 04 430533 XYYYEST= 8,000
M-00870144
MAP D13 D14 HSOR HSCCV SPR CK COMMENT OVER O DATE P/F
THIS LINE IS RESERVED FOR CCV DATA
STAT IR IC ACM INSP ACM QNTY # LOCATION OF ACM
B B 0 0 MMDDYYYY UNKNWN 0 ***
B H -- 11141997 MMYYY</NOTES>
</BRIDGE>
</BEGIN_DATA>
Could my problem be related to using the DataSet view of the XmlDataDocument
instead of the element/node view?

Thanks!

Oct 24 '06 #3

I need to update my response to you! To better define the issue: when I
load an XML file into an XMLDATADOCUMENT, the special characters are not
resolved.

If I write the file to disk using the XMLDATADOCUMENT.SAVE method, the
various characters are resolved.

Here is an example cut from 2 .XML files. In the first example, you will
see &apos; is never resolved. When I write the file out to disk again, the
&apos; is resolved.

I don't want to have to load large XMLDataDocuments (which can take many
many minutes!), write to disk to resolve the characters then load again.

I am missing some common, crucial item. HELP!

**** AS LOADED INTO THE XML DATA DOCUMENT ****

<?xml version="1.0" encoding="ISO8859-1"?>
<BEGIN_DATA xmlns:xs="http://www.w3.org/2001/XMLSchema">
<BRIDGE>
<MYKEY>3894</MYKEY>
<NOTES>
BRDG POSTING SHOULD BE REDUCED TO 17T 22T COMB; REF LETTER 08/26/97
004001 CONRAIL & P&LE(CSX) RR&apos;S X932
REHAB BY BAKER IN &apos;96(SPANS 1 & 10 INCL&apos;D ?)
</NOTES>
</BRIDGE>
</BEGIN_DATA>
**** AS SAVED TO A FILE USING THE XMLDATADOCUMENT.SAVE method ****

<NewDataSet>
<BEGIN_DATA>
<BRIDGE>
<MYKEY>3894</MYKEY>
<NOTES>
BRDG POSTING SHOULD BE REDUCED TO 17T 22T COMB; REF LETTER 08/26/97
004001 CONRAIL & P&LE(CSX) RR'S X932
ORIG CALC BY BAKER ON 12/87, REHAB BY BAKER IN '96(SPANS 1 & 10 INCL'D ?)
</NOTES>
</BRIDGE>
</BEGIN_DATA>
</NewDataSet>
Oct 25 '06 #4

If I remember your original post, you were using dataDoc.DataSet.ReadXml.
Does dataDoc.Load work?

John
Oct 25 '06 #5

No... it actually results in an empty dataset. I've opened a trouble ticket
with Microsoft, so I will update this once I have an answer... but any and
all feedback is welcome.
Oct 25 '06 #6

"Trac Bannon" <Tr********@discussions.microsoft.com> wrote in message
news:17**********************************@microsoft.com...
> No... it actually results in an empty dataset. I've opened a trouble ticket
> with Microsoft, so I will update this once I have an answer... but any and
> all feedback is welcome.

It results in an empty dataset, but what's loaded as far as XML? Are there
any elements loaded? Do they have the right characters?

John
Oct 26 '06 #7

I had the opportunity to work with Microsoft on this problem. I was
reporting an error when I tried to commit my data to the database. My
starting data at the origin in an Oracle database is 2000 characters. When I
receive it directly from a web service, load my XMLDATADOCUMENT and .SAVE to
disk, I can read the .XML file later into an XMLDATADOCUMENT and commit to a
local MSDE database with no problem... the 2000-character field resolves
properly and inserts without an issue.

When the XML is delivered to me on disk (not from a web service) and I load an
XMLDATADOCUMENT, my insert into an MSDE database fails! The field in
question shows as 2031 characters.

I had some of the facts but not all of them. When I compared the .XML files
(the one I wrote to disk and the one that was delivered to me on disk, never
from a web service), I could visually see the difference in the field (the
latter contained extra XML special character entities).

.......HERE IS THE NEAT PART.....

Microsoft compared the files and found that the culprit (the extra 31
characters) was not the XML special character entities failing to resolve.
It was extra carriage returns. When I compared the files using a hex
editor... lo and behold... "0D" bytes had been inserted wherever
newline/linefeed characters appeared.
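
The same finding can be checked without a hex editor by counting the 0x0D characters in the loaded value. This is only an illustrative sketch, not what Microsoft ran; CrCounter and the sample string are hypothetical stand-ins for the real NOTES data:

```csharp
using System;

// Hypothetical helper: count carriage returns in a field value.
public static class CrCounter
{
    // Count carriage-return (0x0D) characters in a string.
    public static int CountCarriageReturns(string value)
    {
        int count = 0;
        foreach (char c in value)
        {
            if (c == '\r') count++;
        }
        return count;
    }

    public static void Main()
    {
        // Stand-in for the NOTES value: three lines joined with CR LF.
        string notes = "LINE ONE\r\nLINE TWO\r\nLINE THREE";
        int crs = CountCarriageReturns(notes);
        Console.WriteLine("Length: " + notes.Length);                    // Length: 30
        Console.WriteLine("Carriage returns: " + crs);                   // Carriage returns: 2
        Console.WriteLine("Length without CR: " + (notes.Length - crs)); // Length without CR: 28
    }
}
```

Applied to the real field, a count of 31 would account for the whole gap between 2031 and 2000 characters.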

I added a very slow string.Replace("\r", "") and voilà... the problem was
fixed.
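
A sketch of that workaround, assuming the value has already been read into a string. LineEndings.StripCarriageReturns is a hypothetical name, and the sample text is cut from the NOTES excerpt:

```csharp
using System;

// Hypothetical helper: drop carriage returns before committing a value.
public static class LineEndings
{
    // Remove every carriage return, leaving the line feeds in place.
    public static string StripCarriageReturns(string value)
    {
        return value.Replace("\r", "");
    }

    public static void Main()
    {
        string notes = "BRDG POSTING\r\nREF LETTER 08/26/97";
        string cleaned = StripCarriageReturns(notes);
        Console.WriteLine(notes.Length);   // 33
        Console.WriteLine(cleaned.Length); // 32
    }
}
```

In the code from earlier in the thread, the call would presumably wrap the value as it is copied, along the lines of dataRow["NOTES"] = LineEndings.StripCarriageReturns((string)dataTable2.Rows[0]["NOTES"]). Replacing only "\r\n" with "\n" is an alternative if lone carriage returns need to survive.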

Obviously I need to find a way to have the XML file provider strip
these out. (They are adding these somehow when writing to disk. They are NOT
in the original data or in the in-memory XML prior to their writing to disk.)

Thanks for working with me on the problem, though. Having other developers
in the community respond to these posts with intelligent questions has been
great.

Regards,
Trac

Oct 29 '06 #8
