Reading XML Encoding errors

AGP
I am programming an XML reader in VB.NET 2005 and it works fairly well.
Once in a while, though, I encounter an old XML file without the header
<?xml version="1.0" encoding="UTF-8"?>
It craps out on the Load with an error similar to "Invalid character in the
given encoding. Line 3, position 5475070".
After some research, I found that the character in question is the copyright
character. My question is: how can I force the reader to assume UTF-8?
My newer files do not seem to have this problem, just my older files. I want
to be able to catch this error and then attempt to load the file again. This
older file also does not seem to have a BOM, so I'm assuming the XML reader
has no idea how to interpret it. I'm hoping I can force a UTF-8 read of the
XML file.

As a secondary question, it seems these older XML files were originally
written out as one or two huge lines. Is there a way to output a copy that is
more readable, in the node-per-line format with line breaks and all?

Thanks for any help
AGP

Sep 30 '07 #1
"AGP" <si**********@softhome.net> wrote:
> Once in a while though I encounter an old XML file without the header
> <?xml version="1.0" encoding="UTF-8"?>
> It craps out on the Load with an error similar to "Invalid character in
> the given encoding. Line 3, position 5475070".
> After some research the character in question is the copyright character.
> My question is how can I force the reader to assume UTF-8?
> It seems like my other newer files do not have this problem, just my older
> files. I want to be able to catch this error
> and then attempt to load the file. It also seems like this older file does
> not have a BOM so I'm assuming the XML reader has no idea how to interpret
> it. I'm hoping I can force a UTF-8 read of the XML file.

IIRC UTF-8 is the default encoding for XML files.

--
M S Herfried K. Wagner
M V P <URL:http://dotnet.mvps.org/>
V B <URL:http://dotnet.mvps.org/dotnet/faqs/>

Sep 30 '07 #2
AGP

"Herfried K. Wagner [MVP]" <hi***************@gmx.at> wrote in message
news:O1**************@TK2MSFTNGP03.phx.gbl...
> "AGP" <si**********@softhome.net> wrote:
>> Once in a while though I encounter an old XML file without the header
>> <?xml version="1.0" encoding="UTF-8"?>
>> It craps out on the Load with an error similar to "Invalid character in
>> the given encoding. Line 3, position 5475070".
>> After some research the character in question is the copyright character.
>> My question is how can I force the reader to assume UTF-8?
>> It seems like my other newer files do not have this problem, just my
>> older files. I want to be able to catch this error
>> and then attempt to load the file. It also seems like this older file
>> does not have a BOM so I'm assuming the XML reader has no idea how to
>> interpret it. I'm hoping I can force a UTF-8 read of the XML file.
>
> IIRC UTF-8 is the default encoding for XML files.

OK, so then why do I get the error? I did a test: I loaded the XML into
Notepad and saved that file as text in UTF-8, and that file seems to be read
correctly. So my question is: why does the original not load properly?

AGP
Oct 1 '07 #3
"AGP" <si**********@softhome.net> wrote:
>> IIRC UTF-8 is the default encoding for XML files.
>
> OK, so then why do I get the error? I did a test and loaded the XML into
> Notepad and then saved that file as text in UTF-8, and it seems that file
> is read correctly. So my question is why does the original not load
> properly?

Maybe it's stored in an encoding other than UTF-8, Windows ANSI, for
example.
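
That would fit the symptom. In Windows-1252 (the usual "Windows ANSI" code page on Western systems) the copyright sign is the single byte A9, while UTF-8 encodes it as the two bytes C2 A9. A lone A9 is not a valid UTF-8 sequence, so a reader that assumes UTF-8 rejects it. It would also explain why re-saving the file from Notepad as UTF-8 makes it load. A minimal sketch to see the difference, assuming code page 1252 (the thread does not say which code page the files actually use, and the module name is made up):

```vbnet
Imports System
Imports System.Text

Module EncodingDemo
    Sub Main()
        ' Assumption: the old files are Windows-1252. On .NET Framework this
        ' code page is built in; on .NET Core / .NET 5+ it has to be enabled
        ' first by registering CodePagesEncodingProvider.
        Dim ansi As Encoding = Encoding.GetEncoding(1252)

        ' Second argument True = throw on invalid bytes instead of silently
        ' substituting a replacement character.
        Dim strictUtf8 As New UTF8Encoding(False, True)

        ' U+00A9 is the copyright sign.
        Dim copyright As String = ChrW(&HA9).ToString()

        Dim ansiBytes As Byte() = ansi.GetBytes(copyright)
        Dim utf8Bytes As Byte() = strictUtf8.GetBytes(copyright)

        Console.WriteLine(BitConverter.ToString(ansiBytes))  ' A9
        Console.WriteLine(BitConverter.ToString(utf8Bytes))  ' C2-A9

        ' Decoding the ANSI byte as strict UTF-8 fails, which is the same
        ' kind of failure the XML loader reports.
        Try
            Dim decoded As String = strictUtf8.GetString(ansiBytes)
            Console.WriteLine("Decoded: " & decoded)
        Catch ex As ArgumentException
            Console.WriteLine("Not valid UTF-8: " & ex.Message)
        End Try
    End Sub
End Module
```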

--
M S Herfried K. Wagner
M V P <URL:http://dotnet.mvps.org/>
V B <URL:http://dotnet.mvps.org/dotnet/faqs/>

Oct 1 '07 #4
AGP wrote:
> I will ask, but the source could be from a variety of providers, so I may
> end up with no concrete answer as to what is used for encoding. However, I
> did open the file in Notepad, added the XML declaration, and then just did
> a plain old save, and the file still errors out. But in my mind, if there
> is no declaration then the functions assume UTF-8, correct? If that is the
> case, I'm not sure why I still get an error.

Check whether loading through
  New StreamReader("file.xml", Encoding.Default)
(passed to Load in place of the file name) works with those files.
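
A fuller sketch of that idea, which also covers the "catch this error and then attempt to load the file" part of the original question. It assumes the Load in question is XmlDocument.Load and that the failing files are in the system ANSI code page; LoadXmlWithFallback and the file name are hypothetical names for illustration, not anything from the thread:

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module XmlFallbackLoader
    ' Hypothetical helper: try the normal load first, then retry with an
    ' explicitly chosen encoding if the parser rejects the bytes.
    Function LoadXmlWithFallback(ByVal fileName As String) As XmlDocument
        Dim doc As New XmlDocument()
        Try
            ' Normal path: the parser uses the BOM or the XML declaration,
            ' and assumes UTF-8 when neither is present.
            doc.Load(fileName)
        Catch ex As XmlException
            ' Fallback: decode the bytes ourselves. When the document is
            ' loaded from a TextReader, the reader's encoding is what counts.
            ' Assumption: the file is in the system ANSI code page, which is
            ' what Encoding.Default returns on .NET Framework (on .NET Core
            ' and later, Encoding.Default is UTF-8, so name the code page
            ' explicitly there).
            doc = New XmlDocument()
            Using reader As New StreamReader(fileName, Encoding.Default)
                doc.Load(reader)
            End Using
        End Try
        Return doc
    End Function

    Sub Main()
        Dim doc As XmlDocument = LoadXmlWithFallback("file.xml")
        Console.WriteLine(doc.DocumentElement.Name)
    End Sub
End Module
```

If the second attempt also throws, the file is probably not well-formed for some other reason, and the exception is left to propagate to the caller.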
--

Martin Honnen --- MVP XML
http://JavaScript.FAQTs.com/
Oct 2 '07 #5
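
The secondary question in the first post (writing out a copy with line breaks and indentation) was not answered in the thread. One possible approach, as a sketch only, is to save the loaded document through an XmlWriter with indentation switched on. SaveIndented, the sample XML, and the output file name are hypothetical:

```vbnet
Imports System
Imports System.Xml

Module XmlPrettyPrinter
    ' Hypothetical helper: write an already loaded document back out with
    ' one element per line and two-space indentation.
    Sub SaveIndented(ByVal doc As XmlDocument, ByVal outputFile As String)
        Dim settings As New XmlWriterSettings()
        settings.Indent = True
        settings.IndentChars = "  "
        ' XmlWriterSettings.Encoding defaults to UTF-8, so the copy is
        ' written as UTF-8 regardless of the source file's encoding.
        Using writer As XmlWriter = XmlWriter.Create(outputFile, settings)
            doc.Save(writer)
        End Using
    End Sub

    Sub Main()
        ' Stand-in for one of the "one huge line" files.
        Dim doc As New XmlDocument()
        doc.LoadXml("<root><item>one</item><item>two</item></root>")
        SaveIndented(doc, "pretty.xml")
        Console.WriteLine("Wrote pretty.xml")
    End Sub
End Module
```

Writing to a new file rather than over the original keeps the source data intact in case the encoding guess was wrong.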
