Bytes | Software Development & Data Engineering Community

Parse XML document containing accented characters (é, à)

Hi everybody,

I would like to parse an XML file containing some accented
characters like è. This causes my Java program to throw this error:

file:/c:/dxl/Abbink.xml; Line 216; Column 26
XSL Error: Could not parse c:\dxl\Abbink.xml document!
XSL Error: SAX Exception
Invalid UTF-8 code. (bytes: 0xffffffe9 0x20)

How can I handle this?

Thanks in advance
Ndeye
Jul 20 '05 #1
Rule of thumb:
- If you specify UTF-8 in the XML declaration, i.e. encoding="UTF-8", then you must SAVE your document in UTF-8.
- Alternatively, you can specify for example encoding="ISO-8859-1" in the XML declaration and then save your document in plain Latin-1 (the same as ISO-8859-1). This should work for you. Saving the file as UTF-8 is of course an option too.
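
The rule of thumb above can be sketched in Java roughly as follows. This is only an illustration, assuming the standard JAXP parser that ships with the JDK; the class name `DeclaredEncodingDemo`, the helper `rootText`, and the tiny `<name>` document are made up for the example and are not from the original program. The accented character is written as the escape `\u00e9` so the example does not depend on how the Java source file itself is saved.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

// Hypothetical demo class: shows that the bytes on disk must match the
// encoding named in the XML declaration.
class DeclaredEncodingDemo {

    // Parses raw XML bytes and returns the text content of the root element.
    // The parser chooses the charset from the XML declaration inside the bytes.
    static String rootText(byte[] xmlBytes) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new ByteArrayInputStream(xmlBytes));
        return doc.getDocumentElement().getTextContent();
    }

    public static void main(String[] args) throws Exception {
        String latin1Xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><name>\u00e9</name>";
        String utf8Xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?><name>\u00e9</name>";

        // Declaration and saved bytes agree in both cases, so both parse.
        System.out.println(rootText(latin1Xml.getBytes(StandardCharsets.ISO_8859_1)));
        System.out.println(rootText(utf8Xml.getBytes(StandardCharsets.UTF_8)));

        // Mismatch: UTF-8 declared, but the bytes are Latin-1.
        try {
            rootText(utf8Xml.getBytes(StandardCharsets.ISO_8859_1));
            System.out.println("parsed (not expected)");
        } catch (Exception e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

The third case mirrors the situation in the question: the declaration promises UTF-8 while the file was saved in Latin-1, so the parser reports an invalid byte sequence.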

with respect,
Toni Uusitalo

Jul 20 '05 #2
Sorry, I didn't mention that if you don't specify any encoding,
XML defaults to UTF-8 (or UTF-16 when the file starts with a byte order mark).

It's a common mistake to SAVE a file in Latin-1 (a file that includes
characters above ASCII 127) without declaring ISO-8859-1. The parser then tries
to read the file as UTF-8 (the default) and of course fails.
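
To see why those two bytes from the error message fail, here is a small sketch that decodes them strictly with the JDK's `CharsetDecoder`. The class and method names (`StrictDecodeDemo`, `decodeStrict`, `isValid`) are invented for illustration. It assumes that `0xffffffe9` in the message is simply the byte 0xE9 printed sign-extended, which is é in ISO-8859-1; in UTF-8, 0xE9 starts a three-byte sequence, so a following space (0x20) is malformed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class: strict decoding of the bytes from the error message.
class StrictDecodeDemo {

    // Decodes bytes strictly: malformed input raises an exception instead of
    // being silently replaced with U+FFFD.
    static String decodeStrict(byte[] bytes, Charset charset) throws CharacterCodingException {
        return charset.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT)
                .decode(ByteBuffer.wrap(bytes))
                .toString();
    }

    // Returns true if the bytes are well-formed in the given charset.
    static boolean isValid(byte[] bytes, Charset charset) {
        try {
            decodeStrict(bytes, charset);
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The two bytes reported by the parser, assuming 0xffffffe9 means 0xE9.
        byte[] fromError = {(byte) 0xE9, 0x20};

        System.out.println("valid as UTF-8:      " + isValid(fromError, StandardCharsets.UTF_8));
        System.out.println("valid as ISO-8859-1: " + isValid(fromError, StandardCharsets.ISO_8859_1));
    }
}
```

A check like `isValid` can also be run over a whole file's bytes before parsing, to find out whether it was really saved as UTF-8.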

with respect,
Toni Uusitalo

Jul 20 '05 #3
You're right.
I specified UTF-8 and it works.
Thanks.
Jul 20 '05 #4
