Accentuated char

Hello,

The Read() method of the StreamReader class doesn't read accented
characters. When an accented character is present in a file, Read() skips
it and reads the following character. Am I missing something?

Thanks for your help,

P. Cloup
Nov 15 '05 #1
Pascal Cloup <pc****@biogesta.fr> wrote:
> The Read() method of the StreamReader class doesn't read accented
> characters. When an accented character is present in a file, Read() skips
> it and reads the following character. Am I missing something?


Chances are you've got the wrong encoding. It certainly *does* read
accented characters when everything is correct. How are you
constructing your StreamReader, and what's your data source?

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #2
Hello Jon,

"Jon Skeet [C# MVP]" <sk***@pobox.com> wrote in message
news:MP************************@msnews.microsoft.com...
> Pascal Cloup <pc****@biogesta.fr> wrote:
> > The Read() method of the StreamReader class doesn't read accented
> > characters. When an accented character is present in a file, Read() skips
> > it and reads the following character. Am I missing something?
>
> Chances are you've got the wrong encoding. It certainly *does* read
> accented characters when everything is correct. How are you
> constructing your StreamReader, and what's your data source?


In fact, I create a FileStream object and two reader objects:
itsFileStream = File.Open( itsPath , FileMode.Open );

itsBinaryReader = new BinaryReader( itsFileStream , Encoding.ASCII );
// Perhaps the problem is here, but I also tried UTF8

itsStreamReader = new StreamReader( itsFileStream );

Depending on the nature of the file (binary or text), I use one of the two
readers, but both remain open (??); I need the BinaryReader to determine
whether the file is text or not. Everything works fine except for accented
characters (é à è), which are skipped.

Any ideas?

Thanks in advance,

Pascal Cloup
Nov 15 '05 #3
Pascal Cloup <pc****@biogesta.fr> wrote:
> > Chances are you've got the wrong encoding. It certainly *does* read
> > accented characters when everything is correct. How are you
> > constructing your StreamReader, and what's your data source?
>
> In fact, I create a FileStream object and two reader objects:
> itsFileStream = File.Open( itsPath , FileMode.Open );
>
> itsBinaryReader = new BinaryReader( itsFileStream , Encoding.ASCII );
> // Perhaps the problem is here, but I also tried UTF8
>
> itsStreamReader = new StreamReader( itsFileStream );


That sounds like a bad idea to start with. Two readers on the same
stream are bound to cause problems: every read through one of them moves
the stream position that the other one sees.

> Depending on the nature of the file (binary or text), I use one of the two
> readers, but both remain open (??); I need the BinaryReader to determine
> whether the file is text or not. Everything works fine except for accented
> characters (é à è), which are skipped.


I thought the point was that it was definitely a text file - otherwise
why are you trying to read it? *Any* file can be a text file, but it
depends on what encoding is being used as to what that file means.

You need to know what encoding the file is in, and specify that to the
StreamReader - ignore the BinaryReader.
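
A minimal sketch of the advice above, assuming the file was written in ISO-8859-1 (used here only as a stand-in for a single-byte "ANSI" code page; the real encoding of Pascal's files is not stated in the thread). The class name and the sample bytes are made up for illustration. It uses one StreamReader per stream and passes the encoding explicitly:

```csharp
using System.IO;
using System.Text;

public static class EncodingMismatchDemo
{
    // "cafés" as an ISO-8859-1 writer would store it: é is the single byte 0xE9.
    // (Hypothetical sample data, not taken from the thread.)
    public static readonly byte[] Latin1Bytes = { 0x63, 0x61, 0x66, 0xE9, 0x73 };

    // Reads all text through a single StreamReader with an explicit encoding.
    public static string ReadAll(byte[] bytes, Encoding encoding)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

Calling ReadAll(Latin1Bytes, Encoding.GetEncoding("iso-8859-1")) gives back "cafés". Calling it with Encoding.UTF8 does not: the lone byte 0xE9 is not valid UTF-8, so the é is lost. Current .NET versions substitute a replacement character; the skipped characters reported in this thread are consistent with an older decoder silently dropping the invalid byte, though that is an inference rather than something the thread confirms.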

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #4
Hi Pascal,

Thanks for posting in the community.

From your description, I understand that you want to read accented
characters (in a .txt file) with StreamReader/BinaryReader.
Please correct me if I have misunderstood.

First, you need a valid data source, as Jon mentioned; for example,
you can save the data with UTF-8/Unicode encoding in Notepad.

Then, because an accented character is encoded as two bytes, I suggest
using the BinaryReader to read the accented characters:
itsBinaryReader = new BinaryReader(itsFileStream, Encoding.UTF8); // or Encoding.Unicode

Now you can use a Byte[] to read the accented characters from the stream
object.
Please apply my suggestion above and let me know if it helps resolve your
problem.
Thanks!

Best regards,

Gary Chang
Microsoft Online Partner Support

Get Secure! - www.microsoft.com/security
This posting is provided "AS IS" with no warranties, and confers no rights.
Nov 15 '05 #5
Gary Chang <v-******@online.microsoft.com> wrote:
> Then, because an accented character is encoded as two bytes, I suggest
> using the BinaryReader to read the accented characters:
> itsBinaryReader = new BinaryReader(itsFileStream, Encoding.UTF8); // or Encoding.Unicode
>
> Now you can use a Byte[] to read the accented characters from the stream
> object.


Using BinaryReader is a bad idea - StreamReader is designed for exactly
the job required.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #6
Hello Jon & Gary,

> You need to know what encoding the file is in, and specify that to the
> StreamReader - ignore the BinaryReader.

OK, now I create only a StreamReader object for text files.

When the file is created with Encoding.UTF8, the StreamReader object reads
the accented characters correctly.
But if a file is created with Encoding.Default (or ASCII), the StreamReader
object skips the accented characters.

I understand that my problem depends on the constraints of the encoding, but:

How can I know the encoding of a file before creating a reader?
How do I specify the encoding of a StreamReader?

Thanks for your help,

Pascal Cloup
Nov 15 '05 #7
Pascal Cloup <pc****@biogesta.fr> wrote:
> > You need to know what encoding the file is in, and specify that to the
> > StreamReader - ignore the BinaryReader.
>
> OK, now I create only a StreamReader object for text files.
>
> When the file is created with Encoding.UTF8, the StreamReader object reads
> the accented characters correctly.
> But if a file is created with Encoding.Default (or ASCII), the StreamReader
> object skips the accented characters.
>
> I understand that my problem depends on the constraints of the encoding, but:
>
> How can I know the encoding of a file before creating a reader?


You should just know it - there's no absolutely accurate way of
determining an encoding from just the binary data. There are ways you
can guess it heuristically, but there's nothing in the framework which
will do this for you. (Some encodings will fix themselves in terms of
endianness, but that's not the kind of issue you're looking at here.)

> How do I specify the encoding of a StreamReader?

StreamReader reader = new StreamReader(stream, encoding);
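
As a hedged illustration of the constructor above, the sketch below passes the encoding explicitly and also turns on the byte-order-mark detection that StreamReader offers through its (Stream, Encoding, bool) overload. The class name, the sample bytes, and the choice of ISO-8859-1 as the fallback are assumptions made for the example. A byte order mark only identifies the Unicode encodings; it does nothing to tell one single-byte code page from another, which is why the encoding still has to be known in advance.

```csharp
using System.IO;
using System.Text;

public static class ExplicitEncodingDemo
{
    // Reads all text, trusting a byte order mark if one is present and
    // otherwise decoding with the caller-supplied fallback encoding.
    public static string ReadAll(byte[] bytes, Encoding fallback)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, fallback, true))
        {
            return reader.ReadToEnd();
        }
    }
}
```

With Encoding.GetEncoding("iso-8859-1") as the fallback, the bytes E9 74 E9 (no byte order mark) decode as "été", and the bytes EF BB BF C3 A9 74 C3 A9 (UTF-8 with a byte order mark) also decode as "été", because the mark overrides the fallback.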

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #8
Hello,

Thank you both.

In the end, I use Encoding.Default (which corresponds to the ANSI character
set) for both the StreamWriter and the StreamReader. I cannot use UTF8/7 or
Unicode, for reasons of compatibility with files created by older
software or created on other platforms (Mac).

Nevertheless, it seems that the StreamReader doesn't use
Encoding.Default by default.
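
A small sketch of the two points above, under stated assumptions: the class and method names are invented for illustration, and ISO-8859-1 stands in for the ANSI code page so that the example behaves the same everywhere (Encoding.Default is the system ANSI code page on .NET Framework but UTF-8 on .NET Core and later). RoundTrip writes and reads with explicitly chosen encodings, and DefaultReaderEncodingName reports what a StreamReader built without an encoding starts from.

```csharp
using System.IO;
using System.Text;

public static class RoundTripDemo
{
    // Writes text with one encoding, then reads the bytes back with another.
    public static string RoundTrip(string text, Encoding writeEncoding, Encoding readEncoding)
    {
        byte[] bytes;
        using (var output = new MemoryStream())
        {
            using (var writer = new StreamWriter(output, writeEncoding))
            {
                writer.Write(text);
            }
            // ToArray still works after the writer has closed the stream.
            bytes = output.ToArray();
        }

        using (var input = new MemoryStream(bytes))
        using (var reader = new StreamReader(input, readEncoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Name of the encoding a StreamReader uses when none is specified.
    public static string DefaultReaderEncodingName()
    {
        using (var stream = new MemoryStream(new byte[0]))
        using (var reader = new StreamReader(stream))
        {
            return reader.CurrentEncoding.WebName;
        }
    }
}
```

The text "é à è" survives RoundTrip when both encodings are ISO-8859-1, and does not survive when it is written as ISO-8859-1 and read as UTF-8. DefaultReaderEncodingName() returns "utf-8".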

Kind regards,

Pascal Cloup
Nov 15 '05 #9
Pascal Cloup <pc****@biogesta.fr> wrote:
> Nevertheless, it seems that the StreamReader doesn't use
> Encoding.Default by default.

Indeed it doesn't. As the docs for the constructor StreamReader(Stream)
say:

<quote>
This constructor initializes the encoding to UTF8Encoding
</quote>

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #10
GG
I had a similar problem while reading a text file with
Unicode and UTF8. Neither returned characters that look unusual, for example ÿ.
I had to use Encoding.ASCII, which read all the characters but gave ????????? for
unknown ASCII characters.
I did not want to lose the bufferRead position, so I stayed with ASCII.

Nov 15 '05 #11
GG <gg@hotmail.com> wrote:
> I had a similar problem while reading a text file with
> Unicode and UTF8. Neither returned characters that look unusual, for example ÿ.
> I had to use Encoding.ASCII, which read all the characters but gave ????????? for
> unknown ASCII characters.
> I did not want to lose the bufferRead position, so I stayed with ASCII.


That suggests you were using the wrong encoding then - perhaps you
should have used Encoding.Default instead? It's hard to know without
knowing what you were trying to read - but you should know what
encoding your file is in rather than guessing until something works.
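
To make the point concrete, here is a hedged sketch of why Encoding.ASCII appears to read everything yet yields question marks. ASCII only defines the byte values 0 to 127, so any higher byte cannot be represented and is substituted. The class name and sample bytes are hypothetical, and ISO-8859-1 is again an assumed stand-in for whatever the file really used.

```csharp
using System.Text;

public static class AsciiLossDemo
{
    // Decodes raw bytes with the given encoding, with no stream involved.
    public static string Decode(byte[] bytes, Encoding encoding)
    {
        return encoding.GetString(bytes);
    }
}
```

For the bytes 63 61 66 E9 73, Decode with Encoding.ASCII returns "caf?s" on current .NET versions, while Decode with Encoding.GetEncoding("iso-8859-1") returns "cafés". The data is the same in both cases; only the interpretation differs.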

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #12
Hi Pascal,

Thanks for your response!

Skeet said:
> Using BinaryReader is a bad idea,
> StreamReader is designed for exactly the job required.

Yes, I agree: StreamReader is designed for text files.

However, the first time I tried to read the accented characters with
StreamReader, I got null for them, while the BinaryReader actually retrieved
the correct characters, so I thought the BinaryReader might be better.

Today I tested that program again and found that StreamReader.Read() works
fine this time, so I think I missed something.
Thanks!

Best regards,

Gary Chang
Microsoft Online Partner Support

Nov 15 '05 #13
Gary Chang <v-******@online.microsoft.com> wrote:
> > Using BinaryReader is a bad idea,
> > StreamReader is designed for exactly the job required.
>
> Yes, I agree: StreamReader is designed for text files.
>
> However, the first time I tried to read the accented characters with
> StreamReader, I got null for them, while the BinaryReader actually retrieved
> the correct characters, so I thought the BinaryReader might be better.


If BinaryReader was reading the correct characters, then you must have
been giving it the correct encoding, while giving StreamReader the
wrong encoding. BinaryReader isn't capable of guessing an encoding any
better than StreamReader is.
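
The following sketch illustrates that claim under assumed inputs: given the same bytes and the same encoding, a BinaryReader and a StreamReader decode the same characters. The class and method names are invented for the example, and the caller has to tell the BinaryReader how many characters to read, which is one reason StreamReader is the more natural fit for text.

```csharp
using System.IO;
using System.Text;

public static class ReaderComparisonDemo
{
    // Decodes a fixed number of characters through a BinaryReader.
    public static string ViaBinaryReader(byte[] bytes, Encoding encoding, int charCount)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new BinaryReader(stream, encoding))
        {
            return new string(reader.ReadChars(charCount));
        }
    }

    // Decodes the whole stream through a StreamReader.
    public static string ViaStreamReader(byte[] bytes, Encoding encoding)
    {
        using (var stream = new MemoryStream(bytes))
        using (var reader = new StreamReader(stream, encoding))
        {
            return reader.ReadToEnd();
        }
    }
}
```

For the ISO-8859-1 bytes 63 61 66 E9 73, both methods return "cafés" when given Encoding.GetEncoding("iso-8859-1"). Neither one can recover the é if it is handed an encoding that does not match the data.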

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 15 '05 #14
