ASCII files

I'm trying to load ASCII files that contain characters from the French
language in a way that is independent of whatever Locale the machine is
configured to use.

So if I have a machine whose default Locale is "en-US" and I open some
French text like this:

[C# example that has the same behaviour in any .NET language]

StreamReader sr = new StreamReader("C:\\someFrenchFile.txt");
string strInput = sr.ReadToEnd();

Suppose the file contains this:
"Le Québec en été."
the characters that I get in strInput are:
"Le Qu?bec en ?t?."

If I change the default Locale in the Control Panel and use
Encoding.Default in the StreamReader's constructor parameters, I get the
right characters in strInput:
"Le Québec en été."

What I'd like to be able to do is load the French string with the right
characters regardless of the machine's default Locale. What's the
way to programmatically decide which Locale to use with all ASCII strings?

Alexandre Leduc

Nov 15 '05 #1
Your StreamReader is missing something important: for the second parameter,
use System.Text.Encoding.ASCII;
otherwise it should be Unicode, I believe.

This should help you.

Nov 15 '05 #2
"Alex Leduc" <le******@netscape.net> wrote in message
news:eO**************@TK2MSFTNGP09.phx.gbl...
> I'm trying to load ASCII files that contain characters from the French
> language in a way that is independent of whatever Locale the machine is
> configured to use.
> [snip]
> What I'd like to be able to do is load the French string with the right
> characters regardless of what's the machine's default Locale. What's the
> way to programmatically decide what Locale to use with all ASCII strings?


If you know the file's code page, you can try passing the matching
encoding, e.g. Encoding.GetEncoding(codePage), to the StreamReader
constructor. [Warning! I haven't tried it myself :)]

OTOH, if you want to read an arbitrary file in an arbitrary language, I'm
afraid it's not possible... (or, at least, I don't know of a way...)
Nov 15 '05 #3
Alex Leduc <le******@netscape.net> wrote:
> I'm trying to load ASCII files that contain characters from the French
> language


Assuming you mean accented characters, that's impossible. ASCII doesn't
contain any accented characters.

See http://www.pobox.com/~skeet/csharp/unicode.html

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet/
If replying to the group, please do not mail me too
Nov 15 '05 #4
Dave Quigley[work] wrote:
> Your stream reader is missing something important: for the second parameter,
> use System.Text.Encoding.ASCII;
> otherwise it should be Unicode, I believe.


I forgot to mention that I've tried that and the result I get is:

"Le Qubec en t."

It removes all accented characters from the string.

Nov 15 '05 #5
Bruno Jouhier [MVP] wrote:
> ASCII is a 7-bit codeset and it does not cover accented characters.
>
> What you want is probably ISO-Latin1, also known as ISO-8859-1, which
> contains the French accented characters. So, you should specify this
> encoding when you open the StreamReader.
>
> Bruno.


Could you tell me how to do that in code? I find the SDK documentation
on this topic to be a bit confusing.

Nov 15 '05 #6

"Alex Leduc" <le******@netscape.net> wrote in message
news:eO**************@TK2MSFTNGP09.phx.gbl...
> I'm trying to load ASCII files that contain characters from the French
> language in a way that is independent of whatever Locale the machine is
> configured to use.


If it contains anything non-English (such as accented letters), it's not
ASCII.

What you have is some kind of extension of ASCII, and there are many such.
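
One quick way to confirm this for a given file is to look for any byte above 0x7F, since genuine ASCII never uses the eighth bit. A minimal sketch: the class and method names are made up for illustration, and the sample bytes assume the French text was saved as ISO-8859-1.

```csharp
using System;
using System.Text;

static class AsciiCheck
{
    // True only if every byte fits in 7 bits, i.e. the data is genuine ASCII.
    public static bool IsPureAscii(byte[] data)
    {
        foreach (byte b in data)
        {
            if (b > 0x7F) return false;
        }
        return true;
    }

    static void Main()
    {
        // Assumption: the text was saved as ISO-8859-1 (Latin-1).
        Encoding latin1 = Encoding.GetEncoding("ISO-8859-1");

        byte[] english = latin1.GetBytes("Quebec in summer.");
        // "Le Québec en été." written with escapes so the source file's own
        // encoding cannot interfere with the demo.
        byte[] french = latin1.GetBytes("Le Qu\u00E9bec en \u00E9t\u00E9.");

        Console.WriteLine(IsPureAscii(english)); // every byte is below 0x80
        Console.WriteLine(IsPureAscii(french));  // 'é' is byte 0xE9 in Latin-1
    }
}
```

On a real file you could pass File.ReadAllBytes(path) to the same check.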
Nov 15 '05 #7
Try:

StreamReader sr = new StreamReader("C:\\someFrenchFile.txt",
System.Text.Encoding.GetEncoding("ISO-8859-1") );
string strInput = sr.ReadToEnd();
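
A slightly fuller sketch of the same idea, in case it helps: it writes a small sample file as ISO-8859-1 bytes, then reads it back with the encoding given explicitly, so the result should not depend on the machine's Locale. The helper name, class name, and temp-file path are hypothetical, and it assumes the file really is ISO-8859-1.

```csharp
using System;
using System.IO;
using System.Text;

static class Latin1FileDemo
{
    // Reads the whole file, decoding its bytes with the named encoding
    // instead of whatever the machine's Locale implies.
    public static string ReadAllText(string path, string encodingName)
    {
        Encoding encoding = Encoding.GetEncoding(encodingName);
        using (StreamReader reader = new StreamReader(path, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    static void Main()
    {
        // Hypothetical sample file, created here so the demo is self-contained.
        string path = Path.Combine(Path.GetTempPath(), "someFrenchFile.txt");
        string original = "Le Qu\u00E9bec en \u00E9t\u00E9."; // "Le Québec en été."
        File.WriteAllBytes(path, Encoding.GetEncoding("ISO-8859-1").GetBytes(original));

        string text = ReadAllText(path, "ISO-8859-1");
        Console.WriteLine(text == original); // compares decoded text to the source string

        File.Delete(path);
    }
}
```

The using block also closes the reader when done, which the short snippet above leaves out.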

"Alex Leduc" <le******@netscape.net> wrote in message
news:O1**************@TK2MSFTNGP09.phx.gbl...
> Bruno Jouhier [MVP] wrote:
>> ASCII is a 7-bit codeset and it does not cover accented characters.
>>
>> What you want is probably ISO-Latin1, also known as ISO-8859-1, which
>> contains the French accented characters. So, you should specify this
>> encoding when you open the StreamReader.
>>
>> Bruno.
>
> Could you tell me how to do that in code? I find the SDK documentation
> on this topic to be a bit confusing.

Nov 15 '05 #8
>> Your stream reader is missing something important for the second parameter
>> use System.Text.Encoding.ASCII;
>> otherwise it should be Unicode, I believe.
>
> I forgot to mention that I've tried that and the result I get is:
> "Le Qubec en t."
> It removes all accented characters from the string.


Is it really ASCII (as in DOS / OEM), or is it ANSI (as in a regular
Windows file)??

If it's ANSI / Windows, try using System.Text.Encoding.Default. Works
for German umlauts for me :-)

Marc

================================================================
Marc Scheuner May The Source Be With You!
Bern, Switzerland m.scheuner(at)inova.ch
Nov 15 '05 #9
Yeah I think what I was talking about is ANSI. I never understood the
difference between the two so I assumed they were two different names
for the same thing.

Nov 15 '05 #10
Thanks a lot. That worked fine.

Now what I'd like to know is if there's a way to tell my application to
always use this encoding for whatever string-related methods/types it
has to use.

Kind of like in C

char *loc = setlocale(LC_ALL, "French_Canada.1252");

which can set the application's locale at a global scope.

Alexandre Leduc
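
As far as I know, .NET has no direct equivalent of setlocale for file encodings: each reader or writer takes its encoding as an argument. One common workaround is to decide the encoding once and route all file access through a small helper. A sketch only: `AppText`, `FileEncoding`, `Read`, and `Write` are hypothetical names, and ISO-8859-1 is an assumption carried over from the earlier answer.

```csharp
using System;
using System.IO;
using System.Text;

static class AppText
{
    // The single place where the application's file encoding is decided.
    public static readonly Encoding FileEncoding = Encoding.GetEncoding("ISO-8859-1");

    // Reads a whole file using the application-wide encoding.
    public static string Read(string path)
    {
        using (StreamReader reader = new StreamReader(path, FileEncoding))
        {
            return reader.ReadToEnd();
        }
    }

    // Writes (overwrites) a file using the application-wide encoding.
    public static void Write(string path, string text)
    {
        // false = overwrite rather than append
        using (StreamWriter writer = new StreamWriter(path, false, FileEncoding))
        {
            writer.Write(text);
        }
    }

    static void Main()
    {
        // Hypothetical temp file used only for this demo.
        string path = Path.Combine(Path.GetTempPath(), "appTextDemo.txt");
        string original = "Le Qu\u00E9bec en \u00E9t\u00E9."; // "Le Québec en été."

        Write(path, original);
        Console.WriteLine(Read(path) == original);

        File.Delete(path);
    }
}
```

Changing the encoding later then means editing one line rather than every StreamReader call.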

Nov 15 '05 #11
Thanks for the link. I really needed to read something like this.

Alexandre Leduc

Nov 15 '05 #12
>Yeah I think what I was talking about is ANSI. I never understood the
>difference between the two so I assumed they were two different names
>for the same thing.


No, not really - the ASCII stuff is "old" DOS age thingies - the ASCII
character set is standardized up to ASCII 127 and country-specific
above that - it usually contains things like French accented
characters, German Umlauts (ö ä ü) and so forth, plus line drawing
characters and a few mathematical symbols.

ANSI is the Windows base character set, which tossed out the
line-drawing characters and math stuff, and added extra characters.
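
To see the practical effect of that difference, here is a small sketch that decodes the same byte under a DOS (OEM) code page and under the Windows (ANSI) one. It uses code pages 850 and 1252 as examples. The class and method names are made up, and the provider registration assumes .NET Core 3.0 or later (on .NET Framework these code pages are available without it, and that line would need the System.Text.Encoding.CodePages package).

```csharp
using System;
using System.Text;

static class OemVersusAnsi
{
    static OemVersusAnsi()
    {
        // Assumption: .NET Core 3.0+ / .NET 5+, where legacy code pages such as
        // 850 and 1252 are only available after registering this provider.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Decodes one byte under the given code page number.
    public static string DecodeByte(byte value, int codePage)
    {
        return Encoding.GetEncoding(codePage).GetString(new byte[] { value });
    }

    static void Main()
    {
        // 0x82 is 'é' in code page 850; 0xE9 is 'é' in code page 1252.
        foreach (byte b in new byte[] { 0x82, 0xE9 })
        {
            // Code points are printed instead of the characters themselves so
            // the output does not depend on the console's own encoding.
            Console.WriteLine("0x{0:X2}: CP850 = U+{1:X4}, CP1252 = U+{2:X4}",
                b, (int)DecodeByte(b, 850)[0], (int)DecodeByte(b, 1252)[0]);
        }
    }
}
```

So a file written by a DOS program and read as a Windows file (or the other way round) keeps its plain letters but gets the wrong accented ones.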

Marc
================================================================
Marc Scheuner May The Source Be With You!
Bern, Switzerland m.scheuner(at)inova.ch
Nov 15 '05 #13
>Alex Leduc <le******@netscape.net> wrote:
>>I'm trying to load ASCII files that contain characters from the French
>>language
>
>Assuming you mean accented characters, that's impossible. ASCII doesn't
>contain any accented characters.


8-bit ASCII (e.g. codepage 850) does contain accented chars and German
umlauts etc - ASCII doesn't always stop at 7 bit, you know! There's a
whole wide world outside of English-speaking 7 bits! :-)

Marc
================================================================
Marc Scheuner May The Source Be With You!
Bern, Switzerland m.scheuner(at)inova.ch
Nov 15 '05 #14
Jon Skeet wrote:
> Alex Leduc <le******@netscape.net> wrote:
>> I'm trying to load ASCII files that contain characters from the
>> French language
>
> Assuming you mean accented characters, that's impossible. ASCII
> doesn't contain any accented characters.
>
> See http://www.pobox.com/~skeet/csharp/unicode.html


In addition to reading Jon's excellent article, I recommend having a
look at

http://www.microsoft.com/globaldev/r...ce/cphome.mspx

to understand the subtle differences between not-quite-the-same
encodings like ISO-8859-1(5) and CP-1252.
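
For a rough feel of how close those two are, the sketch below decodes every byte value under both encodings and lists the ones that come out differently. The names are hypothetical; 28591 is the code page number for ISO-8859-1, and the provider registration assumes .NET Core 3.0 or later, where code page 1252 is not available by default.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static class Latin1VersusCp1252
{
    static Latin1VersusCp1252()
    {
        // Assumption: .NET Core 3.0+ / .NET 5+; code page 1252 needs this provider there.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
    }

    // Returns every byte value that the two code pages decode to different characters.
    public static List<byte> DifferingBytes(int firstCodePage, int secondCodePage)
    {
        Encoding first = Encoding.GetEncoding(firstCodePage);
        Encoding second = Encoding.GetEncoding(secondCodePage);

        List<byte> result = new List<byte>();
        for (int value = 0; value <= 0xFF; value++)
        {
            byte[] single = { (byte)value };
            if (first.GetString(single) != second.GetString(single))
            {
                result.Add((byte)value);
            }
        }
        return result;
    }

    static void Main()
    {
        // 28591 = ISO-8859-1, 1252 = Windows-1252
        foreach (byte b in DifferingBytes(28591, 1252))
        {
            Console.WriteLine("0x{0:X2}", b);
        }
    }
}
```

The differences all fall in the 0x80 to 0x9F range, where CP-1252 puts printable characters such as the euro sign (0x80) and ISO-8859-1 has control codes; the French accented letters sit above 0xA0 and decode the same way in both.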

Cheers,
--
Joerg Jooss
jo*********@gmx.net
Nov 15 '05 #15
