Encoding.Default - reliable ???

I am having a lot of difficulty with text files in .NET when they have special
characters like í, ó, ç etc...

When I read a text file with them and then write it back out it ignores all
of those characters completely.

I tried all the encoding types and it seems only Encoding.Default does it
right... but it sounds dangerous... can I rely on Encoding.Default behaving
like this for all other machines?
Nov 2 '06 #1
Thus wrote MrNobody,
> I am having a lot of difficulty with text files in .NET when they have
> special characters like í, ó, ç etc...
>
> When I read a text file with them and then write it back out it
> ignores all of those characters completely.
>
> I tried all the encoding types and it seems only Encoding.Default does
> it right...

The question is how do you check the resulting file? It's pretty likely that
your editor wasn't up to the task of decoding the file correctly.

> but it sounds dangerous... can I rely on Encoding.Default
> behaving like this for all other machines?

Depends on what reach you require, but around the globe, certainly not. UTF-8
or UTF-16 are much better choices.

Cheers,
--
Joerg Jooss
ne********@joergjooss.de
Nov 2 '06 #2
Hi Nobody,

According to the .NET Framework docs, Encoding.Default is the system's current
ANSI code page. So it can easily change, above all if the application runs on
another system.
Encoding.Unicode and Encoding.UTF8 can encode any character, so they should
work fine.
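
To make the "can encode any character" point concrete, here is a minimal sketch, not taken from the thread. It checks whether a string survives an encode-then-decode cycle through a given encoding. The class and method names (`RoundTripDemo`, `SurvivesRoundTrip`) are made up for illustration.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Hypothetical helper: returns true when every character of `text`
    // survives an encode-then-decode cycle through `encoding`.
    public static bool SurvivesRoundTrip(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);   // string -> bytes
        string decoded = encoding.GetString(bytes); // bytes -> string
        return decoded == text;
    }
}
```

Called with "í ó ç", this should return true for Encoding.UTF8 and Encoding.Unicode, and false for Encoding.ASCII, which substitutes '?' for characters it cannot represent. Note that this only shows what an encoding is able to store. It says nothing about which encoding an existing file was saved in.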

"MrNobody" <Mr******@discussions.microsoft.com> wrote in message
news:90**********************************@microsoft.com...
> I am having a lot of difficulty with text files in .NET when they have
> special characters like í, ó, ç etc...
>
> When I read a text file with them and then write it back out it ignores
> all of those characters completely.
>
> I tried all the encoding types and it seems only Encoding.Default does it
> right... but it sounds dangerous... can I rely on Encoding.Default
> behaving like this for all other machines?

Nov 2 '06 #3
OK, but I have a problem when I use either Encoding.Unicode or Encoding.UTF8.

string src = // path to source file
string tgt = // path where to write new file to

string txt = System.IO.File.ReadAllText(src, Encoding.Unicode);

Console.WriteLine("index = " + txt.IndexOf("something"));

System.IO.File.WriteAllText(tgt, txt, Encoding.UTF8); // or Encoding.Unicode

When I run that code, the index is always -1 for a string which is
definitely in the file, and the file it writes out has complete data loss; it
is just full of these little black boxes...
Nov 2 '06 #4
How did you generate the text file beforehand? Obviously it is not stored in
UTF-16 encoding. If you used Notepad and simply hit 'save', I guess it is
stored in the standard code page of your system. So Encoding.Default would be
right here. But if the text file was created on a machine with a different
default code page, even this will not work. In Notepad you can choose the
encoding in the 'save as' dialog. Other editors will have similar features.

In any case you will first have to know in which encoding the file was
stored. Alas, there is no general way to detect it. At least I don't know of one.

The best situation is when you can agree with the creator of the source file
on the encoding. The next best situation is when someone knows the encoding
used while creating the file.
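
On the point that there is no general way to detect the encoding: the one partial exception is a byte-order mark (BOM) at the start of the file. The sketch below is not from the thread and the names (`BomSniffer`, `DetectFromBom`) are made up for illustration. It only recognises the UTF-8 and UTF-16 BOMs and returns null for everything else, which includes every plain ANSI file.

```csharp
using System;
using System.Text;

public static class BomSniffer
{
    // Hypothetical helper: returns the encoding indicated by a byte-order
    // mark at the start of `data`, or null when no UTF-8/UTF-16 BOM is
    // present. A null result does NOT identify the encoding; it is the
    // normal outcome for ANSI files and for UTF-8 files saved without a BOM.
    // Limitation: a UTF-32 little-endian BOM (FF FE 00 00) starts with the
    // same two bytes as UTF-16 little endian and is not distinguished here.
    public static Encoding DetectFromBom(byte[] data)
    {
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return Encoding.UTF8;             // UTF-8 BOM
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return Encoding.Unicode;          // UTF-16 little endian
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return Encoding.BigEndianUnicode; // UTF-16 big endian
        return null;                          // unknown: caller must decide
    }
}
```

A caller could pass in the first few bytes of the file, or the result of System.IO.File.ReadAllBytes for small files. When the result is null, the encoding still has to come from whoever created the file.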

"MrNobody" <Mr******@discussions.microsoft.com> wrote in message
news:6C**********************************@microsoft.com...
> OK, but I have a problem when I use either Encoding.Unicode or
> Encoding.UTF8.
>
> string src = // path to source file
> string tgt = // path where to write new file to
>
> string txt = System.IO.File.ReadAllText(src, Encoding.Unicode);
>
> Console.WriteLine("index = " + txt.IndexOf("something"));
>
> System.IO.File.WriteAllText(tgt, txt, Encoding.UTF8); // or Encoding.Unicode
>
> When I run that code, the index is always -1 for a string which is
> definitely in the file, and the file it writes out has complete data loss;
> it is just full of these little black boxes...


Nov 2 '06 #5
Well, all I know is the files are created on Windows machines only, using
such programs as Notepad.

And this app is targeted at Windows machines only, but it will be used by
people across the globe who may need those special characters.

So is it safe then, given these restrictions, to rely on Encoding.Default?

I still really don't understand why I can get the files to read/write OK
using Encoding.Default but using any of the specific encodings fails...
Nov 2 '06 #6
MrNobody <Mr******@discussions.microsoft.com> wrote:
> Well, all I know is the files are created on Windows machines only, using
> such programs as Notepad.
>
> And this app is targeted at Windows machines only, but it will be used by
> people across the globe who may need those special characters.
>
> So is it safe then, given these restrictions, to rely on Encoding.Default?
>
> I still really don't understand why I can get the files to read/write OK
> using Encoding.Default but using any of the specific encodings fails...

You won't be able to correctly read a file unless you know its
encoding. For instance, if you try to read a UTF-8 encoded file using
Encoding.Default, then any characters outside the ASCII range are
likely to end up being corrupted.
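
A small sketch of that corruption, not taken from the thread: encode text with one encoding and decode the same bytes with another. The names (`MojibakeDemo`, `Misread`) are made up for illustration. In the usage below, ISO-8859-1 stands in for a Western European ANSI code page; that is an assumption (Windows-1252 differs from it only in the 0x80-0x9F range).

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Hypothetical helper: encodes `text` with one encoding and decodes the
    // resulting bytes with another, which is what effectively happens when a
    // file is read with an encoding other than the one it was written in.
    public static string Misread(string text, Encoding writtenWith, Encoding readWith)
    {
        byte[] bytes = writtenWith.GetBytes(text);
        return readWith.GetString(bytes);
    }
}
```

For example, Misread("í", Encoding.UTF8, Encoding.GetEncoding("iso-8859-1")) turns the single character í (UTF-8 bytes C3 AD) into two characters: Ã followed by an invisible soft hyphen (U+00AD). Plain ASCII text such as "abc" comes through unchanged, which is why the damage only shows up on accented characters.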

It sounds like you might be a bit fuzzy on what encodings are about.
See if
http://www.pobox.com/~skeet/csharp/unicode.html helps.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Nov 2 '06 #7
Thus wrote MrNobody,
> OK, but I have a problem when I use either Encoding.Unicode or
> Encoding.UTF8.
>
> string src = // path to source file
> string tgt = // path where to write new file to
> string txt = System.IO.File.ReadAllText(src, Encoding.Unicode);
>
> Console.WriteLine("index = " + txt.IndexOf("something"));
>
> System.IO.File.WriteAllText(tgt, txt, Encoding.UTF8); // or Encoding.Unicode
>
> When I run that code, the index is always -1 for a string which is
> definitely in the file, and the file it writes out has complete data
> loss; it is just full of these little black boxes...

If you're consuming text files that have been authored outside of your application,
you have to use the encoding that was used to create the file in order to
read it.

Notepad, for example, can create both UTF-8 and UTF-16 encoded files, but neither
is its default encoding. So if you've created your test files in Notepad
without considering the encoding, they will end up encoded as something that
is compatible with or equal to Encoding.Default.
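
Put together, the advice amounts to reading with the encoding the file was really saved in and, if the output should travel between machines, writing it back as UTF-8. A minimal sketch under those assumptions follows; the names (`Transcoder`, `ToUtf8`) are made up for illustration and the caller has to supply the source encoding.

```csharp
using System.IO;
using System.Text;

public static class Transcoder
{
    // Hypothetical helper: reads `sourcePath` using the encoding the file
    // was actually saved in and rewrites the text to `targetPath` as UTF-8.
    // File.ReadAllText also honours a byte-order mark if one is present, so
    // a BOM-marked UTF-8 or UTF-16 file is still decoded correctly.
    public static void ToUtf8(string sourcePath, string targetPath, Encoding sourceEncoding)
    {
        string text = File.ReadAllText(sourcePath, sourceEncoding);
        // Encoding.UTF8 writes a BOM, which lets other tools recognise the result.
        File.WriteAllText(targetPath, text, Encoding.UTF8);
    }
}
```

For Notepad files saved on the same machine under the .NET Framework, the source encoding would be Encoding.Default. On .NET Core and .NET 5 or later, Encoding.Default is always UTF-8, so a specific code page has to be named instead, and the Windows code pages are only available there after registering the provider from the System.Text.Encoding.CodePages package.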

Cheers,
--
Joerg Jooss
ne********@joergjooss.de
Nov 3 '06 #8
