Encoding.Default - reliable ???

I am having a lot of difficulty with text files in .NET when they have special
characters like í, ó, ç etc...

When I read a text file with them and then write it back out it ignores all
of those characters completely.

I tried all the encoding types and it seems only Encoding.Default does it
right... but it sounds dangerous... can I rely on Encoding.Default behaving
like this for all other machines?
Nov 2 '06 #1
Thus wrote MrNobody,
> I am having a lot of difficulty with text files in .NET when they have
> special characters like í, ó, ç etc...
>
> When I read a text file with them and then write it back out it
> ignores all of those characters completely.
>
> I tried all the encoding types and it seems only Encoding.Default does
> it right...

The question is how you checked the resulting file. It's pretty likely that
your editor wasn't up to the task of decoding the file correctly.

> but it sounds dangerous... can I rely on Encoding.Default
> behaving like this for all other machines?

That depends on the reach you require, but around the globe, certainly not.
UTF-8 or UTF-16 are much better choices.

Cheers,
--
Joerg Jooss
ne********@joergjooss.de
Nov 2 '06 #2
Hi Nobody,

According to the .NET Framework docs, Encoding.Default is the system's current
ANSI code page. So it can easily change, above all if the application runs on
another system.
Encoding.Unicode and Encoding.UTF8 can encode any character, so they should
work fine.
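
To illustrate the point about code pages versus Unicode encodings, here is a minimal sketch. It assumes C# 9 or later (top-level statements), uses ISO-8859-1 purely as a stand-in for "some ANSI code page", and the helper name RoundTrip is made up for this example.

```csharp
using System;
using System.Text;

// Encode text to bytes and decode it again with the same encoding.
static string RoundTrip(string text, Encoding encoding)
{
    return encoding.GetString(encoding.GetBytes(text));
}

// Escapes keep this source ASCII-only. The characters are: í ó ç ğ €
string sample = "\u00ED \u00F3 \u00E7 \u011F \u20AC";

// Assumption for the demo: ISO-8859-1 plays the role of a single-byte code page.
Encoding singleByte = Encoding.GetEncoding("iso-8859-1");

bool utf8Ok = RoundTrip(sample, Encoding.UTF8) == sample;
bool utf16Ok = RoundTrip(sample, Encoding.Unicode) == sample;
bool singleByteOk = RoundTrip(sample, singleByte) == sample;

Console.WriteLine("UTF-8 round trip intact:       " + utf8Ok);
Console.WriteLine("UTF-16 round trip intact:      " + utf16Ok);
Console.WriteLine("Single-byte round trip intact: " + singleByteOk);
```

The single-byte encoding cannot represent ğ or €, so its round trip alters the text, while UTF-8 and UTF-16 return the string unchanged.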
Nov 2 '06 #3
OK, but I have a problem when I use either Encoding.Unicode or Encoding.UTF8.

string src = // path to source file
string tgt = // path where to write new file to

string txt = System.IO.File.ReadAllText(src, Encoding.Unicode);

Console.WriteLine("index = " + txt.IndexOf("something"));

System.IO.File.WriteAllText(tgt, txt, Encoding.UTF8); // or Encoding.Unicode

When I run that code, the index is always -1 for a string which is
definitely in the file and the file it prints out has complete data loss, it
is just full of these little black boxes...
Nov 2 '06 #4
How did you generate the text file beforehand? Obviously it is not stored in
UTF-16 encoding. If you used Notepad and simply hit 'Save', I guess it is
stored in the standard code page of your system, so Encoding.Default would be
right here. But if the text file was created on a machine with a different
default code page, even this will not work. In Notepad you can choose the
encoding in the 'Save As' dialog. Other editors will have similar features.

In any case, you will first have to know in which encoding the file was
stored. Alas, there is no general way to detect it, at least none that I know of.

The best situation is when you can agree on the encoding with the creator of
the source file. The next best situation is when someone knows the encoding
that was used to create the file.
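
One partial aid is the byte order mark (BOM) that some editors write at the start of Unicode files. The sketch below assumes C# 9 or later; the helper names StartsWith and DescribeBom are hypothetical. It only recognises files that actually carry a BOM. A file saved in an ANSI code page, or as UTF-8 without a BOM, comes back as "unknown", so this does not replace knowing the encoding.

```csharp
using System;
using System.Text;

// True when data begins with the given prefix bytes.
static bool StartsWith(byte[] data, params byte[] prefix)
{
    if (data.Length < prefix.Length) return false;
    for (int i = 0; i < prefix.Length; i++)
    {
        if (data[i] != prefix[i]) return false;
    }
    return true;
}

// Name the encoding suggested by a leading BOM, or "unknown" if there is none.
static string DescribeBom(byte[] data)
{
    if (StartsWith(data, 0xEF, 0xBB, 0xBF)) return "UTF-8";
    // Check UTF-32 LE before UTF-16 LE: its BOM starts with the same two bytes.
    if (StartsWith(data, 0xFF, 0xFE, 0x00, 0x00)) return "UTF-32 LE";
    if (StartsWith(data, 0xFF, 0xFE)) return "UTF-16 LE";
    if (StartsWith(data, 0xFE, 0xFF)) return "UTF-16 BE";
    return "unknown";
}

Console.WriteLine(DescribeBom(Encoding.UTF8.GetPreamble()));
Console.WriteLine(DescribeBom(Encoding.Unicode.GetPreamble()));
// "café" as single-byte code page bytes: no BOM, so nothing can be concluded.
Console.WriteLine(DescribeBom(new byte[] { 0x63, 0x61, 0x66, 0xE9 }));
```

StreamReader has a similar built-in check through its detectEncodingFromByteOrderMarks constructor parameter, with the encoding you pass in used when no BOM is present.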
Nov 2 '06 #5
Well, all I know is the files are created on Windows machines only, using
such programs as Notepad.

And this app is targeted at Windows machines only, but it will be used by
people across the globe who may need those special characters.

So is it safe then, given these restrictions, to rely on Encoding.Default?

I still really don't understand why I can get the files to read/write OK
using Encoding.Default but using any of the specific encodings fails...
Nov 2 '06 #6
MrNobody <Mr******@discussions.microsoft.com> wrote:
> Well, all I know is the files are created on Windows machines only, using
> such programs as Notepad.
>
> And this app is targeted at Windows machines only, but it will be used by
> people across the globe who may need those special characters.
>
> So is it safe then, given these restrictions, to rely on Encoding.Default?
>
> I still really don't understand why I can get the files to read/write OK
> using Encoding.Default but using any of the specific encodings fails...

You won't be able to correctly read a file unless you know its
encoding. For instance, if you try to read a UTF-8 encoded file using
Encoding.Default, then any characters outside the ASCII range are
likely to end up being corrupted.
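
A small sketch of that corruption, assuming C# 9 or later and using ISO-8859-1 as a stand-in for a single-byte default code page (the variable names are only illustrative). The same UTF-8 bytes are decoded twice, once with the wrong encoding and once with the right one.

```csharp
using System;
using System.Text;

// Escapes keep this source ASCII-only. The text is: í ó ç
string original = "\u00ED \u00F3 \u00E7";

// In UTF-8 each accented letter becomes two bytes, each space one byte.
byte[] utf8Bytes = Encoding.UTF8.GetBytes(original);

// Assumption for the demo: ISO-8859-1 plays the role of the default code page.
Encoding singleByte = Encoding.GetEncoding("iso-8859-1");

string wrong = singleByte.GetString(utf8Bytes);    // every byte becomes its own character
string right = Encoding.UTF8.GetString(utf8Bytes); // byte pairs are combined again

Console.WriteLine("characters in original:   " + original.Length);
Console.WriteLine("bytes in UTF-8:           " + utf8Bytes.Length);
Console.WriteLine("characters after misread: " + wrong.Length);
Console.WriteLine("misread equals original:  " + (wrong == original));
Console.WriteLine("correct read is intact:   " + (right == original));
```

The ASCII spaces come through either way. Only the accented letters are split into two unrelated characters, which is why a search for the original text then fails.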

It sounds like you might be a bit fuzzy on what encodings are about.
See if http://www.pobox.com/~skeet/csharp/unicode.html helps.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet Blog: http://www.msmvps.com/jon.skeet
If replying to the group, please do not mail me too
Nov 2 '06 #7
Thus wrote MrNobody,
> OK, but I have a problem when I use either Encoding.Unicode or
> Encoding.UTF8.
>
> string src = // path to source file
> string tgt = // path where to write new file to
> string txt = System.IO.File.ReadAllText(src, Encoding.Unicode);
>
> Console.WriteLine("index = " + txt.IndexOf("something"));
>
> System.IO.File.WriteAllText(tgt, txt, Encoding.UTF8); // or Encoding.Unicode
>
> When I run that code, the index is always -1 for a string which is
> definitely in the file and the file it prints out has complete data
> loss, it is just full of these little black boxes...

If you're consuming text files that have been authored outside of your application,
you have to use the encoding that was used to create the file in order to
read it.

Notepad for example can create both UTF-8 and UTF-16 encoded files, but neither
is its default encoding. So if you've created your test files in Notepad
without considering the encoding, they will end up encoded as something that
is compatible with or equal to Encoding.Default.
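
Putting this together, a minimal sketch of reading with the encoding the file was really written in and saving it as UTF-8. It assumes C# 9 or later, uses ISO-8859-1 as the agreed source encoding purely for the demo, and the helper ConvertToUtf8 is a made-up name. Naming the encoding explicitly also avoids depending on Encoding.Default, which is the ANSI code page on .NET Framework but UTF-8 on .NET Core and later.

```csharp
using System;
using System.IO;
using System.Text;

// Read a file in its known source encoding and rewrite it as UTF-8 (with BOM).
static void ConvertToUtf8(string sourcePath, string targetPath, Encoding sourceEncoding)
{
    string text = File.ReadAllText(sourcePath, sourceEncoding);
    File.WriteAllText(targetPath, text, Encoding.UTF8);
}

// Escapes keep this source ASCII-only. The text is: café something ç
string original = "caf\u00E9 something \u00E7";

// Assumption for the demo: the file's author told us it is ISO-8859-1.
Encoding assumedSource = Encoding.GetEncoding("iso-8859-1");

// Temporary files stand in for the real source and target paths.
string src = Path.GetTempFileName();
string tgt = Path.GetTempFileName();
File.WriteAllBytes(src, assumedSource.GetBytes(original));

ConvertToUtf8(src, tgt, assumedSource);
string converted = File.ReadAllText(tgt, Encoding.UTF8);
int index = converted.IndexOf("something", StringComparison.Ordinal);

Console.WriteLine("index = " + index);
Console.WriteLine("text preserved: " + (converted == original));

File.Delete(src);
File.Delete(tgt);
```

Because the source encoding matches the bytes on disk, the search succeeds and the accented characters survive the conversion.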

Cheers,
--
Joerg Jooss
ne********@joergjooss.de
Nov 3 '06 #8
