Character encoding problem?

Hi,

I am reading a text file using a StreamReader in C#, but the reader is unable
to handle some of the characters.

With the default encoding the program cannot handle accented characters. I
tried opening the file using other encodings, e.g. UTF7.
UTF7 fixed the accents but cannot handle the plus sign (0x2B). I am also having
problems with the Euro symbol (0x80) and the quote character (0x92).

How do I read this file correctly?
Is this problem caused by the encoding?
Is there a way to determine the file's encoding at runtime?
How else can I find out the encoding?

Thanks

Colin

Jul 21 '05 #1
CMan <cm**@nospam.nospam> wrote:
> I am reading a text file using a StreamReader in C# but the reader is unable
> to handle some of the characters.
>
> Using the default encoding the program cannot handle accented characters. I
> tried opening the file using other encodings e.g. UTF7.
> UTF7 fixed the accents but cannot handle the plus sign x2B. I am also having
> problems with the Euro symbol x80 and quote x92.

If your file is using 0x80 for the Euro symbol, that should help to
narrow it down...

Have you tried using Encoding.Default? That's not the same as the
default encoding for StreamReader when you don't specify an encoding.
(It's the default encoding for your computer, instead.)
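
To illustrate the difference described above, here is a minimal sketch, assuming the .NET Framework on Windows, where Encoding.Default is the machine's ANSI code page. (On .NET Core and later, Encoding.Default is UTF-8, so the code page would have to be named explicitly instead.) The class name, helper method and sample bytes are made up for illustration; the bytes are the ones mentioned in the question.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingDefaultSketch
{
    // Reads the whole file, decoding it with the encoding the caller supplies.
    public static string ReadAllText(string path, Encoding encoding)
    {
        using (var reader = new StreamReader(path, encoding))
        {
            return reader.ReadToEnd();
        }
    }

    public static void Main()
    {
        // Hypothetical sample file containing the bytes from the question:
        // 0x80 (Euro sign), 0x2B (plus sign), 0x92 (quote).
        string path = Path.GetTempFileName();
        File.WriteAllBytes(path, new byte[] { 0x80, 0x2B, 0x92 });

        // Explicitly ask for the machine's default encoding
        // (assumption: this matches the encoding the file was written in).
        string viaDefault = ReadAllText(path, Encoding.Default);

        // No encoding specified: StreamReader assumes UTF-8. The lone bytes
        // 0x80 and 0x92 are not valid UTF-8, so each one is decoded as the
        // replacement character U+FFFD.
        string viaUtf8;
        using (var reader = new StreamReader(path))
        {
            viaUtf8 = reader.ReadToEnd();
        }

        Console.WriteLine("Encoding.Default: " + viaDefault);
        Console.WriteLine("No encoding:      " + viaUtf8);

        File.Delete(path);
    }
}
```

Note that this only works when the file was written with the same code page as the machine reading it. If the file can come from other machines, it is safer to name the encoding explicitly.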

> How do I read this file correctly?

By specifying the correct encoding.

> Is this problem caused by the encoding?

Almost certainly.

> Is there a way to determine the file's encoding at runtime?

There are ways you can try to guess it heuristically, but nothing
foolproof.
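
One possible heuristic is sketched below. It is not from the original reply and is not a definitive detector: it looks for a byte order mark, then checks whether the bytes are valid UTF-8, and otherwise returns whatever fallback the caller supplies. The class and method names are hypothetical.

```csharp
using System;
using System.Text;

public static class EncodingGuessSketch
{
    // Guesses an encoding for the given bytes. This is a heuristic only:
    // it can tell "valid UTF-8" from "not UTF-8", but it cannot tell one
    // single-byte code page from another. UTF-32 BOMs are not handled.
    public static Encoding Guess(byte[] data, Encoding fallback)
    {
        // 1. A byte order mark is the strongest hint available.
        if (data.Length >= 3 && data[0] == 0xEF && data[1] == 0xBB && data[2] == 0xBF)
            return Encoding.UTF8;
        if (data.Length >= 2 && data[0] == 0xFF && data[1] == 0xFE)
            return Encoding.Unicode;          // UTF-16 little-endian
        if (data.Length >= 2 && data[0] == 0xFE && data[1] == 0xFF)
            return Encoding.BigEndianUnicode; // UTF-16 big-endian

        // 2. No BOM: check whether the bytes form valid UTF-8.
        //    The second argument makes the decoder throw on invalid bytes.
        var strictUtf8 = new UTF8Encoding(false, true);
        try
        {
            strictUtf8.GetString(data);
            return strictUtf8;
        }
        catch (DecoderFallbackException)
        {
            // 3. Not UTF-8, so use the caller's best guess
            //    (for example Encoding.Default, as suggested in this thread).
            return fallback;
        }
    }

    public static void Main()
    {
        // The bytes from the question are not valid UTF-8,
        // so the fallback encoding is returned.
        byte[] sample = { 0x80, 0x2B, 0x92 };
        Encoding guess = Guess(sample, Encoding.Default);
        Console.WriteLine(guess.WebName);
    }
}
```

Plain ASCII text passes the UTF-8 check as well, which is harmless because ASCII bytes decode identically in UTF-8 and in the common single-byte code pages.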

> How else can I find out the encoding?

Well, what generated this text file to start with?

See http://www.pobox.com/~skeet/csharp/unicode.html for more
information.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Jul 21 '05 #2
Thanks Jon.

You got it in one. Encoding.Default fixed it on my machine.

Should have spotted that one.

Colin

Jul 21 '05 #3
