An input UTF-8 encoded file is output as an ANSI encoded file. Why?

58 New Member
I am reading a UTF-8 encoded file into memory using the following code...

            try
            {
                // No encoding is passed, so StreamReader decodes the file as UTF-8.
                using (StreamReader readFile = new StreamReader(pathNames[0]))
                {
                    // Read until ReadLine() returns null at end of file.
                    while ((line = readFile.ReadLine()) != null)
                    {
                        compareData.Add(line);
                    }
                }
            }
            catch (Exception f)
            {
                Console.WriteLine(f.Message);
                Console.ReadLine();
            }

The input data includes some special characters such as "ü" and "ß". When I check the input "line" in debug mode, and the subsequent output file, those characters look like this: "�". Why?

Joe
Apr 28 '12 #1
RhysW
70 New Member
Because the software you are storing it in doesn't know what those characters are. Those characters aren't part of the font that's being used, so there just isn't an equivalent graphical representation of the code value of those characters.
Apr 30 '12 #2
Joe Campanini
58 New Member
@RhysW
The software is C# 2010, and it does recognize the character, because I can key in Alt+129 in a string and the character ü appears in the software. No, this has something to do with the way that the file is being read; I just don't know what! I have just seen that if I open the .txt file using Excel the same thing happens, but if I open the file with Notepad the characters are OK. But hey! Thank you for at least making a sensible suggestion, I appreciate it. Joe
Apr 30 '12 #3
RhysW
70 New Member
No, I mean the file, not Visual Studio or its equivalent. I mean the literal file that the text is stored in. If you open some files in Notepad and it doesn't know a symbol, it displays that question mark in its place, so I thought Notepad would show the question mark rather than the character for your file too. This might be the problem, though I haven't checked.

Edit: after checking, Notepad does support those characters, so I'm not sure. What software was the file actually saved with?
Apr 30 '12 #4
Joe Campanini
58 New Member
Thanks for your input, and sorry I did not reply sooner, but I managed to get round the problem, I think. I opened the .txt file with Notepad, copied and pasted the whole file into a new Notepad window, and saved it as a UTF-8 file. The C# program seems happy with this, but Excel still doesn't like it. I have seen some funny things in my life in the IT world, but this has got to be one of the strangest. I bet the answer is really simple, but I don't have time to investigate. Once again, thank you for your input. Joe
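
A possible explanation, offered as a guess rather than a diagnosis: the original file may have been saved as ANSI rather than UTF-8, in which case re-saving it from Notepad as UTF-8 is what fixed the read. In ANSI (Windows-1252), "ü" is the single byte 0xFC, which is not valid UTF-8, so a UTF-8 decoder substitutes "�". The sketch below is one way to test a file's bytes; the class and method names (Utf8Check, IsValidUtf8, HasUtf8Bom) are made up for illustration.

```csharp
using System;
using System.Text;

static class Utf8Check
{
    // Returns true if the bytes decode as UTF-8 without any invalid sequences.
    public static bool IsValidUtf8(byte[] bytes)
    {
        // Second argument (throwOnInvalidBytes) = true: the decoder throws
        // instead of silently substituting U+FFFD for bad bytes.
        var strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetString(bytes);
            return true;
        }
        catch (DecoderFallbackException)
        {
            return false;
        }
    }

    // Returns true if the bytes start with the UTF-8 byte order mark EF BB BF.
    public static bool HasUtf8Bom(byte[] bytes)
    {
        return bytes.Length >= 3
            && bytes[0] == 0xEF && bytes[1] == 0xBB && bytes[2] == 0xBF;
    }

    static void Main()
    {
        byte[] ansiUmlaut = { 0xFC };        // "ü" in Windows-1252 / ISO-8859-1
        byte[] utf8Umlaut = { 0xC3, 0xBC };  // "ü" in UTF-8

        Console.WriteLine(IsValidUtf8(ansiUmlaut)); // False
        Console.WriteLine(IsValidUtf8(utf8Umlaut)); // True
        Console.WriteLine(HasUtf8Bom(utf8Umlaut));  // False
    }
}
```

To check a real file, pass File.ReadAllBytes(path) to both methods. A file made only of plain ASCII characters passes the UTF-8 check too, so a "true" result only means the bytes are consistent with UTF-8.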
May 3 '12 #5
Plater
7,872 Recognized Expert Expert
On the .NET Framework, Encoding.Default is the system ANSI code page (Windows-1252 in an en-US locale), not UTF-8.
Did you set the encoding type of the output stream? For instance, StreamWriter takes an encoding parameter (which could be UTF-8).
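
A minimal sketch of passing the encoding explicitly on both sides, in case it helps. It assumes the input really is ANSI; ISO-8859-1 is used as a stand-in because it is available on every .NET runtime and agrees with Windows-1252 for "ü" and "ß". The class and method names (EncodingDemo, ReadLines, WriteLinesUtf8) and the temp files are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

static class EncodingDemo
{
    // Reads every line of a text file, decoding the bytes with the given encoding.
    public static List<string> ReadLines(string path, Encoding encoding)
    {
        var lines = new List<string>();
        using (var reader = new StreamReader(path, encoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                lines.Add(line);
            }
        }
        return lines;
    }

    // Writes the lines as UTF-8 with a byte order mark, overwriting the file.
    public static void WriteLinesUtf8(string path, IEnumerable<string> lines)
    {
        using (var writer = new StreamWriter(path, false, new UTF8Encoding(true)))
        {
            foreach (string line in lines)
            {
                writer.WriteLine(line);
            }
        }
    }

    static void Main()
    {
        string input = Path.GetTempFileName();
        string output = Path.GetTempFileName();
        Encoding ansi = Encoding.GetEncoding("iso-8859-1");

        // Simulate an ANSI input file: "Grüße" with ü = 0xFC and ß = 0xDF.
        File.WriteAllBytes(input, ansi.GetBytes("Gr\u00FC\u00DFe"));

        // Decode with the matching encoding, then write the text out as UTF-8.
        List<string> lines = ReadLines(input, ansi);
        WriteLinesUtf8(output, lines);

        // The round trip preserves the special characters.
        Console.WriteLine(ReadLines(output, Encoding.UTF8)[0] == lines[0]);

        File.Delete(input);
        File.Delete(output);
    }
}
```

When no encoding is given, new StreamReader(path) and new StreamWriter(path) both assume UTF-8, so the case that needs an explicit encoding is an input file that is not actually UTF-8.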
May 11 '12 #6
