Read Input, Write Output (File) with Umlauts

Carlo Marchesoni

I can't manage to read a simple 'input.txt' with the following content:
Jürg (Hex: 4a fc 72 67)
and write it to an identical 'output.txt'.

I do the following (and have tried many different encodings):
private static void WriteFile() {
    StreamWriter sr = File.CreateText("Output.txt");
    try
    {
        using (TextReader tr = new StreamReader(
            new FileStream("Input.txt", FileMode.Open), Encoding.ASCII))
        {
            string iniLine = "";
            while ((iniLine = tr.ReadLine()) != null)
            {
                if (iniLine.Length > 0)
                    sr.WriteLine(iniLine);
            }
            tr.Close();
        }
    }
    catch
    {
        sr.Close();
    }
    sr.Flush();
    sr.Close();
}
But the output NEVER contains exactly the same hex values as the input. Isn't
there a way to say "use the same encoding as the input"?
Thanks for your help.

Nov 19 '05 #1

Joerg Jooss

There's no reliable way of identifying a text file's character encoding
(save for a few exceptions, such as a file that starts with a byte order
mark). And regarding your code sample, note that ASCII doesn't include
umlaut characters. Thus, your StreamReader simply loses them in this case.
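
One of the exceptions mentioned above is a file that begins with a byte order mark (BOM). As a rough sketch only: StreamReader can be asked to look for a BOM and fall back to an encoding you choose when it finds none. The file names, the class name and the helper DetectedEncodingName are made up for illustration, and ISO-8859-1 as the fallback is an assumption about the input, not something the reader can verify.

```csharp
using System;
using System.IO;
using System.Text;

static class BomDetection
{
    // Hypothetical helper: returns the name of the encoding the reader ended up using.
    // If the file has no BOM, that is simply the fallback passed in.
    public static string DetectedEncodingName(string path, Encoding fallback)
    {
        using (var reader = new StreamReader(path, fallback, true))
        {
            // Detection only happens once the reader has actually read something.
            reader.ReadToEnd();
            return reader.CurrentEncoding.WebName;
        }
    }

    static void Main()
    {
        // Assumed fallback for files without a BOM.
        Encoding fallback = Encoding.GetEncoding("iso-8859-1");

        // The sample bytes from the question: no BOM, so the fallback is used.
        File.WriteAllBytes("NoBom.txt", new byte[] { 0x4a, 0xfc, 0x72, 0x67 });
        Console.WriteLine(DetectedEncodingName("NoBom.txt", fallback)); // iso-8859-1

        // The same name stored as UTF-8 with a BOM (EF BB BF): the BOM wins.
        File.WriteAllBytes("WithBom.txt",
            new byte[] { 0xef, 0xbb, 0xbf, 0x4a, 0xc3, 0xbc, 0x72, 0x67 });
        Console.WriteLine(DetectedEncodingName("WithBom.txt", fallback)); // utf-8
    }
}
```

Without a BOM the reader has nothing to go on, so the result is only as good as the fallback you guessed.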

But the real issue is that File.CreateText() always writes UTF-8, whereas
your sample text 0x4a 0xfc 0x72 0x67 is in an 8-bit encoding, most likely
Windows-1252 or ISO-8859-1. Even if you open the source file with the
correct encoding, the output will always differ at the byte level,
because UTF-8 encodes umlaut characters differently: ü is the single byte
0xfc in those 8-bit encodings, but the two bytes 0xc3 0xbc in UTF-8.
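
To make the byte-level difference concrete, here is a small sketch that encodes the sample name both ways and prints the resulting hex values. The class name and the Hex helper are made up for illustration; ISO-8859-1 stands in for "the 8-bit encoding", which is an assumption based on the hex dump in the question.

```csharp
using System;
using System.Text;

static class UmlautBytes
{
    // Hypothetical helper: encodes the text and formats the bytes as lower-case hex.
    public static string Hex(string text, Encoding encoding)
    {
        byte[] bytes = encoding.GetBytes(text);
        return BitConverter.ToString(bytes).Replace("-", " ").ToLowerInvariant();
    }

    static void Main()
    {
        string name = "J\u00fcrg"; // "Jürg", written with an escape to keep this source ASCII-only

        Console.WriteLine(Hex(name, Encoding.GetEncoding("iso-8859-1"))); // 4a fc 72 67
        Console.WriteLine(Hex(name, Encoding.UTF8));                      // 4a c3 bc 72 67
    }
}
```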

But why decode and encode at all? Your code is a simple file copy. If
that's all you need, File.Copy() or copying the raw bytes with FileStreams
will work just fine, whatever the encoding is.
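
A byte-for-byte copy with FileStreams could look roughly like the sketch below. No decoding takes place, so the encoding never matters. The class name, the CopyBytes helper and the 4096-byte buffer size are illustrative choices, not anything prescribed in this thread.

```csharp
using System;
using System.IO;

static class RawCopy
{
    // Hypothetical helper: copies the file as raw bytes, without any text decoding.
    public static void CopyBytes(string sourcePath, string targetPath)
    {
        using (FileStream source = File.OpenRead(sourcePath))
        using (FileStream target = File.Create(targetPath))
        {
            byte[] buffer = new byte[4096];
            int read;
            // Read returns 0 at the end of the stream.
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                target.Write(buffer, 0, read);
            }
        }
    }

    static void Main()
    {
        // Recreate the sample input from the question.
        File.WriteAllBytes("Input.txt", new byte[] { 0x4a, 0xfc, 0x72, 0x67 });

        CopyBytes("Input.txt", "Output.txt");

        // Print the bytes of the copy as hex.
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes("Output.txt"))); // 4A-FC-72-67
    }
}
```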

Cheers,
--
http://www.joergjooss.de
mailto:ne********@joergjooss.de

Nov 19 '05 #2

Carlo Marchesoni

Thank you for your answer. I know that for this sample File.Copy() would
be much better, but in my real application I obviously have a much larger
input file and I have to change a couple of things before writing it to
the output.

Nov 19 '05 #3

Joerg Jooss

In this case, make sure to create a StreamReader and a StreamWriter
that use the same encoding.
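
A minimal sketch of that advice, assuming the input really is ISO-8859-1 (the thread only says "most likely Windows-1252 or ISO-8859-1"). The class name, the Rewrite helper and the change callback are made up for illustration. Note that WriteLine ends each line with Environment.NewLine, so line endings may be normalized even though the characters themselves round-trip unchanged.

```csharp
using System;
using System.IO;
using System.Text;

static class SameEncodingRewrite
{
    // Assumption: the file uses ISO-8859-1. Swap in the real encoding if it differs.
    static readonly Encoding FileEncoding = Encoding.GetEncoding("iso-8859-1");

    // Hypothetical helper: reads every line, applies a change, and writes the result
    // with the same encoding that was used for reading.
    public static void Rewrite(string inputPath, string outputPath, Func<string, string> change)
    {
        using (var reader = new StreamReader(inputPath, FileEncoding))
        using (var writer = new StreamWriter(outputPath, false, FileEncoding))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                writer.WriteLine(change(line));
            }
        }
    }

    static void Main()
    {
        // Recreate the sample input from the question, followed by a line break.
        File.WriteAllBytes("Input.txt", new byte[] { 0x4a, 0xfc, 0x72, 0x67, 0x0d, 0x0a });

        // Identity change here; a real application would edit the line instead.
        Rewrite("Input.txt", "Output.txt", line => line);

        // The output starts with 4A-FC-72-67; the line-ending bytes depend on the platform.
        Console.WriteLine(BitConverter.ToString(File.ReadAllBytes("Output.txt")));
    }
}
```

The using blocks flush and close both files even if an exception is thrown, so no separate Flush() or Close() calls are needed.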

Cheers,

Nov 19 '05 #4

Carlo Marchesoni

Thanks a lot for your hint - now it works.

Nov 19 '05 #5
