Using detectEncodingFromByteOrderMarks while copying a text file

I've noticed, after copying a text file line by line and comparing the result,
that the original had several bytes of data at the beginning denoting its encoding.
How do I use that in my copy?
My original code, shown below, didn't produce a perfect copy, so I used the
StreamReader constructor that takes detectEncodingFromByteOrderMarks. But I
need to pass the encoding to the constructor for my StreamWriter, so I need to
be able to work out the encoding type somehow. How, please?

string InputPath = Path.GetDirectoryName(Application.ExecutablePath) +
    @"\intext.txt";
string OutputPath = Path.GetDirectoryName(Application.ExecutablePath)
    + @"\outtext.txt";
string In;
string Out;

using (StreamReader Input = new StreamReader(InputPath))
// using (StreamReader Input = new StreamReader(InputPath, true)) << construct
{
    using (StreamWriter Output = new StreamWriter(OutputPath))
    {
        while ((In = Input.ReadLine()) != null)
        {
            Out = DoSomethingTo(In);
            Output.WriteLine(Out);
        }
    }
}

Jun 27 '08 #1
I'm guessing - tell the writer about it?

using (StreamWriter Output = new StreamWriter(OutputPath, false,
    Input.CurrentEncoding)) {...}

Marc
Jun 27 '08 #2
Correction - CurrentEncoding is not valid until the reader has read some
data, so read first and then create the writer; perhaps something like
below. Note that it also can't detect every possible encoding...

Marc

using (StreamReader reader = new StreamReader(path1, true))
{
    string line = reader.ReadLine();
    using (StreamWriter writer = new StreamWriter(path2, false,
        reader.CurrentEncoding))
    {
        Console.WriteLine("Reading {0} with {1}", path1,
            reader.CurrentEncoding.EncodingName);
        Console.WriteLine("Writing {0} with {1}", path2,
            writer.Encoding.EncodingName);

        while (line != null)
        {
            string t = Transform(line);
            Console.WriteLine(t);
            writer.WriteLine(t);
            line = reader.ReadLine();
        }
    }
}
Jun 27 '08 #3
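
The approach in post #3 can be tried end to end with a small self-contained sketch. Everything outside the reader/writer pattern is an assumption made for illustration: the class name EncodingCopyDemo, the helper CopyWithDetectedEncoding, the upper-casing Transform (a stand-in for the poster's DoSomethingTo), and the temporary files. The sample input is written as UTF-16 with a byte order mark, and the program prints the first two bytes of the output so you can check that the mark was carried over.

```csharp
using System;
using System.IO;
using System.Text;

public static class EncodingCopyDemo
{
    // Hypothetical stand-in for the poster's DoSomethingTo / Transform.
    public static string Transform(string line) => line.ToUpperInvariant();

    public static void CopyWithDetectedEncoding(string inputPath, string outputPath)
    {
        using (var reader = new StreamReader(inputPath, detectEncodingFromByteOrderMarks: true))
        {
            // Read once before creating the writer: CurrentEncoding only
            // reflects the byte order mark after the first read.
            string line = reader.ReadLine();

            using (var writer = new StreamWriter(outputPath, false, reader.CurrentEncoding))
            {
                while (line != null)
                {
                    writer.WriteLine(Transform(line));
                    line = reader.ReadLine();
                }
            }
        }
    }

    public static void Main()
    {
        // Temporary files used only for this demonstration.
        string input = Path.GetTempFileName();
        string output = Path.GetTempFileName();
        try
        {
            // Encoding.Unicode is UTF-16 little endian and emits a BOM here.
            File.WriteAllText(input, "hello\r\nworld\r\n", Encoding.Unicode);

            CopyWithDetectedEncoding(input, output);

            byte[] bytes = File.ReadAllBytes(output);
            Console.WriteLine("First bytes of output: {0:X2} {1:X2}", bytes[0], bytes[1]);
        }
        finally
        {
            File.Delete(input);
            File.Delete(output);
        }
    }
}
```

One caveat, as far as I can tell: when the input has no byte order mark, the reader stays on its UTF-8 default, and a writer built from that encoding may add a UTF-8 mark that the original did not have. If that matters, compare the first bytes of both files before relying on this.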
"Marc Gravell" <ma**********@gmail.com> wrote in message
news:u4**************@TK2MSFTNGP03.phx.gbl...
> Correction - the CurrentEncoding is not valid until it has read some data;
> perhaps something like below; note that it also can't detect every
> encoding possible...
That's great! Thank you :)

Jun 27 '08 #4
> Using detectEncodingFromByteOrderMarks while copying a text file
Unless you process the text somehow, it is not worth the trouble to
copy a text file as a text file (with encoding detection, line endings,
and so on).
Just copy it as binary. The routine can also be reused for any type
of file, and there is no risk of data corruption if you "guess" the
encoding wrong.
--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Jun 27 '08 #5
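
For the case described in post #5, where the text is not processed at all, a byte-for-byte copy might look like the sketch below. It uses File.Copy from System.IO; the class name BinaryCopyDemo, the helper CopyAndVerify, and the temporary files are assumptions made for illustration. The verification step reads both files fully into memory, which is only reasonable for small files.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;

public static class BinaryCopyDemo
{
    // Copies the file as raw bytes, then checks that the copy is identical.
    public static bool CopyAndVerify(string sourcePath, string destinationPath)
    {
        // No decoding takes place, so any byte order mark and the original
        // line endings are carried over unchanged.
        File.Copy(sourcePath, destinationPath, overwrite: true);

        byte[] original = File.ReadAllBytes(sourcePath);
        byte[] copy = File.ReadAllBytes(destinationPath);
        return original.SequenceEqual(copy);
    }

    public static void Main()
    {
        // Temporary files used only for this demonstration.
        string source = Path.GetTempFileName();
        string destination = Path.GetTempFileName();
        try
        {
            // Encoding.UTF8 writes a UTF-8 byte order mark at the start.
            File.WriteAllText(source, "first line\r\nsecond line\r\n", Encoding.UTF8);

            Console.WriteLine("Identical copy: {0}", CopyAndVerify(source, destination));
        }
        finally
        {
            File.Delete(source);
            File.Delete(destination);
        }
    }
}
```

This does not apply when each line has to be transformed, because the text then has to be decoded and encoded again.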
I very nearly said the same thing - but if you look carefully, there is
a transform hidden in the code:

Out = DoSomethingTo(In);
Output.WriteLine(Out);

Marc
Jun 27 '08 #6
> I very nearly said the same thing - but if you look carefully, there is
> a transform hidden in the code:
Right, I missed that one. Got fooled by the subject :-)
--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Jun 27 '08 #7
