473,231 Members | 1,825 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,231 software developers and data experts.

Using detectEncodingFromByteOrderMarks while copying a text file

I've noticed after copying a text file line by line and comparing, that the
original had several bytes of data at the beginning denoting its encoding.
How do I use that in my copy?
My original code shown below, didn't produce a perfect copy, so I used the
StreamReader construct that includes detectEncodingFromByteOrderMarks. But I
need to pass that to the construct for my StreamWriter so I need to be able
to work out the encoding type somehow. How please?

string InputPath = Path.GetDirectoryName(Application.ExecutablePath) +
@"\intext.txt";
string OutputPath = Path.GetDirectoryName(Application.ExecutablePath)
+ @"\outtext.txt";
string In;
string Out;

using (StreamReader Input = new StreamReader(InputPath))
// using (StreamReader Input = new StreamReader(InputPath, true)) <<
construct
{
using (StreamWriter Output = new StreamWriter(OutputPath))
{
while ((In = Input.ReadLine()) != null)
{
Out = DoSomethingTo(In);
Output.WriteLine(Out);
}
}
}

Jun 27 '08 #1
6 4897
I'm guessing - tell the writer about it?

using (StreamWriter Output = new StreamWriter(OutputPath, false,
Input.CurrentEncoding)) {...}

Marc
Jun 27 '08 #2
Correction - the CurrentEncoding is not valid until it has read some
data; perhaps something like below; note that it also can't detect every
encoding possible...

Marc

using (StreamReader reader = new StreamReader(path1, true))
{
string line = reader.ReadLine();
using (StreamWriter writer = new StreamWriter(path2, false,
reader.CurrentEncoding))
{
Console.WriteLine("Reading {0} with {1}", path1,
reader.CurrentEncoding.EncodingName);
Console.WriteLine("Writing {0} with {1}", path2,
writer.Encoding.EncodingName);

while (line != null)
{
string t = Transform(line);
Console.WriteLine(t);
writer.WriteLine(t);
line = reader.ReadLine();
}
}
}
Jun 27 '08 #3
"Marc Gravell" <ma**********@gmail.comwrote in message
news:u4**************@TK2MSFTNGP03.phx.gbl...
Correction - the CurrentEncoding is not valid until it has read some data;
perhaps something like below; note that it also can't detect every
encoding possible...
That's great! thank you :)

Jun 27 '08 #4
Using detectEncodingFromByteOrderMarks while copying a text file
Unless you process the text somehow, it is not worth the trouble to
copy a text file as text file (with encoding detection, line ending,
and so on).
Just copy it as a binary. The routine can also be reused for any type
of files, and there is no risk of data corruption if you "guess" the
encoding wrong.
--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Jun 27 '08 #5
I very nearly said the same thing - but if you look carefully, there is
a transform hidden in the code:

Out = DoSomethingTo(In);
Output.WriteLine(Out);

Marc
Jun 27 '08 #6
I very nearly said the same thing - but if you look carefully, there is
a transform hidden in the code:
Right, I missed that one. Got fouled by the subject :-)
--
Mihai Nita [Microsoft MVP, Visual C++]
http://www.mihai-nita.net
------------------------------------------
Replace _year_ with _ to get the real email
Jun 27 '08 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

11
by: Grant Edwards | last post by:
I'm trying in vain to set the icon for the executable generated by py2exe. According to various sources there are two answers: 1) Do it on the command line: python setup.py py2exe --icon...
2
by: Bernd Lambertz | last post by:
I have a problem with bcp and format files. We changed our databases from varchar to nvarchar to support unicode. No problems so fare with that. It is working fine. But now I need a format...
22
by: Matt | last post by:
When browsing a web page a user has the ability to highlight content on a page (by holding down the left mouse button and dragging the mouse over the desired content). Is there a way to disable...
14
by: Tony Johansson | last post by:
Hello Experts! Assume I have a class called SphereClass as the base class and a class called BallClass that is derived from the SphereClass. The copy constructor initialize the left hand object...
0
by: Richard Taylor | last post by:
User-Agent: OSXnews 2.07 Xref: number1.nntp.dca.giganews.com comp.lang.python:437315 Hi I am trying to use py2app (http://undefined.org/python/) to package a gnome-python application...
121
by: typingcat | last post by:
First of all, I'm an Asian and I need to input Japanese, Korean and so on. I've tried many PHP IDEs today, but almost non of them supported Unicode (UTF-8) file. I've found that the only Unicode...
3
by: John | last post by:
Hi all, My application updates a sql server 2005 express database prior to copying it with the result being the "in use by another process" and I cannot copy it as a result. I've posted the code...
0
by: Grant Edwards | last post by:
I've got a system where I try to install extensions using /usr/local/bin/python setup.py install But, it fails when it tries to use a non-existant compiler path and specs file. I suspect it's...
6
by: kimiraikkonen | last post by:
Hi, I use system.io.file class to copy files but i have a difficulty about implementing a basic / XP-like progress bar indicator during copying process. My code is this with no progress bar,...
0
by: VivesProcSPL | last post by:
Obviously, one of the original purposes of SQL is to make data query processing easy. The language uses many English-like terms and syntax in an effort to make it easy to learn, particularly for...
3
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 3 Jan 2024 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). For other local times, please check World Time Buddy In...
0
by: jianzs | last post by:
Introduction Cloud-native applications are conventionally identified as those designed and nurtured on cloud infrastructure. Such applications, rooted in cloud technologies, skillfully benefit from...
0
by: abbasky | last post by:
### Vandf component communication method one: data sharing ​ Vandf components can achieve data exchange through data sharing, state sharing, events, and other methods. Vandf's data exchange method...
0
by: fareedcanada | last post by:
Hello I am trying to split number on their count. suppose i have 121314151617 (12cnt) then number should be split like 12,13,14,15,16,17 and if 11314151617 (11cnt) then should be split like...
0
by: stefan129 | last post by:
Hey forum members, I'm exploring options for SSL certificates for multiple domains. Has anyone had experience with multi-domain SSL certificates? Any recommendations on reliable providers or specific...
1
by: davi5007 | last post by:
Hi, Basically, I am trying to automate a field named TraceabilityNo into a web page from an access form. I've got the serial held in the variable strSearchString. How can I get this into the...
0
by: DolphinDB | last post by:
Tired of spending countless mintues downsampling your data? Look no further! In this article, you’ll learn how to efficiently downsample 6.48 billion high-frequency records to 61 million...
0
by: Aftab Ahmad | last post by:
So, I have written a code for a cmd called "Send WhatsApp Message" to open and send WhatsApp messaage. The code is given below. Dim IE As Object Set IE =...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.