Bug in CurrentEncoding.EncodingName?

Hi,

I have used the following code to test the encoding of a file:

public string DetermineFileType(string aFileName)
{
    string sEncoding = string.Empty;

    StreamReader oSR = new StreamReader(aFileName, true);
    oSR.ReadToEnd(); // Add this line to read the file.
    sEncoding = oSR.CurrentEncoding.EncodingName;

    return sEncoding;
}
from:
http://groups.google.com.hk/groups?h...3DN%26tab%3Dwg

But the encoding is always showing Unicode? What's wrong?

Thanks

Nick
Nov 16 '05 #1
Nick <ni*****@heha.net.tw> wrote:
I have used the following code to test the encoding of a file :

public string DetermineFileType(string aFileName)
{
    string sEncoding = string.Empty;

    StreamReader oSR = new StreamReader(aFileName, true);
    oSR.ReadToEnd(); // Add this line to read the file.
    sEncoding = oSR.CurrentEncoding.EncodingName;

    return sEncoding;
}
from:
http://groups.google.com.hk/groups?h...8&threadm=258t005q76lpof86nsbqv4f0o2d66sba20%404ax.com&rnum=1&prev=/groups%3Fq%3Dc%2523%2520detect%2520file%2520encoding%26hl%3Dzh-TW%26lr%3D%26ie%3DUTF-8%26sa%3DN%26tab%3Dwg

But the encoding is always showing Unicode? What's wrong?


As I replied in the thread you quoted there, you shouldn't expect code
like that to correctly determine a file's encoding.

It may be able to work out byte order and encoding for Unicode/UTF-8
files which include byte order marks, but it's unlikely to work for
other files and other encodings.

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #2
Nick,
In addition to the other comments:

Read the help for StreamReader(path, detectEncodingFromByteOrderMarks)
more closely. ;-)

"The detectEncodingFromByteOrderMarks parameter detects the
encoding by looking at the first three bytes of the stream. It
automatically recognizes UTF-8, little-endian Unicode, and
big-endian Unicode text if the file starts with the appropriate
byte order marks. Otherwise, the user-provided encoding is used.
See the Encoding.GetPreamble method for more information."
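
The byte order mark check described in that passage can be sketched by hand. This is only an illustration, not the actual StreamReader implementation: the class and method names (BomSniffer, DetectFromBom) are made up for this example, and it assumes the caller has already read the first few bytes of the file into an array.

```csharp
using System.Text;

public static class BomSniffer
{
    // The three encodings the quoted documentation says are recognized.
    // Each of these static instances returns a non-empty preamble (BOM).
    private static readonly Encoding[] Candidates =
    {
        Encoding.UTF8,             // EF BB BF
        Encoding.Unicode,          // FF FE (little-endian UTF-16)
        Encoding.BigEndianUnicode  // FE FF (big-endian UTF-16)
    };

    // Returns the encoding whose preamble matches the start of 'head',
    // or null when no byte order mark is present.
    public static Encoding DetectFromBom(byte[] head)
    {
        foreach (Encoding candidate in Candidates)
        {
            byte[] bom = candidate.GetPreamble();
            if (bom.Length == 0 || head.Length < bom.Length)
            {
                continue;
            }

            bool match = true;
            for (int i = 0; i < bom.Length; i++)
            {
                if (head[i] != bom[i])
                {
                    match = false;
                    break;
                }
            }

            if (match)
            {
                return candidate;
            }
        }

        return null; // no BOM: the bytes alone do not identify the encoding
    }
}
```

A null result is the interesting case for this thread: it means the file carries no marker at all, so whatever encoding the reader falls back to is a default, not a detection.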

Remember that if you do not pass an Encoding object to the StreamReader
constructor, UTF8Encoding is used. I would expect the same rule to
apply here. In other words, Encoding.Default is not considered unless you
pass it to the constructor.

Ergo your code always returns a Unicode encoding.

Have you tried the StreamReader(path, encoding,
detectEncodingFromByteOrderMarks) constructor?
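
A minimal sketch of that suggestion, reworking the method from the question. The class name EncodingProbe and the fallback parameter are assumptions made for illustration. The fallback is reported only when the file has no byte order mark, so the result is still a default rather than a true detection.

```csharp
using System.IO;
using System.Text;

public static class EncodingProbe
{
    // 'fallback' (a hypothetical parameter name) is the encoding assumed
    // when the file does not start with a byte order mark.
    public static string DetermineFileType(string fileName, Encoding fallback)
    {
        using (StreamReader reader = new StreamReader(fileName, fallback, true))
        {
            // CurrentEncoding is only updated after the first read;
            // Peek fills the buffer without consuming any characters.
            reader.Peek();
            return reader.CurrentEncoding.EncodingName;
        }
    }
}
```

Calling EncodingProbe.DetermineFileType(path, Encoding.Default) would report the system default encoding for files without a byte order mark, instead of always reporting a Unicode one.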

Hope this helps
Jay
Nov 16 '05 #3
Hi Jon,

So is there any other method that can do that?

Thanks

Nick

Nov 16 '05 #4
Nick <ni*****@heha.net.tw> wrote:
So any other method can do that?


You can't do it reliably - there's no way to tell (for instance)
whether something is using one 8-bit code page or another. The best you
can do is make heuristic guesses. For instance, if every other byte is 0
most of the time, that *probably* means it's a Unicode (UTF-16) encoding.
If the whole file is valid UTF-8, that suggests UTF-8 - but it's still
very dodgy, to be honest.
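
The two guesses mentioned above could be sketched roughly as follows. Everything here is an assumption for illustration: the names (EncodingGuesser, LooksLikeUtf16, IsValidUtf8, Guess), the 0.7 threshold, and the idea of working on a byte array already read from the file. It shares all the unreliability described above and cannot distinguish one 8-bit code page from another.

```csharp
using System;
using System.Text;

public static class EncodingGuesser
{
    // Heuristic 1: in UTF-16 text made mostly of Latin characters,
    // roughly every other byte is zero.
    public static bool LooksLikeUtf16(byte[] data)
    {
        if (data.Length < 2)
        {
            return false;
        }

        int evenZeros = 0;
        int oddZeros = 0;
        for (int i = 0; i < data.Length; i++)
        {
            if (data[i] != 0)
            {
                continue;
            }
            if (i % 2 == 0) { evenZeros++; } else { oddZeros++; }
        }

        int pairs = data.Length / 2;
        // 0.7 is an arbitrary cut-off chosen for this sketch.
        return Math.Max(evenZeros, oddZeros) >= 0.7 * pairs;
    }

    // Heuristic 2: the bytes decode as UTF-8 without any invalid sequence.
    public static bool IsValidUtf8(byte[] data)
    {
        // Second argument (throwOnInvalidBytes) makes decoding strict.
        UTF8Encoding strict = new UTF8Encoding(false, true);
        try
        {
            strict.GetCharCount(data);
            return true;
        }
        catch (ArgumentException)
        {
            // DecoderFallbackException derives from ArgumentException.
            return false;
        }
    }

    // Combines the two guesses. The UTF-16 check runs first because
    // zero bytes are also valid UTF-8.
    public static string Guess(byte[] data)
    {
        if (LooksLikeUtf16(data)) { return "probably UTF-16"; }
        if (IsValidUtf8(data)) { return "possibly UTF-8 (or plain ASCII)"; }
        return "unknown, likely some 8-bit code page";
    }
}
```

Text that is pure ASCII passes the UTF-8 check too, and short samples can fool either test, so the result is best treated as a hint to confirm with the user or with knowledge of where the file came from.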

--
Jon Skeet - <sk***@pobox.com>
http://www.pobox.com/~skeet
If replying to the group, please do not mail me too
Nov 16 '05 #5
