Bytes IT Community

ASCII or Unicode

Can anyone provide a quick code snippet that opens a text file and tells
whether it's ASCII or Unicode?

Nov 21 '05 #1
1 Reply

Hi Scott,
You can use the System.Text.Encoding class to check for that, by reading the
byte-order mark (BOM) at the start of the file:

System.Text.Encoding enc = null;
System.IO.FileStream file = new System.IO.FileStream(filePath,
    System.IO.FileMode.Open, System.IO.FileAccess.Read, System.IO.FileShare.Read);
if (file.CanSeek)
{
    byte[] bom = new byte[4]; // Get the byte-order mark, if there is one
    file.Read(bom, 0, 4);
    if ((bom[0] == 0xef && bom[1] == 0xbb && bom[2] == 0xbf) || // utf-8
        (bom[0] == 0xff && bom[1] == 0xfe) || // ucs-2le, ucs-4le, and utf-16le
        (bom[0] == 0xfe && bom[1] == 0xff) || // utf-16 and ucs-2 (big-endian)
        (bom[0] == 0 && bom[1] == 0 && bom[2] == 0xfe && bom[3] == 0xff)) // ucs-4 (big-endian)
    {
        // Note: Encoding.Unicode is UTF-16 little-endian. To decode the text,
        // pick the encoding that matches the BOM (for example Encoding.UTF8
        // or Encoding.BigEndianUnicode).
        enc = System.Text.Encoding.Unicode;
    }
    else
    {
        enc = System.Text.Encoding.ASCII;
    }

    // Now reposition the file cursor back to the start of the file
    file.Seek(0, System.IO.SeekOrigin.Begin);
}
else
{
    // The file cannot be randomly accessed, so you need to decide what to set
    // the default to based on the data you expect. If you're expecting data
    // from a lot of older applications, default to Encoding.ASCII. If you're
    // expecting data from a lot of newer applications, default to
    // Encoding.Unicode. Binary files are single-byte based, so use
    // Encoding.ASCII for them too, even though you'll probably never need the
    // encoding then: the Encoding classes are really meant to get strings
    // from the byte array that is the file.
    enc = System.Text.Encoding.ASCII;
}

// Do your file operations here, such as getting a string from the bytes read
byte[] buffer = new byte[4096]; // A good buffer size; keep it a power of two in case of Unicode
int bytesRead;
while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
{
    // Decode only the bytes actually read, using the encoding we chose above
    string chunk = enc.GetString(buffer, 0, bytesRead);
}

// Close the file: never forget this step!
file.Close();
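
One caveat worth adding: a file without a BOM is not necessarily ASCII, since UTF-8 files are often saved without one. As a rough sketch only (the names AsciiCheck and IsPureAscii are made up for illustration and are not part of the .NET Framework), you could scan the bytes and treat any value above 0x7F as a sign that the file is not plain ASCII:

```csharp
using System;
using System.IO;

// Hypothetical helper, not a framework API: reports whether a file
// contains only 7-bit ASCII bytes (0x00 through 0x7F).
static class AsciiCheck
{
    public static bool IsPureAscii(string path)
    {
        using (FileStream file = File.OpenRead(path))
        {
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = file.Read(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < bytesRead; i++)
                {
                    // Any byte with the high bit set cannot be ASCII
                    if (buffer[i] > 0x7F)
                        return false;
                }
            }
        }
        return true; // an empty file also counts as ASCII here
    }
}
```

A file that passes this check can be read as ASCII (or as UTF-8, which gives the same result for those bytes). A file that fails it still needs a BOM check or some other hint to decide which Unicode encoding it uses.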

Here is the link to the complete article.

Mohamed Mahfouz
MEA Developer Support Center
ITworx on behalf of Microsoft EMEA GTSC

Nov 21 '05 #2
