Hi. I'm having trouble reading some Unicode files. Basically, I have to
parse certain files. Some of those files are being input in Japanese,
Chinese etc. The easiest way, I figured, to distinguish between plain
ASCII files I receive and the Unicode ones would be to check if the
first two bytes read 0xFFFE.
But nothing I do seems to be able to do that.
I tried reading it in binary mode and reading two characters in:
FILE *fin; char ch [2];
fin.open (filename, "rb");
if (fin) { fopen (ch, sizeof (char), 2, fin); ......
I tried reading it in binary mode and reading a wchar_t in:
FILE *fin; wchar_t wch;
fin.open (filename, "rb");
if (fin) { fopen (&wch, sizeof (wchar_t), 1, fin); ....
I tried using ifstream for two characters/wifstream for wchar_t but to
no avail.
All of them seem to skip the so-called byte order mark (BOM). I am quite
lost for ideas. I saw a few examples using MFC Class CStdioFile etc.
but I don't want to use those. I'm sure there's a perfectly simple
method to do this.
Sorry about the long msg for such a simple problem, but it is getting
quite frustrating.... Any help would be very much appreciated.
Cheers,
Nemo.
PS. I know the mark is there. I viewed the files using a hex editor.
<nn****@gmail.com> wrote in message
news:11**********************@f14g2000cwb.googlegroups.com...
[snip]
See our CoreX library, at our web site. It has exactly what you need.
P.J. Plauger
Dinkumware, Ltd. http://www.dinkumware.com
In message <11**********************@f14g2000cwb.googlegroups.com>, nn****@gmail.com writes:
I tried reading it in binary mode and reading two characters in:
FILE *fin; char ch [2]; fin.open (filename, "rb"); if (fin) { fopen (ch, sizeof (char), 2, fin); ......
Try posting the *actual* code that causes the problem. The above is
clearly not it.
--
Richard Herring
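The check described in the question (open in binary mode, look at the first few bytes) can be sketched as below. This is only a minimal sketch, not code from anyone in the thread: the `Bom` enum and the `detect_bom` helper are hypothetical names made up for illustration. Two things it assumes may be relevant to the original symptoms. First, plain `char` is signed on many compilers, so a byte of 0xFF stored in a `char` compares as -1 rather than 255; reading into `unsigned char` avoids that. Second, a UTF-16 little-endian file starts with the bytes FF FE on disk, which is the code unit 0xFEFF, so reading the pair into a 16-bit integer on a little-endian machine gives 0xFEFF, not 0xFFFE. Comparing byte by byte sidesteps both the byte order and the size of `wchar_t`.

```cpp
#include <fstream>
#include <ios>
#include <string>

// Encodings this sketch can recognise from a leading byte order mark.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

// Hypothetical helper: inspects the first bytes of the file at `path`.
// Returns Bom::None if the file cannot be opened or has no recognised BOM.
Bom detect_bom(const std::string& path) {
    // Binary mode, so the runtime does no newline or character translation.
    std::ifstream in(path, std::ios::binary);
    if (!in) return Bom::None;

    // unsigned char: the byte 0xFF stays 255 instead of becoming -1.
    unsigned char b[3] = {0, 0, 0};
    in.read(reinterpret_cast<char*>(b), sizeof b);
    const std::streamsize n = in.gcount();  // bytes actually read (file may be short)

    if (n >= 3 && b[0] == 0xEF && b[1] == 0xBB && b[2] == 0xBF) return Bom::Utf8;
    // Note: a UTF-32LE BOM (FF FE 00 00) would also match the next test;
    // this sketch does not try to tell the two apart.
    if (n >= 2 && b[0] == 0xFF && b[1] == 0xFE) return Bom::Utf16LE;
    if (n >= 2 && b[0] == 0xFE && b[1] == 0xFF) return Bom::Utf16BE;
    return Bom::None;
}
```

A caller would treat `Bom::None` as "plain ASCII or unknown" and pick a decoder for the other cases. A file without a BOM can still be Unicode, so this only separates the files that carry a mark from those that do not.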