Reading unicode (utf-16 le) using wifstream

Hey, I've got this problem:

http://rafb.net/paste/results/lpNgbn49.html

I'm using wifstream to read a UTF-16 file, and the problem is that
each byte is read into a separate wchar_t, while UTF-16 little-endian
uses at least 2 bytes per character.

The code of the method is at the link above; the debugger output that
shows the problem is attached below.
I've googled and browsed the documentation, but can't figure it out on
my own. :(

/*
- subgroup {"?yH} std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t>
+ std::_String_val<wchar_t,std::allocator<wchar_t> > {_Alval={...}
} std::_String_val<wchar_t,std::allocator<wchar_t> >
- _Bx {_Buf=0x0012f340 "쀰7H" _Ptr=0x0037c030 "þÿH"
} std::basic_string<wchar_t,std::char_traits<wchar_t>,std::allocator<wchar_t>::_Bxty

- _Buf 0x0012f340 "쀰7H" wchar_t [8]
[0] 49200 '쀰' wchar_t
[1] 55 '7' wchar_t
[2] 72 'H' wchar_t
[3] 0 wchar_t
[4] 101 'e' wchar_t
[5] 0 wchar_t
[6] 108 'l' wchar_t
[7] 0 wchar_t
- _Ptr 0x0037c030 "þÿH" wchar_t *
254 'þ' wchar_t
_Mysize 12 unsigned int
_Myres 15 unsigned int

*/

Mar 21 '06 #1
Hi "Anubis",
I'm using wifstream to read a UTF-16 file, and the problem is that
each byte is read into a separate wchar_t, while UTF-16 little-endian
uses at least 2 bytes per character.


PJ Plauger wrote a pair of columns in the April and May 1999
editions of the C/C++ Users Journal about reading/writing
Unicode files. He uses codecvt facets to do the conversion.
You can find the source code on the cuj server:

http://www.cuj.com/code/
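
For illustration only, here is a minimal sketch of the underlying conversion done by hand, without a codecvt facet (this is not Plauger's code, just a simpler stand-in for the same idea). It assumes the file really is UTF-16 LE, optionally starting with the FF FE byte order mark, and that it is small enough to read into memory. The names `decode_utf16le` and `read_utf16le_file` are made up for this example.

```cpp
#include <cstddef>
#include <fstream>
#include <iterator>
#include <string>
#include <vector>

// Combine two bytes (low byte first) into one 16-bit UTF-16 code unit.
static unsigned long code_unit(const std::vector<unsigned char>& bytes,
                               std::size_t pos)
{
    return static_cast<unsigned long>(bytes[pos]) |
           (static_cast<unsigned long>(bytes[pos + 1]) << 8);
}

// Decode a buffer of UTF-16 LE bytes into a std::wstring.
// Hypothetical helper: a leading BOM (FF FE) is skipped if present,
// and a trailing odd byte is ignored.
std::wstring decode_utf16le(const std::vector<unsigned char>& bytes)
{
    std::wstring out;
    std::size_t i = 0;

    // Skip the little-endian byte order mark, if any.
    if (bytes.size() >= 2 && bytes[0] == 0xFF && bytes[1] == 0xFE)
        i = 2;

    while (i + 1 < bytes.size()) {
        unsigned long unit = code_unit(bytes, i);
        i += 2;

        // Where wchar_t is wider than 16 bits, merge a surrogate pair
        // into a single code point; a 16-bit wchar_t keeps the two
        // surrogates as separate elements, which is still valid UTF-16.
        if (sizeof(wchar_t) > 2 && unit >= 0xD800 && unit <= 0xDBFF &&
            i + 1 < bytes.size()) {
            unsigned long low = code_unit(bytes, i);
            if (low >= 0xDC00 && low <= 0xDFFF) {
                unit = 0x10000 + ((unit - 0xD800) << 10) + (low - 0xDC00);
                i += 2;
            }
        }
        out.push_back(static_cast<wchar_t>(unit));
    }
    return out;
}

// Read the whole file as raw bytes. Binary mode matters here: text mode
// may translate newline bytes and corrupt the 2-byte units.
std::wstring read_utf16le_file(const char* path)
{
    std::ifstream in(path, std::ios::binary);
    std::vector<unsigned char> bytes((std::istreambuf_iterator<char>(in)),
                                     std::istreambuf_iterator<char>());
    return decode_utf16le(bytes);
}
```

A call such as `std::wstring text = read_utf16le_file("input.txt");` (the file name is a placeholder) would then give one wchar_t per 16-bit unit instead of one per byte. A codecvt facet does the same byte-to-character step inside the stream, so that `wifstream` itself can be used.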

<OT>
If you are using Windows/Visual C++ the following link
might also be useful:
http://www.i18nguy.com/unicode/c-unicode.html
</OT>

Best regards,
Tilman
Mar 21 '06 #2
Thanks, man. I'll check those links and post my results.

Mar 21 '06 #3
