
convert raw bytes to Unicode strings

 
  #1  
Old July 21st, 2008, 01:25 PM
brad

Does standard C++ have any methods to do this? I'd like to convert raw
bytes to utf-8. Thanks for any tips.

  #2  
Old July 21st, 2008, 01:55 PM
Victor Bazarov

brad wrote:
Quote:
Does standard C++ have any methods to do this? I'd like to convert raw
bytes to utf-8. Thanks for any tips.
What is the difference between "raw bytes" and "utf-8"?

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
  #3  
Old July 21st, 2008, 01:55 PM
brad

Victor Bazarov wrote:
Quote:
What is the difference between "raw bytes" and "utf-8"?
V
Raw bytes are not character streams. They do not conform to the concept
of a char. Grep a binary file for a string, then grep a text file for a
string, to gain a better understanding of this difference.
  #4  
Old July 21st, 2008, 02:15 PM
Pascal J. Bourguignon

brad <byte8bits@gmail.com> writes:
Quote:
Victor Bazarov wrote:
Quote:
What is the difference between "raw bytes" and "utf-8"?
V
raw bytes are not character streams. They do not conform to the
concept of a char. grep a binary file for a string, then grep a text
file for a string to gain a better understanding of this difference.
But when you take a string containing characters, and you encode it
into a sequence of UTF-8 bytes, you don't get a string, but a sequence
of bytes.

What is the difference between these bytes and your "raw" bytes?

Do you know what UTF-8 is? (Read at least the Wikipedia article about it.)


Anyway, there's no standard C++ function to do what you want. You
could use an external library like libiconv, or just write the UTF-8
encoding/decoding algorithm in C++ yourself.
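
For the "write it yourself" route, a minimal sketch of the encoding half might look like the following. It assumes the "raw" bytes are ISO-8859-1 (Latin-1), where each byte value is already the Unicode code point; the original question never says what encoding the bytes are in, so that is only an assumption. The names append_utf8 and latin1_to_utf8 are made up for illustration.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>

// Append the UTF-8 encoding of one Unicode code point to `out`.
// Surrogates and values above U+10FFFF are rejected.
void append_utf8(std::string& out, std::uint32_t cp) {
    if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
        throw std::invalid_argument("not a Unicode scalar value");

    if (cp < 0x80) {                        // 1 byte:  0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {                // 2 bytes: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {              // 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                                // 4 bytes: 11110xxx then three 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
}

// ASSUMPTION: the input bytes are Latin-1, so every byte maps directly
// to the code point with the same numeric value (U+0000..U+00FF).
std::string latin1_to_utf8(const std::string& bytes) {
    std::string out;
    for (unsigned char b : bytes)
        append_utf8(out, b);
    return out;
}
```

With this sketch, the single Latin-1 byte 0xE9 (e with acute accent) becomes the two UTF-8 bytes 0xC3 0xA9, and plain ASCII passes through unchanged. Bytes in any other source encoding would need a different mapping table before the encoding step.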

--
__Pascal Bourguignon__
  #5  
Old July 21st, 2008, 02:15 PM
Victor Bazarov

brad wrote:
Quote:
Victor Bazarov wrote:
Quote:
What is the difference between "raw bytes" and "utf-8"?
V
raw bytes are not character streams. They do not conform to the concept
of a char. grep a binary file for a string, then grep a text file for a
string to gain a better understanding of this difference.
In C++ a byte is a char. The type 'char' is an integral type "large
enough to store any member of the implementation's basic character set".
There is no separate "concept of a char" from that, at least in C++.

C++ has no specific provisions for UTF-8. There is the class 'codecvt'
(actually a class template), that the Standard says "is for use when
converting from one codeset to another". Perhaps you should look into
that...
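
As a rough sketch of the codecvt route: later revisions of the standard (C++11 onward, so not available when this thread was written) added a codecvt specialization that converts between UTF-8 and UTF-32, which was then deprecated in C++20. The helper name utf8_to_utf32 below is hypothetical, and the example assumes the input bytes are already UTF-8.

```cpp
#include <cstddef>
#include <cwchar>      // std::mbstate_t
#include <locale>
#include <stdexcept>
#include <string>

// Decode a UTF-8 byte sequence into UTF-32 code points using the
// std::codecvt<char32_t, char, std::mbstate_t> facet (C++11 and later,
// deprecated since C++20).
std::u32string utf8_to_utf32(const std::string& bytes) {
    typedef std::codecvt<char32_t, char, std::mbstate_t> Facet;
    const Facet& cvt = std::use_facet<Facet>(std::locale::classic());

    // One output slot per input byte is always enough, because every
    // code point takes at least one byte in UTF-8.
    std::u32string out(bytes.size(), U'\0');
    if (bytes.empty())
        return out;

    std::mbstate_t state = std::mbstate_t();
    const char* from_next = nullptr;
    char32_t* to_next = nullptr;
    const std::codecvt_base::result r = cvt.in(
        state,
        bytes.data(), bytes.data() + bytes.size(), from_next,
        &out[0], &out[0] + out.size(), to_next);

    // Anything other than `ok` means malformed or truncated input.
    if (r != std::codecvt_base::ok)
        throw std::runtime_error("input is not valid UTF-8");

    out.resize(static_cast<std::size_t>(to_next - &out[0]));
    return out;
}
```

The facet is fetched from the classic locale through std::use_facet because codecvt has a protected destructor and is not meant to be constructed directly. For other source encodings, a library such as libiconv is probably still the more practical choice.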

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
 
