On Mon, 2007-11-05 at 12:41 -0800, Nemanja Trifunovic wrote:
On Nov 5, 2:55 pm, Tristan Wibberley <maihem-...@maihem.org> wrote:
On Mon, 2007-11-05 at 11:04 -0800, Nemanja Trifunovic wrote:
On Nov 5, 1:37 pm, Travis <travis.bow...@gmail.com> wrote:
Is there an easy way to convert from UnicodeString to string or char *?
You mean ICU UnicodeString? You'll need to convert it to UTF-8. See
the ICU converters: http://icu-project.org/userguide/cod...onverters.html
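For what it's worth, with a reasonably recent ICU (4.2 or later) the whole conversion is one call to UnicodeString::toUTF8String; older releases go through extract() or a UConverter as the linked page describes. A minimal sketch:

#include <string>
#include <unicode/unistr.h>

// Sketch, assuming an ICU build that provides
// UnicodeString::toUTF8String (ICU 4.2 or later).
std::string to_utf8(const icu::UnicodeString &us)
{
    std::string out;
    us.toUTF8String(out);   // appends the UTF-8 form of us to out
    return out;
}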
Rather than simply UTF-8, you'd be better off converting it to the
character encoding that the std::codecvt facet of the current global locale
defines for std::string::value_type (if you expect to do any string processing
or stream it to a std::ostream), or to the encoding of the std::codecvt facet
of the locale you're going to output the string under, if you're going to use
some binary I/O API.
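Roughly, that conversion looks like this. It's only a sketch: it assumes the
UnicodeString's UTF-16 data has already been copied into a std::wstring, the
helper name is made up, and real code would check codecvt's return value and
loop on partial/error results.

#include <cwchar>
#include <locale>
#include <string>
#include <vector>

// Narrow a wide string using the std::codecvt facet of the current
// global locale, so the resulting std::string is in whatever encoding
// that locale names (not necessarily UTF-8). Error handling is minimal.
std::string narrow_with_global_locale(const std::wstring &ws)
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> cvt_t;
    const cvt_t &cvt = std::use_facet<cvt_t>(std::locale());

    if (ws.empty())
        return std::string();

    std::vector<char> buf(ws.size() * cvt.max_length());
    std::mbstate_t state = std::mbstate_t();
    const wchar_t *from_next = 0;
    char *to_next = 0;

    cvt.out(state,
            ws.data(), ws.data() + ws.size(), from_next,
            &buf[0], &buf[0] + buf.size(), to_next);

    return std::string(&buf[0], to_next);
}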
But how can you be sure that all the characters from the UnicodeString
fit into the char encoding of the current global locale?
You can set the global locale to have a UTF-8 char codecvt facet. Alternatively,
you can decline to use any string processing function unless you can pass it a
locale/codecvt facet saying that the string is UTF-8, and decline to stream the
string through a std::ostream (although you can .imbue() the stream with an
appropriate locale).
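The .imbue() route looks roughly like this. It's a sketch only: it assumes the
platform actually has a locale named "en_US.UTF-8" installed (the name varies
between systems), and it uses a wide stream because that is where the codecvt
facet actually does conversion work on output.

#include <fstream>
#include <locale>

int main()
{
    std::wofstream out("out.txt");
    // The imbued locale's codecvt facet converts the wide characters
    // to the file's external encoding (UTF-8 here).
    out.imbue(std::locale("en_US.UTF-8"));
    out << L"caf\u00e9\n";   // U+00E9 is written as the two-byte UTF-8 sequence
}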
That said, if you are just going to write the data out to a file via binary I/O,
or if you will have custom UTF-8 processing functions, then you can do it safely.
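The binary I/O case is the simple one: once the text is UTF-8 in a std::string,
the bytes go out verbatim and no locale-based conversion is involved. For example
(the file name is just an illustration):

#include <fstream>
#include <string>

void write_utf8_bytes(const std::string &utf8)
{
    std::ofstream out("data.txt", std::ios::binary);
    out.write(utf8.data(), static_cast<std::streamsize>(utf8.size()));
}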
I'm mostly just trying to warn against throwing character sets and encodings
around because it seems like they fit better in a particular data type
(UTF-16 can be put into a std::string too).
--
Tristan Wibberley
Any opinion expressed is mine (or else I'm playing devil's advocate for
the sake of a good argument). My employer had nothing to do with this
communication.