Panjandrum wrote:
Rolf Magnus wrote: Panjandrum wrote: red floyd wrote:
vsgdp wrote:
> Is there a unicode equivalent to std::string?
std::wstring?
That works for UTF-16, but not for UTF-8.
std::wstring does not specify any encoding type or any character set.
The most common implementations of std::wstring use the Unicode
character set with a UCS-4, UCS-2, or UTF-16 encoding.
But this is not a requirement.
Neither std::wstring nor std::string is appropriate for
variable-length character encodings like UTF-8.
std::string is appropriate for UTF-8.
But you must remember that functions like size() and find()
operate on the encoded bytes, not on the decoded code points.
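
To make the byte-versus-code-point point concrete, here is a rough sketch. The helper utf8_code_points and the variable word are made up for illustration and are not part of the standard library, and the helper assumes the input is already well-formed UTF-8.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a standard function): counts code points in a
// well-formed UTF-8 string by skipping continuation bytes (10xxxxxx).
std::size_t utf8_code_points(const std::string& s) {
    std::size_t count = 0;
    for (unsigned char c : s) {
        if ((c & 0xC0) != 0x80) {
            ++count;  // lead byte or plain ASCII byte starts a new code point
        }
    }
    return count;
}

// "naive" with a diaeresis on the i, spelled out as UTF-8 bytes:
// U+00EF is encoded as the two bytes 0xC3 0xAF.
const std::string word = "na" "\xC3\xAF" "ve";
```

Here word.size() is 6 (bytes), utf8_code_points(word) is 5, and word.find("ve") returns the byte offset 4, although "ve" starts at the fourth code point (index 3).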
If that byte-versus-character mismatch were your objection, then you
could also say that std::wstring is not appropriate for any Unicode
encoding, because of combining characters (i.e. the string length
won't match the number of displayed characters).
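
A small sketch of the combining-character issue, using only code points from the Basic Multilingual Plane so that it should behave the same whether wchar_t is 16 or 32 bits wide. The variable names are just for illustration.

```cpp
#include <string>

// "e with acute accent" as a single precomposed code point, U+00E9.
const std::wstring precomposed = L"\u00E9";

// The same visible character written as 'e' followed by
// U+0301 COMBINING ACUTE ACCENT.
const std::wstring decomposed = L"e\u0301";
```

precomposed.size() is 1 and decomposed.size() is 2, even though both render as one character, and operator== treats them as different strings unless you normalize first.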
Of course the correct answer is that you should engage your
brain when manipulating Unicode strings, and be aware of
these issues. Many applications do indeed use std::string for
UTF-8 processing.