how to convert narrow string to wide string and vice versa?

i'm using VC++6 IDE
i know i could use macros like A2T, T2A,
but is there any more decent way to do this?

Sep 7 '06 #1
thinktwice wrote:
>i'm using VC++6 IDE
>i know i could use macros like A2T, T2A,
>but is there any more decent way to do this?

Look up std::ctype::widen and std::ctype::narrow in the <locale>
header.
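
A minimal sketch of how those two facet members can be used, assuming a
standard-conforming compiler and library (VC++ 6's library may need a
different spelling for the facet lookup). The helper names widen_string
and narrow_string are made up for illustration, and the conversion works
one character at a time, so it only suits single-byte narrow encodings:

```cpp
#include <locale>
#include <string>
#include <vector>

// Hypothetical helper: widen each char through the locale's ctype<wchar_t> facet.
std::wstring widen_string(const std::string& s,
                          const std::locale& loc = std::locale()) {
    if (s.empty()) return std::wstring();
    const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
    std::vector<wchar_t> buf(s.size());
    ct.widen(s.data(), s.data() + s.size(), &buf[0]);
    return std::wstring(buf.begin(), buf.end());
}

// Hypothetical helper: narrow each wchar_t through the same facet.
// Characters with no single-char equivalent in the locale become `fallback`.
std::string narrow_string(const std::wstring& ws, char fallback = '?',
                          const std::locale& loc = std::locale()) {
    if (ws.empty()) return std::string();
    const std::ctype<wchar_t>& ct = std::use_facet<std::ctype<wchar_t> >(loc);
    std::vector<char> buf(ws.size());
    ct.narrow(ws.data(), ws.data() + ws.size(), fallback, &buf[0]);
    return std::string(buf.begin(), buf.end());
}
```

With the default locale, widen_string("hello") gives L"hello" and
narrow_string(L"hello") gives "hello"; wide characters that have no
narrow equivalent in the locale come back as the fallback character.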

Regards,
Bart.

Sep 7 '06 #2

Bart wrote:
>Look up std::ctype::widen and std::ctype::narrow in the <locale>
>header.

These may not be much good for Unicode or other variable width
encodings - depends on how you use the resultant strings.

It's a tricky thing to deal with. If you properly understand what you
mean by 'narrow' and 'wide' strings the solution should present itself.
If you're not sure what the string content means then you're unlikely
to find the right solution in a library because you won't know how to
use the functions or their results properly.
K

Sep 7 '06 #3
Kirit Sælensminde wrote:
>These may not be much good for Unicode or other variable width
>encodings - depends on how you use the resultant strings.
>
>It's a tricky thing to deal with. If you properly understand what you
>mean by 'narrow' and 'wide' strings the solution should present itself.
>If you're not sure what the string content means then you're unlikely
>to find the right solution in a library because you won't know how to
>use the functions or their results properly.

well, if you just want a quick ugly hack, then personally i've sometimes
used:

// needs <string>; each wchar_t is copied into a char, truncating what doesn't fit
std::wstring wide(L"some wide character string");
std::string narrow(wide.begin(), wide.end());

But this is a cleaving axe for microsurgery: it depends on the wide
code points having the same values as the character set used in the
string, which is only really true if the wstring is Unicode, contains
only ISO-8859-1 characters (0-255), and the normal character encoding
is ISO-8859-1 or similar (what char holds depends on the platform).
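
To make those preconditions explicit rather than silently truncating, a
checked variant of the hack could look something like this. It is only a
sketch: narrow_latin1 is a made-up name, and it assumes the wstring holds
Unicode code points and the target really is ISO-8859-1:

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: narrow a wide string only when every character
// fits in ISO-8859-1 (code points 0-255); otherwise refuse.
std::string narrow_latin1(const std::wstring& wide) {
    std::string out;
    out.reserve(wide.size());
    for (std::wstring::size_type i = 0; i < wide.size(); ++i) {
        // Go through an unsigned type so a signed wchar_t cannot slip past the check.
        unsigned long cp = static_cast<unsigned long>(wide[i]);
        if (cp > 0xFF) {
            throw std::range_error("character outside ISO-8859-1");
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
    }
    return out;
}
```

Here narrow_latin1(L"caf\u00E9") yields the four bytes 'c', 'a', 'f',
0xE9, while a string containing the euro sign (U+20AC) throws instead
of producing garbage.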

I would actually be interested in seeing what the "clean" solution for
converting is when you have, say, Unicode in wchar_t's and whatever
encoding the locale specifies in char's (ISO-8859-1, or maybe
windows-1252) :)

//deice [deice at deice dot cjb dot net]
//Arne Pajunen
Sep 7 '06 #4

Arne 'deice' Pajunen wrote:
>well, if you just want a quick ugly hack, then personally i've sometimes
>used:
>
>wstring wide(L"some wide character string");
>string narrow(wide.begin(), wide.end());
>
>But this is a cleaving axe for microsurgery: It depends on wide having
>equivalent encoding codepoints to the charset in string, which is only
>really true if wstrings are unicode, contain only ISO-8859-1 characters
>(0-255), and normal character encoding is ISO-8859-1 or similar. (char
>type, depends on platform).
>
>I would actually be interested in seeing what the "clean" solution for
>converting is when you have, say, Unicode in wchar_t's and whatever
>encoding the locale specifies in char's (ISO-8859-1, or maybe
>windows-1252) :)

The first step is to convert the UTF-16 (which is normal for wchar_t
on Windows; some platforms/compilers, such as GCC on Linux, use UTF-32,
in which case there is nothing to do here) to UTF-32. Then convert that
down to the target encoding (often with a code table, but sometimes
algorithmically). Of course there's the open question of what to do
with characters that don't/can't map. In some applications you can use
a different way of encoding the characters (as distinct from the
character set). For example, if you're using ISO-8859-1 in XML/HTML
you can use the numeric character references (such as &#8364; for the
euro sign) that XML/HTML defines for this.
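
As a rough sketch of those two steps, assuming a C++11 compiler (so not
VC++ 6) and ISO-8859-1 inside XML/HTML as the target: the function names
are invented for illustration, and replacing unpaired surrogates with
U+FFFD is just one possible policy.

```cpp
#include <cstddef>
#include <string>

// Step 1: decode UTF-16 code units into UTF-32 code points.
// Unpaired surrogates become U+FFFD (an assumed policy, not the only one).
std::u32string utf16_to_utf32(const std::u16string& in) {
    std::u32string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t u = in[i];
        if (u >= 0xD800 && u <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            char32_t lo = in[++i];  // consume the low surrogate as well
            out.push_back(0x10000 + ((u - 0xD800) << 10) + (lo - 0xDC00));
        } else if (u >= 0xD800 && u <= 0xDFFF) {
            out.push_back(0xFFFD);
        } else {
            out.push_back(u);
        }
    }
    return out;
}

// Step 2: convert down to ISO-8859-1, which is algorithmic because its
// 256 characters are the first 256 Unicode code points. Anything else is
// written as a numeric character reference. Markup characters such as
// '&' and '<' are NOT escaped here.
std::string utf32_to_latin1_xml(const std::u32string& in) {
    std::string out;
    for (char32_t cp : in) {
        if (cp <= 0xFF) {
            out.push_back(static_cast<char>(static_cast<unsigned char>(cp)));
        } else {
            out += "&#" + std::to_string(static_cast<unsigned long>(cp)) + ";";
        }
    }
    return out;
}
```

For a target such as windows-1252 the second step would need a lookup
table instead, since its 0x80-0x9F range does not line up with Unicode.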

A full answer depends on what you are using the string for which is why
it's so hard to answer. For some things your solution is perfectly
valid - it's fine for the many parts of internet protocols which are
defined to use ASCII characters only.

For our framework we're looking at using ICU to do the conversions, but
haven't had much of a chance to play with it yet. As nearly 100% of the
interactions we do are through HTTP then we just use UTF-8 and that
solves nearly the whole problem. We have found it useful to define our
own std::wstring-like class that uses UTF-32 at the single character
interface points (operator[] and at() etc.) but uses UTF-16 for
character sequences. Things like substr() use the correct position and
count based on the number of UTF-32 characters _not_ the number of
UTF-16 code units, so applications can't chop some characters in half.
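
This is not our framework's code, just a small sketch of the idea,
assuming C++11's std::u16string as the UTF-16 storage. All the names
are hypothetical, and positions are found by a linear scan:

```cpp
#include <cstddef>
#include <string>

inline bool is_high_surrogate(char16_t u) { return u >= 0xD800 && u <= 0xDBFF; }
inline bool is_low_surrogate(char16_t u)  { return u >= 0xDC00 && u <= 0xDFFF; }

// Number of UTF-16 code units used by the character starting at index i.
inline std::size_t char_width(const std::u16string& s, std::size_t i) {
    return (is_high_surrogate(s[i]) && i + 1 < s.size() &&
            is_low_surrogate(s[i + 1])) ? 2 : 1;
}

// Length in characters (code points), not in UTF-16 code units.
std::size_t length_cp(const std::u16string& s) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < s.size(); i += char_width(s, i)) ++n;
    return n;
}

// substr() where pos and count are measured in characters, so a
// surrogate pair is always kept or dropped as a whole.
std::u16string substr_cp(const std::u16string& s, std::size_t pos,
                         std::size_t count) {
    std::size_t begin = 0;
    for (; pos > 0 && begin < s.size(); --pos) begin += char_width(s, begin);
    std::size_t end = begin;
    for (; count > 0 && end < s.size(); --count) end += char_width(s, end);
    return s.substr(begin, end - begin);
}
```

For u"a\U0001F600b" the stored size() is 4 code units, but length_cp()
reports 3 and substr_cp(s, 1, 1) returns the whole surrogate pair.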
K

Sep 7 '06 #5
