
string to wstring problems

P: n/a
1. Why is the following program not working as expected?

#include <iostream>
using namespace std;

int main()
{
string t("test");
wcout << (wchar_t *) t.c_str() << endl;
wcout << t.c_str() << endl;

wstring t2 = (wchar_t *) t.c_str();
wcout << t2.c_str() << endl;

return 0;
}
2. It is acceptable that there is no conversion from wstring to
string, but why is there no conversion (wstring::wstring(string))
from string to wstring?

Apr 25 '07 #1
10 Replies


P: n/a
v4vijayakumar wrote:
1. why the following program is not working as expected?
Depends on what your expectations are.
#include <iostream>
using namespace std;

int main()
{
string t("test");
wcout << (wchar_t *) t.c_str() << endl;
wcout << t.c_str() << endl;

wstring t2 = (wchar_t *) t.c_str();
wcout << t2.c_str() << endl;

return 0;
}
2. It is acceptable that there is no conversion from wstring to
string, but,
Why there is no conversion (wstring::wstring(string )) from string to
wstring?
Try to #include <cstdlib> and use mbstowcs(). (That's what Google tells me...)
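For what it's worth, here is a minimal sketch of the mbstowcs() route. The function name widen_mbstowcs is made up for illustration, and the sketch assumes the input is a valid multibyte string in the current C locale (plain ASCII in the default "C" locale). Note that mbstowcs() stops at the first null character, so embedded nulls are not preserved.

```cpp
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: converts a multibyte string to a wide string
// using the current C locale (see std::setlocale).
std::wstring widen_mbstowcs(const std::string& source)
{
    // A multibyte string never yields more wide characters than it has
    // bytes, so size() + 1 elements (one for the terminator) are enough.
    std::vector<wchar_t> buffer(source.size() + 1);

    const std::size_t written =
        std::mbstowcs(&buffer[0], source.c_str(), buffer.size());

    // mbstowcs() reports an invalid sequence as (size_t)-1.
    if (written == static_cast<std::size_t>(-1))
        throw std::runtime_error("invalid multibyte sequence");

    return std::wstring(&buffer[0], written);
}
```

Usage would look like: std::wstring t2 = widen_mbstowcs(t); std::wcout << t2 << std::endl;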

Bjoern
Apr 25 '07 #2

P: n/a
v4vijayakumar wrote:
1. why the following program is not working as expected?

[program redacted]

Define "as expected". What were you expecting, and what did you get?

See FAQ 5.8, http://www.parashift.com/c++-faq-lit...t.html#faq-5.8
Apr 25 '07 #3

P: n/a
On Apr 25, 8:17 am, v4vijayakumar <vijayakumar.subbu...@gmail.com>
wrote:
1. why the following program is not working as expected?
What do you expect?
#include <iostream>
using namespace std;
int main()
{
string t("test");
wcout << (wchar_t *) t.c_str() << endl;
You're lying to the compiler. That's generally a good way of
getting into trouble. The address returned by t.c_str() does
NOT point to wchar_t objects.
wcout << t.c_str() << endl;
wstring t2 = (wchar_t *) t.c_str();
More lies.
wcout << t2.c_str() << endl;

return 0;

}
2. It is acceptable that there is no conversion from wstring to
string, but,
Why there is no conversion (wstring::wstring(string )) from string to
wstring?
Because for some stupid reason, they're both instantiations of a
template, and there is no generic solution.

Also, of course, because the conversion would have to be locale
specific. (But that doesn't explain why there is no
string::toWString( locale const& ) function. That has to be
chalked up to the design error of making std::string a
template.)
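To make the point about the casts concrete: the cast only changes the type of the pointer, not the bytes it points to, so wcout ends up reading sizeof(wchar_t) bytes per "character" from a buffer of plain chars. A rough sketch of a conversion that does not lie to the compiler, assuming the string holds only 7-bit ASCII (widen_ascii is a name invented here, and this ignores the locale issue entirely):

```cpp
#include <string>

// Hypothetical helper: builds a wstring by converting each char to
// wchar_t individually. Nothing is reinterpreted in place.
// Assumption: source contains only 7-bit ASCII; for anything else the
// result depends on the encoding and on whether char is signed.
std::wstring widen_ascii(const std::string& source)
{
    // The iterator-range constructor copies element by element,
    // applying the implicit char -> wchar_t conversion to each one.
    return std::wstring(source.begin(), source.end());
}
```

With that, wstring t2 = widen_ascii(t); replaces the cast in the original program.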

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Apr 26 '07 #4

P: n/a
On 26 Apr 2007 06:12:41 -0700, James Kanze <ja*********@gmail.com>
wrote:
>2. It is acceptable that there is no conversion from wstring to
string, but,
Why there is no conversion (wstring::wstring(string )) from string to
wstring?

Because for some stupid reason, they're both instantiations of a
template, and there is no generic solution.

Also, of course, because the conversion would have to be locale
specific. (But that doesn't explain why there is no
string::toWString( locale const& ) function. That has to be
chalked up to the design error of making std::string a
template.)
Do you really mean that it should be a member of std::string? I'd
rather go for a namespace scope function:

std::wstring widen( const std::string &,
const std::locale & = std::locale() );

--
Gennaro Prota
https://sourceforge.net/projects/breeze/
Apr 26 '07 #5

P: n/a
On Apr 27, 2:56 am, Gennaro Prota <address@spam_this.com> wrote:
std::wstring widen( const std::string &,
const std::locale & = std::locale() );
If you were generalising this I think it would need two locales. One
the std::string is in and one the std::wstring should be in.
There's no guarantee that the std::wstring is going to be some form of
Unicode string.
K

Apr 27 '07 #6

P: n/a
Gennaro Prota wrote:
On 26 Apr 2007 06:12:41 -0700, James Kanze <ja*********@gmail.com>
wrote:
2. It is acceptable that there is no conversion from
wstring to string, but, Why there is no conversion
(wstring::wstring(string )) from string to wstring?
Because for some stupid reason, they're both instantiations of a
template, and there is no generic solution.
Also, of course, because the conversion would have to be locale
specific. (But that doesn't explain why there is no
string::toWString( locale const& ) function. That has to be
chalked up to the design error of making std::string a
template.)
Do you really mean that it should be a member of std::string? I'd
rather go for a namespace scope function:
std::wstring widen( const std::string &,
const std::locale & = std::locale() );
Both ways can be made to work. I don't think it really changes
the issues. We have two classes, std::string and std::wstring,
which really require a slightly different interface. And of
course, in practice, you can't really instantiate basic_string
for anything else, and expect it to work.

There's also an interesting question: if std::string is supposed
to represent text, shouldn't it know in what encoding it is?
(This would very strongly argue for a member, of course.) But
of course, if std::string is supposed to represent text, we also
get a number of awkward questions with regards to multi-byte
characters.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Apr 27 '07 #7

P: n/a
On Apr 27, 4:59 am, Kirit Sælensminde <kirit.saelensmi...@gmail.com>
wrote:
On Apr 27, 2:56 am, Gennaro Prota <address@spam_this.com> wrote:
std::wstring widen( const std::string &,
const std::locale & = std::locale() );
If you were generalising this I think it would need two locales. One
the std::string is in and one the std::wstring should be in.
There's no g'tee that the std::wstring is going to be some form of
Unicode string.
The design of locale doesn't allow for this, and to be truthful,
I don't really see how it could. You'd need some way of
creating a codecvt facet on the fly, from the two different
locales.

The design of locale does permit different encodings for
wchar_t, of course. But you'll need nxm different locales, for
n encodings of wchar_t and m encodings of char, in order to make
it work.

If you'll look at the specifications of codecvt, you'll see as
well that it is designed to always go to or from char. There is
a very pervasive underlying assumption that char is the only
external representation, and that conversion is between external
and internal.

(Note that I think your point is well taken. I just don't think
that there is a good practical answer to it at present.)
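For reference, a sketch of what going through codecvt looks like, with char as the external side and wchar_t as the internal side. The name widen_with_codecvt is invented for this example, and the error handling is deliberately crude. It assumes the whole input is available at once and that the locale's facet can decode it.

```cpp
#include <cstddef>
#include <cwchar>
#include <locale>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts external (char) text to internal
// (wchar_t) text using the codecvt facet of the given locale.
std::wstring widen_with_codecvt(const std::string& source,
                                const std::locale& loc = std::locale())
{
    typedef std::codecvt<wchar_t, char, std::mbstate_t> codecvt_type;

    if (source.empty())
        return std::wstring();

    const codecvt_type& cvt = std::use_facet<codecvt_type>(loc);

    // One wide character per input byte is an upper bound on the output.
    std::wstring dest(source.size(), wchar_t());

    std::mbstate_t state = std::mbstate_t();
    const char* from_next = 0;
    wchar_t* to_next = 0;

    const std::codecvt_base::result result = cvt.in(
        state,
        source.data(), source.data() + source.size(), from_next,
        &dest[0], &dest[0] + dest.size(), to_next);

    if (result == std::codecvt_base::noconv) {
        // The facet claims no conversion is needed: widen element-wise.
        return std::wstring(source.begin(), source.end());
    }
    if (result != std::codecvt_base::ok)
        throw std::runtime_error("conversion failed or input incomplete");

    // Drop the unused tail of the buffer.
    dest.resize(static_cast<std::size_t>(to_next - &dest[0]));
    return dest;
}
```

Note that the single locale argument fixes both the char encoding and the wchar_t encoding at once, which is exactly the limitation described above.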

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Apr 27 '07 #8

P: n/a
On 27 Apr 2007 02:33:58 -0700, James Kanze <ja*********@gmail.com>
wrote:
>Do you really mean that it should be a member of std::string? I'd
rather go for a namespace scope function:
> std::wstring widen( const std::string &,
const std::locale & = std::locale() );

Both ways can be made to work. I don't think it really changes
the issues. We have two classes, std::string and std::wstring,
which really require a slightly different interface. And of
course, in practice, you can't really instantiate basic_string
for anything else, and expect it to work.

There's also an interesting question: if std::string is supposed
to represent text, shouldn't it know in what encoding it is?
Yep, I guess so :-( So we should ask: what does std::string really
represent? It seems to me that the answer is: a sequence of small
integers --not very much different from an std::vector< char >, except
that its interface makes it look like it were representing text.
>(This would very strongly argue for a member, of course.) But
of course, if std::string is supposed to represent text, we also
get a number of awkward questions with regards to multi-byte
characters.
IIUC, the issue is that we have no abstractions for "character" and
"encoding". The expression "multi-byte character" is a misnomer too:
in fact there's a character, and several possible encodings of it,
some of which require multiple bytes. Now, is this arguing for a
generic CharT again? :-)
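A small illustration of the "sequence of small integers" view, assuming the bytes happen to be well-formed UTF-8 (std::string itself neither knows nor checks this). The helper name count_utf8_code_points is invented for the example.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in a string that is assumed
// to contain well-formed UTF-8. No validation is performed.
std::size_t count_utf8_code_points(const std::string& bytes)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < bytes.size(); ++i) {
        const unsigned char b = static_cast<unsigned char>(bytes[i]);
        // Continuation bytes have the form 10xxxxxx; every other byte
        // starts a new code point.
        if ((b & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

For the two-byte UTF-8 encoding of U+00E9, std::string("\xC3\xA9").size() reports 2 while the helper reports 1: one character, one of its encodings taking two chars.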

--
Gennaro Prota
https://sourceforge.net/projects/breeze/
Apr 27 '07 #9

P: n/a
On 26 Apr 2007 19:59:56 -0700, Kirit Sælensminde
<ki****************@gmail.com> wrote:
>On Apr 27, 2:56 am, Gennaro Prota <address@spam_this.com> wrote:
> std::wstring widen( const std::string &,
const std::locale & = std::locale() );

If you were generalising this I think it would need two locales. One
the std::string is in and one the std::wstring should be in.
There's no g'tee that the std::wstring is going to be some form of
Unicode string.
Not that I disagree with your general point but... the locale
parameter was actually for the wstring (from what you say I'm under
the impression you are assuming it is for the source string, and that
another one would be needed for the destination). Basically the idea
was:

// warning: uncompiled code
std::wstring to_wstring( const std::string & source,
const std::locale & loc = std::locale() )
{
typedef std::ctype< wchar_t > ctype;
typedef std::string::size_type size_type;

const size_type len( source.length() );
std::wstring dest( len, wchar_t() );

const ctype & ct( std::use_facet< ctype >( loc ) );
for( size_type i( 0 ); i < len; ++i ) {
std::wstring::traits_type
::assign( dest[ i ], ct.widen( source[ i ] ) );
}

return dest;
}

Not terribly useful, I'm afraid.
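As an aside, the same idea can probably be written without the explicit loop, since ctype also has a range overload of widen(). A sketch only, with to_wstring_range as a made-up name. It has the same limitation as the loop version: each char is widened on its own, so multibyte encodings are not handled.

```cpp
#include <locale>
#include <string>

// Hypothetical variant of the function above, using the range overload
// ctype<wchar_t>::widen(begin, end, destination).
std::wstring to_wstring_range(const std::string& source,
                              const std::locale& loc = std::locale())
{
    std::wstring dest(source.size(), wchar_t());

    if (!source.empty()) {
        const std::ctype<wchar_t>& ct =
            std::use_facet< std::ctype<wchar_t> >(loc);

        // Widens every char in [data, data + size) into dest.
        ct.widen(source.data(), source.data() + source.size(), &dest[0]);
    }

    return dest;
}
```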

--
Gennaro Prota
https://sourceforge.net/projects/breeze/
Apr 27 '07 #10

P: n/a
On Apr 27, 5:43 pm, Gennaro Prota <address@spam_this.com> wrote:
On 26 Apr 2007 19:59:56 -0700, Kirit Sælensminde
<kirit.saelensmi...@gmail.com> wrote:
On Apr 27, 2:56 am, Gennaro Prota <address@spam_this.com> wrote:
std::wstring widen( const std::string &,
const std::locale & = std::locale() );
If you were generalising this I think it would need two locales. One
the std::string is in and one the std::wstring should be in.
There's no g'tee that the std::wstring is going to be some form of
Unicode string.

Not that I disagree with your general point but... the locale
parameter was actually for the wstring (from what you say I'm under
the impression you are assuming it is for the source string, and that
another one would be needed for the destination).
That was kind of what I was thinking and you and James have both
pointed out it doesn't work with the current locale system in C++.

I don't know the locale stuff well enough, but conceptually it needs
to go through some all-encompassing encoding and then back out. Or at
least that's a little more practical than a full conversion matrix. I
think for many people Unicode could be the all-encompassing encoding,
but I'm aware that it won't do so for everybody.
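Roughly what I mean by going through a common encoding, as a sketch: decode the source into Unicode code points, then encode those into the target. The example pairs Latin-1 as the source with UTF-8 as the target purely for illustration; the names code_point, decode_latin1 and encode_utf8 are invented here, and with n decoders plus m encoders you avoid writing n times m direct converters.

```cpp
#include <cstddef>
#include <string>
#include <vector>

typedef unsigned long code_point;  // hypothetical pivot type

// Step 1: source encoding -> code points. In Latin-1 every byte value
// is the code point itself.
std::vector<code_point> decode_latin1(const std::string& bytes)
{
    std::vector<code_point> out;
    for (std::size_t i = 0; i < bytes.size(); ++i)
        out.push_back(static_cast<unsigned char>(bytes[i]));
    return out;
}

// Step 2: code points -> target encoding (UTF-8 here).
// Assumption: every value is a valid code point (no range checks).
std::string encode_utf8(const std::vector<code_point>& points)
{
    std::string out;
    for (std::size_t i = 0; i < points.size(); ++i) {
        const code_point cp = points[i];
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}
```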

It's a tricky problem all right. It's one reason that I generally
refuse to work with systems which don't have at least UTF-16. There
are still subtle problems with most of them (they tend to count UTF-16
codes rather than actual characters), but it's less likely to cause a
problem in practice than UTF-8.
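To show what I mean by counting codes rather than characters, a sketch assuming well-formed UTF-16 held in a std::u16string (a newer standard-library type than anything discussed so far; count_utf16_code_points is a made-up name):

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: counts code points in well-formed UTF-16.
// A code point outside the BMP is stored as a surrogate pair, i.e. two
// code units, so size() over-counts such characters.
std::size_t count_utf16_code_points(const std::u16string& text)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char16_t unit = text[i];
        // Low (trailing) surrogates lie in 0xDC00..0xDFFF; skip them so
        // each surrogate pair is counted once, via its high surrogate.
        if (unit < 0xDC00 || unit > 0xDFFF)
            ++count;
    }
    return count;
}
```

For u"a\U0001F600", size() gives 3 code units while the helper gives 2 code points. Even that is not "characters" in the user's sense once combining marks are involved.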

I can't see it as any other than a desperately hard problem with no
universal solution. The one thing with that though is it seems to have
slowed progress on Unicode which, although not perfect, does at least
give a pretty practical solution for most uses.
K

Apr 27 '07 #11
