Bytes IT Community

string conversion

P: n/a
LuB
I have to convert a wide (16bit) char to a standard (8bit) char.

Please never mind that I am using WideCharToMultiByte ... but my
question is as follows:

Does this approach make any sense? My main thought is I hate allocating
a new char* buf ... and then simply assigning it again to the
std::string out parameter. I'd prefer to REPLACE the underlying char*
in the std::string or somehow explicitly reallocate the std::string's
internal length and somehow pass it as a parameter to the conversion
function.

But I'm just not sure how to do that. Obviously, out.c_str() is const.
Maybe I'm asking about "an idiom" to write chars to a std::string using
C-style functions without reallocating and copying a lot of memory
around. I hope I'm missing something obvious. Please take it easy on
me. :-)

    void Convert(const std::wstring& in, const std::string& out)
    {
        // determine how large to make the new buffer
        int bufSize = in->size() + 1;

        // create the new buffer
        char* buf = new char[bufSize];

        // write to the new buffer
        ::WideCharToMultiByte(CP_ACP, 0, (*this), -1, buf,
                              bufSize, NULL, NULL);

        // assignment
        out = buf;

        // cleanup
        delete buf;
    }

Thanks in advance,

-Luther

Jul 6 '06 #1
7 Replies


P: n/a
LuB

Sorry ... too much refactoring. Here is a better function. Please
ignore the casts.

    inline void ToMultiByteStandard(const std::wstring& in,
                                    std::string& out)
    {
        // determine how large to make the new buffer
        // (assumes at most one output byte per wide character)
        int bufSize = (int)(LONG_PTR)(size_t)in.size() + 1;

        // create the new buffer
        char* buf = new char[bufSize];

        // write to the new buffer
        ::WideCharToMultiByte(CP_ACP, 0, in.c_str(), -1, buf,
                              bufSize, NULL, NULL);

        // assignment
        out = buf;

        // cleanup: memory from new[] must be released with delete[]
        delete[] buf;
    }

Jul 6 '06 #2

P: n/a
Davlet Panech

LuB wrote:
> I have to convert a wide (16bit) char to a standard (8bit) char.
>
> Please never mind that I am using WideCharToMultiByte ... but my
> question is as follows:
>
> Does this approach make any sense? My main thought is I hate allocating
> a new char* buf ... and then simply assigning it again to the
> std::string out parameter. I'd prefer to REPLACE the underlying char*
> in the std::string or somehow explicitly reallocate the std::string's
> internal length and somehow pass it as a parameter to the conversion
> function.
>
> But I'm just not sure how to do that. Obviously, out.c_str() is const.
> Maybe I'm asking about "an idiom" to write chars to a std::string using
> C-style functions without reallocating and copying a lot of memory
> around. I hope I'm missing something obvious. Please take it easy on
> me. :-)

There's no such "idiom". There's no way to change the underlying buffer
in std::string (no public methods for doing that), and there cannot be,
since std::string isn't even required (AFAIK) to store its characters
in any particular way (i.e., it could store them in pieces rather than a
single contiguous buffer). How it is done differs between compiler/std
lib implementations.

> void Convert(const std::wstring& in, const std::string& out)
> {
>     // determine how large to make the new buffer
>     int bufSize = in->size() + 1;
>
>     // create the new buffer
>     char* buf = new char[bufSize];
>
>     // write to the new buffer
>     ::WideCharToMultiByte(CP_ACP, 0, (*this), -1, buf,
>                           bufSize, NULL, NULL);

BTW I don't know exactly what WideCharToMultiByte does, but judging by its
name it converts wchar_t sequences to multi-byte-encoded byte sequences.
Your calculation for bufSize may be wrong, because each wide character
may be represented by more than 1 byte. But you'll have to read the Windows
documentation or ask in a relevant newsgroup, because platform-specific
stuff is off-topic here.

D.
Jul 6 '06 #3
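
Since C++11 the standard guarantees that std::string stores its characters contiguously, so a C-style function can write straight into a pre-sized string through &out[0] (or through the non-const data() from C++17 on). Below is a minimal sketch of that idiom, with a size query first so the buffer length is not guessed. It uses the standard std::wcsrtombs in place of WideCharToMultiByte to stay portable. The function name narrow and the choice to throw on failure are assumptions made for illustration; neither comes from the thread.

```cpp
#include <cstddef>
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper (not from the thread): converts a wide string to a
// multibyte string using the current C locale.
std::string narrow(const std::wstring& in)
{
    // Pass 1: with a null destination, wcsrtombs reports how many bytes
    // the result needs, not counting the terminating null.
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = in.c_str();
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        throw std::runtime_error("wide character has no multibyte form in this locale");

    // Pass 2: size the string once and let the C function write into it.
    // Writing through &out[0] relies on contiguous storage (C++11 and later).
    std::string out(needed, '\0');
    if (needed != 0) {
        state = std::mbstate_t();
        src = in.c_str();
        std::wcsrtombs(&out[0], &src, needed, &state);
    }
    return out;
}
```

The same two-pass shape should carry over to WideCharToMultiByte, which returns the required buffer size when it is called with an output size of 0. That also addresses the concern that one wide character may need more than one byte.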

P: n/a
LuB

Davlet Panech wrote:
> [ ... ]

Thanks Davlet,

-LuB

Jul 6 '06 #4

P: n/a
LuB posted:
> I have to convert a wide (16bit) char to a standard (8bit) char.

Maybe something like:

    #include <cstddef>
    #include <cstring>

    // Converts a null-terminated char string element by element.
    // The caller owns the returned array and must delete[] it.
    template<class DestT>
    DestT * const ConvertString( char const *p_source,
                                 DestT (* const Convertor)(char) )
    {
        using std::size_t;

        size_t const len = std::strlen(p_source);

        DestT * const p_buffer = new DestT[len + 1];

        DestT *p_dest = p_buffer;

        // copies up to and including the terminating null
        while ( *p_dest++ = Convertor(*p_source++) );

        return p_buffer;
    }

    wchar_t Convertor(char const c)
    {
        return c;
    }

    int main()
    {
        wchar_t * const p = ConvertString("I like ice-cream.",
                                          Convertor);

        delete [] p;
    }


--

Frederick Gotham
Jul 6 '06 #5

P: n/a
LuB

Frederick Gotham wrote:
> [ ... ]

I thought about doing something like this. It's nice to see someone else
with a similar idea. I will have to profile both approaches and see
what kind of performance gain this approach has.

I'd hoped there was no black magic to something like converting chars
to wchars ... but Bruce Sutter and Scott Meyers always manage to
convince me otherwise ;-) I didn't know if UNICODE, UTF-8 and/or ANSI
were or should be involved in the thought process regarding the type of
approach you exemplified here. I hope it is just as easy as copying
over the integer value of each character.

Many thanks,

-Luther

Jul 6 '06 #6

P: n/a

"LuB" <lu*********@yahoo.com> wrote in message
news:11*********************@m79g2000cwm.googlegroups.com...
> Frederick Gotham wrote:
>> [ ... ]
>
> I thought about doing something like this. It's nice to see someone else
> with a similar idea. I will have to profile both approaches and see
> what kind of performance gain this approach has.
>
> I'd hoped there was no black magic to something like converting chars
> to wchars ... but Bruce Sutter and Scott Meyers always manage to
> convince me otherwise ;-) I didn't know if UNICODE, UTF-8 and/or ANSI
> were or should be involved in the thought process regarding the type of
> approach you exemplified here. I hope it is just as easy as copying
> over the integer value of each character.

Something like this might work for wchar_t to char (which the OP asked
about) as long as the wide chars are defined Unicode characters with
values less than 0x100. However, it cannot work for char to wchar_t for
characters with values > 0x7F && < 0xA0, since I believe this range is
undefined in Unicode.

(At least assuming the conversion from char to wchar_t doesn't do the
Unicode mapping.)
Jul 6 '06 #7
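
The range condition in the reply above can be made explicit in code. This is a minimal sketch, assuming the wide string holds Unicode code points and that only values below 0x100 are acceptable. The name narrow_below_0x100 and the bool-return convention are hypothetical and do not come from the thread.

```cpp
#include <string>

// Hypothetical helper: copies each wide character into a char only if its
// value is below 0x100. Returns false and leaves `out` empty otherwise.
bool narrow_below_0x100(const std::wstring& in, std::string& out)
{
    out.clear();
    out.reserve(in.size());
    for (std::wstring::size_type i = 0; i < in.size(); ++i) {
        // Go through an unsigned type so a signed wchar_t cannot slip
        // past the range check as a negative value.
        const unsigned long value = static_cast<unsigned long>(in[i]);
        if (value >= 0x100UL) {
            out.clear();
            return false;
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(value)));
    }
    return true;
}
```

Reporting failure instead of silently truncating keeps the caller aware that a plain value copy only covers part of the wchar_t range.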

P: n/a
In article <HU*******************@news.indigo.ie>, fg*******@SPAM.com
says...

[ ... ]

Semi-changing the subject -- just a couple of comments on the code itself.

> #include <cstddef>
> #include <cstring>
>
> template<class DestT>
> DestT * const ConvertString( char const *p_source, DestT (* const
> Convertor)(char) )

Unless there was a _really_ good reason to do otherwise, I'd use a
container of some sort instead of returning a raw pointer. If you
have some reason not to use a wstring, then you could surely use a
std::vector<DestT> instead.

> {
>     using std::size_t;
>
>     size_t const len = std::strlen(p_source);
>
>     DestT * const p_buffer = new DestT[len + 1];
>
>     DestT *p_dest = p_buffer;
>
>     while ( *p_dest++ = Convertor(*p_source++) );
>
>     return p_buffer;

First I'd note that this function appears to be pretty much
equivalent to one of the overloads of ctype::widen, so there may not
be any reason to write it at all. Going from memory, the only obvious
difference is that ctype::widen expects you to pre-allocate the
destination memory.

If you weren't going to do that, I'd still consider something like
this:

    size_t len = strlen(p_source);
    std::vector<DestT> p_buffer(len);
    std::transform(p_source, p_source + len, p_buffer.begin(), Convertor);
    return p_buffer;

> wchar_t Convertor(char const c)
> {
>     return c;
> }

If you decide to retain the ConvertString function above, you'd
probably still want to use the single-character overload of
ctype::widen to do the conversion itself. At least in theory it
should know how to handle things like converting a character with its
high-bit set to the appropriate value in the target type.

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jul 9 '06 #8
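
The ctype::widen suggestion above might look like the following. This is a sketch under assumptions: the function name widen_string is hypothetical, and the default argument uses a default-constructed std::locale (a copy of the global locale), which may or may not be the right choice for a given program. It returns a std::wstring, not a raw pointer, in line with the container advice in the same post.

```cpp
#include <cstddef>
#include <cstring>
#include <locale>
#include <string>

// Hypothetical helper: widens a narrow C string with the range overload of
// std::ctype<wchar_t>::widen.
std::wstring widen_string(const char* source,
                          const std::locale& loc = std::locale())
{
    const std::size_t len = std::strlen(source);

    // The range overload expects the destination to be pre-allocated.
    std::wstring result(len, L'\0');
    if (len != 0) {
        const std::ctype<wchar_t>& facet =
            std::use_facet<std::ctype<wchar_t> >(loc);
        facet.widen(source, source + len, &result[0]);
    }
    return result;
}
```

Because the result is an ordinary std::wstring, there is no delete[] for the caller to forget.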
