string conversion

LuB
I have to convert a wide (16bit) char to a standard (8bit) char.

Please never mind that I am using WideCharToMultiByte ... but my
question is as follows:

Does this approach make any sense? My main thought is I hate allocating
a new char* buf ... and then simply assigning it again to the
std::string out parameter. I'd prefer to REPLACE the underlying char*
in the std::string or somehow explicitly reallocate the std::string's
internal length and somehow pass it as a parameter to the conversion
function.

But I'm just not sure how to do that. Obviously, out.c_str() is const.
Maybe I'm asking about "an idiom" to write chars to a std::string using
C-style functions without reallocating and copying a lot of memory
around. I hope I'm missing something obvious. Please take it easy on
me. :-)

void Convert(const std::wstring& in, const std::string& out)
{
    // determine how large to make the new buffer
    int bufSize = in->size() + 1;

    // create the new buffer
    char* buf = new char[bufSize];

    // write to the new buffer
    ::WideCharToMultiByte(CP_ACP, 0, (*this), -1, buf,
                          bufSize, NULL, NULL);

    // assignment
    out = buf;

    // cleanup
    delete buf;
}

Thanks in advance,

-Luther

Jul 6 '06 #1
LuB

Sorry ... too much refactoring. Here is a better function. Please
ignore the casts.

inline void ToMultiByteStandard(const std::wstring& in,
                                std::string& out)
{
    // determine how large to make the new buffer
    int bufSize = (int)(LONG_PTR)(size_t)in.size() + 1;

    // create the new buffer
    char* buf = new char[bufSize];

    // write to the new buffer
    ::WideCharToMultiByte(CP_ACP, 0, in.c_str(), -1, buf,
                          bufSize, NULL, NULL);

    // assignment
    out = buf;

    // cleanup: memory obtained with new[] must be released with delete[]
    delete[] buf;
}

Jul 6 '06 #2
LuB wrote:
> I have to convert a wide (16bit) char to a standard (8bit) char.
>
> Please never mind that I am using WideCharToMultiByte ... but my
> question is as follows:
>
> Does this approach make any sense? My main thought is I hate allocating
> a new char* buf ... and then simply assigning it again to the
> std::string out parameter. I'd prefer to REPLACE the underlying char*
> in the std::string or somehow explicitly reallocate the std::string's
> internal length and somehow pass it as a parameter to the conversion
> function.
>
> But I'm just not sure how to do that. Obviously, out.c_str() is const.
> Maybe I'm asking about "an idiom" to write chars to a std::string using
> C-style functions without reallocating and copying a lot of memory
> around. I hope I'm missing something obvious. Please take it easy on
> me. :-)

There's no such "idiom". There's no way to change the underlying buffer
in std::string (no public methods for doing that), and there cannot be,
since std::string isn't even required (AFAIK) to store its characters
in any particular way (i.e., it could store them in pieces rather than in
a single contiguous buffer). How it is done differs between
compiler/standard library implementations.
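
For readers on C++11 or later the picture is a little different: std::string storage is guaranteed contiguous there, so a C-style function can write straight into a pre-sized string through &out[0]. Below is a minimal sketch of that approach, assuming a C++11 compiler and using the portable std::wcsrtombs as a stand-in for WideCharToMultiByte. The helper name to_multibyte is made up for illustration and is not from the original post.

```cpp
#include <cwchar>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts `in` to the multibyte encoding of the
// current C locale, writing directly into the std::string's own buffer.
// std::wcsrtombs stands in for WideCharToMultiByte (an assumption, not the
// original poster's API). Conversion stops at the first embedded null.
inline std::string to_multibyte(const std::wstring& in)
{
    // First pass: a null destination only asks for the required byte count
    // (excluding the terminating null).
    std::mbstate_t state = std::mbstate_t();
    const wchar_t* src = in.c_str();
    const std::size_t needed = std::wcsrtombs(nullptr, &src, 0, &state);
    if (needed == static_cast<std::size_t>(-1))
        throw std::runtime_error("character not representable in this locale");

    // Second pass: size the string once, then let the C function fill it.
    std::string out(needed, '\0');
    if (needed != 0) {
        state = std::mbstate_t();
        src = in.c_str();
        std::wcsrtombs(&out[0], &src, needed, &state);
    }
    return out;
}
```

Returning the string by value avoids the separate new[]/delete[] buffer entirely; the only allocation is the one std::string makes for itself.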

> void Convert(const std::wstring& in, const std::string& out)
> {
>     // determine how large to make the new buffer
>     int bufSize = in->size() + 1;
>
>     // create the new buffer
>     char* buf = new char[bufSize];
>
>     // write to the new buffer
>     ::WideCharToMultiByte(CP_ACP, 0, (*this), -1, buf,
>                           bufSize, NULL, NULL);

BTW I don't know exactly what WideCharToMultiByte does, but judging by its
name it converts wchar_t sequences to multi-byte-encoded byte sequences.
Your calculation for bufSize may be wrong, because each wide character
may be represented by more than 1 byte. But you'll have to read the Windows
documentation or ask in a relevant newsgroup, because platform-specific
stuff is off-topic here.
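
To make the sizing concern concrete, here is a small sketch that counts the bytes a wide string would need if the target encoding were UTF-8. That encoding is an assumption chosen for illustration; the CP_ACP code page used in the thread may need a different count. The helper names utf8_length and utf8_buffer_size are hypothetical.

```cpp
#include <cstddef>
#include <string>

// Bytes needed to encode one code point in UTF-8 (the assumed target encoding).
inline std::size_t utf8_length(unsigned long code_point)
{
    if (code_point < 0x80)    return 1;
    if (code_point < 0x800)   return 2;
    if (code_point < 0x10000) return 3;
    return 4;
}

// Hypothetical helper: buffer size, including the terminating null, needed
// to hold `in` as UTF-8. Surrogate pairs are not combined, so with a 16-bit
// wchar_t this is an upper bound rather than an exact count.
inline std::size_t utf8_buffer_size(const std::wstring& in)
{
    std::size_t bytes = 1; // terminating null
    for (std::wstring::size_type i = 0; i < in.size(); ++i)
        bytes += utf8_length(static_cast<unsigned long>(in[i]));
    return bytes;
}
```

A one-character string holding U+20AC (the euro sign) needs 4 bytes here, whereas in.size() + 1 would reserve only 2. On Windows itself, the documented way to get the exact figure is to call WideCharToMultiByte with the output size set to 0, which returns the required buffer size.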
D.
Jul 6 '06 #3
LuB

Davlet Panech wrote:
> Your calculation for bufSize may be wrong, because each wide character
> may be represented by more than 1 byte. But you'll have to read Windows
> documentation or ask in a relevant newsgroup, because platform-specific
> stuff is off-topic here.

Thanks Davlet,

-LuB

Jul 6 '06 #4
LuB posted:
> I have to convert a wide (16bit) char to a standard (8bit) char.

Maybe something like:

#include <cstddef>
#include <cstring>

// Converts a null-terminated char string element by element.
// The caller owns the returned buffer and must delete [] it.
template<class DestT>
DestT * const ConvertString( char const *p_source,
                             DestT (* const Convertor)(char) )
{
    using std::size_t;

    size_t const len = std::strlen(p_source);

    DestT * const p_buffer = new DestT[len + 1];

    DestT *p_dest = p_buffer;

    // copies up to and including the terminating null
    while ( *p_dest++ = Convertor(*p_source++) );

    return p_buffer;
}

wchar_t Convertor(char const c)
{
    return c;
}

int main()
{
    wchar_t * const p = ConvertString("I like ice-cream.",
                                      Convertor);

    delete [] p;
}


--

Frederick Gotham
Jul 6 '06 #5
LuB

I thought about doing something like this. It's nice to see someone else
with a similar idea. I will have to profile both approaches and see
what kind of performance gain this approach has.

I'd hoped there was no black magic to something like converting chars
to wchars ... but Herb Sutter and Scott Meyers always manage to
convince me otherwise ;-) I didn't know if UNICODE, UTF-8 and/or ANSI
were or should be involved in the thought process regarding the type of
approach you exemplified here. I hope it is just as easy as copying
over the integer value of each character.

Many thanks,

-Luther

Jul 6 '06 #6

"LuB" <lu*********@yahoo.com> wrote in message
news:11*********************@m79g2000cwm.googlegroups.com...
> I thought about doing something like this. It's nice to see someone else
> with a similar idea. I will have to profile both approaches and see
> what kind of performance gain this approach has.
>
> I'd hoped there was no black magic to something like converting chars
> to wchars ... but Herb Sutter and Scott Meyers always manage to
> convince me otherwise ;-) I didn't know if UNICODE, UTF-8 and/or ANSI
> were or should be involved in the thought process regarding the type of
> approach you exemplified here. I hope it is just as easy as copying
> over the integer value of each character.

Something like this might work for wchar_t to char (which the OP asked
about), as long as the wide chars are defined Unicode characters with
values less than 0x100. However, it cannot work for char to wchar_t for
characters with values > 0x7F && < 0xA0, since I believe Unicode assigns
that range to the C1 control codes rather than to the printable characters
that Windows code pages put there.

(At least assuming the conversion from char to wchar_t doesn't do the
Unicode mapping.)
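
The "values less than 0x100" condition can be sketched as a checked narrowing loop. This is only an illustration of copying integer values with a range test, assuming the wide characters are Unicode code points and the narrow result is meant to be Latin-1; the name narrow_latin1 is hypothetical.

```cpp
#include <string>

// Hypothetical helper: narrows by copying integer values, which is only
// meaningful when every wide character is below 0x100 (the Latin-1 range).
// Returns false, leaving `out` empty, if any character falls outside it.
inline bool narrow_latin1(const std::wstring& in, std::string& out)
{
    out.clear();
    out.reserve(in.size());
    for (std::wstring::size_type i = 0; i < in.size(); ++i) {
        const unsigned long value = static_cast<unsigned long>(in[i]);
        if (value >= 0x100) {
            out.clear();
            return false;
        }
        out.push_back(static_cast<char>(static_cast<unsigned char>(value)));
    }
    return true;
}
```

Reporting failure instead of silently truncating keeps characters such as U+20AC from being turned into an unrelated byte.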
Jul 6 '06 #7
In article <HU*******************@news.indigo.ie>, fg*******@SPAM.com
says...

[ ... ]

Semi-changing the subject -- just a couple of comments on the code itself.

> #include <cstddef>
> #include <cstring>
>
> template<class DestT>
> DestT * const ConvertString( char const *p_source, DestT (* const
> Convertor)(char) )

Unless there was a _really_ good reason to do otherwise, I'd use a
container of some sort instead of returning a raw pointer. If you
have some reason not to use a wstring, then you could surely use a
std::vector<DestT> instead.

> {
>     using std::size_t;
>
>     size_t const len = std::strlen(p_source);
>
>     DestT * const p_buffer = new DestT[len + 1];
>
>     DestT *p_dest = p_buffer;
>
>     while ( *p_dest++ = Convertor(*p_source++) );
>
>     return p_buffer;

First I'd note that this function appears to be pretty much
equivalent to one of the overloads of ctype::widen, so there may not
be any reason to write it at all. Going from memory, the only obvious
difference is that ctype::widen expects you to pre-allocate the
destination memory.

If you weren't going to do that, I'd still consider something like
this (with the return type changed to std::vector<DestT>):

    size_t len = strlen(p_source);
    std::vector<DestT> p_buffer(len);
    std::transform(p_source, p_source+len, p_buffer.begin(), Convertor);
    return p_buffer;

> wchar_t Convertor(char const c)
> {
>     return c;
> }

If you decide to retain the ConvertString function above, you'd
probably still want to use the single-character overload of
ctype::widen to do the conversion itself. At least in theory it
should know how to handle things like converting a character with its
high-bit set to the appropriate value in the target type.
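
As a rough illustration of the ctype::widen suggestion, the sketch below uses the range overload of std::ctype<wchar_t>::widen to fill a pre-sized std::wstring. The wrapper name widen_string and the default of the classic "C" locale are assumptions made for this example; a real program would pass whichever locale matches its narrow encoding.

```cpp
#include <locale>
#include <string>

// Hypothetical helper: widens a narrow string with the range overload of
// std::ctype<wchar_t>::widen, using the supplied locale (classic "C" by default).
inline std::wstring widen_string(const std::string& in,
                                 const std::locale& loc = std::locale::classic())
{
    std::wstring out(in.size(), L'\0');
    if (!in.empty()) {
        const std::ctype<wchar_t>& facet =
            std::use_facet<std::ctype<wchar_t> >(loc);
        // widen(begin, end, dest) expects the destination to be pre-allocated.
        facet.widen(in.data(), in.data() + in.size(), &out[0]);
    }
    return out;
}
```

For example, widen_string("I like ice-cream.") yields the same text as a std::wstring, with no raw pointer for the caller to delete.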

--
Later,
Jerry.

The universe is a figment of its own imagination.
Jul 9 '06 #8
