converting from windows wchar_t to linux wchar_t

Hello experts,
I am currently porting our server from Windows to Linux. Our
client runs only on Windows machines.
To avoid the wchar_t size problem (on Windows it is 2 bytes and on
Linux it is 4 bytes) we defined

#ifdef WIN32
#define t_wchar_t wchar_t
#else // LINUX
#define t_wchar_t short
#endif

On the server I get a buffer that contains a Windows t_wchar_t string,
something like

struct user_data
{
    t_wchar_t name[32];
    .....
    .....
};

All the data transfer works fine as long as the server doesn't
care what's in the string.
My problem starts when I want to print some logs on the server
using the contents of the buffer.

My question is: is there a simple way to convert a 2-byte wchar_t (Windows
version) to a 4-byte wchar_t (Linux version)?

Thanks
Aug 14 '08 #1
ya*****@gmail.com wrote:
> Hello experts,
> I am currently porting our server from Windows to Linux. Our
> client runs only on Windows machines.
> To avoid the wchar_t size problem (on Windows it is 2 bytes and on
> Linux it is 4 bytes) we defined
>
> #ifdef WIN32
> #define t_wchar_t wchar_t
> #else // LINUX
> #define t_wchar_t short
> #endif

You might be better off with a typedef, although it's not a very
significant difference. Also, for some reason I seem to remember that
wchar_t is an unsigned type. Since 'char' is often signed (though
different from 'signed char', of course), perhaps I remember incorrectly...

> On the server I get a buffer that contains a Windows t_wchar_t string,
> something like
>
> struct user_data
> {
>     t_wchar_t name[32];
>     .....
>     .....
> };
>
> All the data transfer works fine as long as the server doesn't
> care what's in the string.
> My problem starts when I want to print some logs on the server
> using the contents of the buffer.

What kind of "problem"?

> My question is: is there a simple way to convert a 2-byte wchar_t (Windows
> version) to a 4-byte wchar_t (Linux version)?

What's the problem? Can't you just copy (and it will expand the sign)?
If you have a buffer

wchar_t localname[32];

and you want to "convert"

t_wchar_t name[32];

to it, just use std::copy

std::copy(name, name + 32, localname);

Every element will be assigned.
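
A minimal sketch of the copy described above, under assumptions that are not in the original post: the element type is taken to be unsigned short rather than short, so that code units at or above 0x8000 are zero-extended instead of sign-extended, and the helper name widen_name is hypothetical. A plain element-wise copy like this is only correct for characters in the Basic Multilingual Plane; it does not combine surrogate pairs.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: unsigned 16-bit code units (the thread's macro used plain `short`).
typedef unsigned short t_wchar_t;

// Hypothetical helper: widens up to `count` code units into a std::wstring,
// stopping at the first terminator if there is one.
std::wstring widen_name(const t_wchar_t* name, std::size_t count)
{
    assert(name != nullptr || count == 0);
    const t_wchar_t* end = std::find(name, name + count, t_wchar_t(0));
    std::wstring localname(static_cast<std::size_t>(end - name), L'\0');
    // Each element is converted from unsigned short to wchar_t on assignment.
    std::copy(name, end, localname.begin());
    return localname;
}
```

Returning a std::wstring rather than filling a raw wchar_t[32] is a design choice of this sketch; it avoids relying on the 32-element buffer being terminated.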

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask
Aug 14 '08 #2
ya*****@gmail.com wrote:
> My question is: is there a simple way to convert a 2-byte wchar_t (Windows
> version) to a 4-byte wchar_t (Linux version)?

I suggest using something like libiconv (http://en.wikipedia.org/wiki/Iconv)
to convert to a common character set on both sides.
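
As a rough illustration of the iconv route, here is a sketch that assumes a POSIX iconv() with the glibc-style signature is available, that the client sends UTF-16 in little-endian byte order, and that UTF-8 is an acceptable target for log output. None of these assumptions come from the thread, and the helper name utf16le_to_utf8 is hypothetical.

```cpp
#include <iconv.h>

#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical helper: converts `count` UTF-16 code units to a UTF-8 string.
std::string utf16le_to_utf8(const unsigned short* src, std::size_t count)
{
    assert(src != nullptr || count == 0);

    // iconv works on bytes, so serialize the code units as little-endian
    // explicitly; this keeps the result independent of host byte order.
    std::string in;
    for (std::size_t i = 0; i < count; ++i) {
        in.push_back(static_cast<char>(src[i] & 0xFF));
        in.push_back(static_cast<char>((src[i] >> 8) & 0xFF));
    }

    iconv_t cd = iconv_open("UTF-8", "UTF-16LE");  // (to, from)
    if (cd == (iconv_t)-1)
        throw std::runtime_error("iconv_open failed");

    // One UTF-16 code unit never needs more than 3 bytes of UTF-8.
    std::string out(count * 3 + 1, '\0');
    char* inp = &in[0];
    std::size_t inleft = in.size();
    char* outp = &out[0];
    std::size_t outleft = out.size();

    const std::size_t rc = iconv(cd, &inp, &inleft, &outp, &outleft);
    iconv_close(cd);
    if (rc == (std::size_t)-1)
        throw std::runtime_error("iconv failed (invalid or truncated input?)");

    out.resize(out.size() - outleft);
    return out;
}
```

On platforms where iconv lives in a separate library (libiconv), the program would also need to link against it.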

Aug 14 '08 #3
On Aug 14, 5:30 pm, Victor Bazarov <v.Abaza...@comAcast.net> wrote:
> yaki...@gmail.com wrote:
> > Hello experts,
> > I am currently porting our server from Windows to Linux. Our
> > client runs only on Windows machines.
> > To avoid the wchar_t size problem (on Windows it is 2 bytes and on
> > Linux it is 4 bytes) we defined
> > #ifdef WIN32
> > #define t_wchar_t wchar_t
> > #else // LINUX
> > #define t_wchar_t short
> > #endif
>
> You might be better off with a typedef, although it's not a
> very significant difference.

It would be if the second were unsigned short. Something like
"t_wchar_t( something )" would be legal if it were a typedef,
not if it were a #define.
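
To illustrate the point about the function-style cast, a small sketch; the helper name to_code_unit is hypothetical and not from the thread, and it assumes a 16-bit unsigned short:

```cpp
#include <cassert>

// With a typedef, the two-word type gets a single-word name.
typedef unsigned short t_wchar_t;

// With the macro below instead, t_wchar_t(value) would expand to
// `unsigned short(value)`, which is not a valid function-style cast:
// #define t_wchar_t unsigned short

// Hypothetical helper: narrows an int to one 16-bit code unit.
t_wchar_t to_code_unit(int value)
{
    assert(value >= 0 && value <= 0xFFFF);
    return t_wchar_t(value);  // legal only because t_wchar_t is a single type name
}
```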

> Also, for some reason I seem to remember that wchar_t is an
> unsigned type. Since 'char' is often signed (though different
> from 'signed char', of course), perhaps I remember
> incorrectly...

Both are very much implementation defined. In practice, you
generally shouldn't be using wchar_t in portable code :-(.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Aug 14 '08 #4
> My question is: is there a simple way to convert a 2-byte wchar_t (Windows
> version) to a 4-byte wchar_t (Linux version)?

wchar_t is a particularly useless type: because it's implementation defined, it doesn't give you (in portable code) any kind of assurance about what character encoding it is using or is capable of using.

The next point is that *Unicode* characters are unsigned, so use an unsigned short for your UCS-2 / UTF-16 representation. http://en.wikipedia.org/wiki/UTF-16 has loads more information.

Finally, converting plain UCS-2 to UTF-32 is simple: pad out the data by doing a direct character-wise copy:

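typedef unsigned short ucs2char;   // 16-bit code unit
typedef unsigned int utf32char;    // 32 bits on Windows and Linux
                                   // (unsigned long is 64 bits on 64-bit Linux)

// Copies a null-terminated UCS-2 string to dest, widening each code unit.
// dest must have room for the same number of elements, terminator included.
void convert_ucs2_2_utf32(ucs2char const* src, utf32char* dest)
{
    do {
        *dest++ = *src;
    } while (*src++);
}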

If you want to properly convert characters outside the Basic Multilingual Plane (and the BMP covers almost all characters in everyday use in modern languages, European and Eastern), then you need to be aware of surrogate pairs. Unicode code points in the range U+D800-U+DFFF are not assigned to valid characters; UTF-16 uses this range to encode a code point above U+FFFF as a pair of 16-bit code units, each of which carries 10 bits of the final code point.

So, something like this will do the translation of UTF-16 to UTF-32:

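typedef unsigned short utf16char;

// Converts a null-terminated UTF-16 string to UTF-32, combining surrogate pairs.
// An unpaired surrogate is copied through unchanged.
void convert_utf16_to_utf32(utf16char const* src, utf32char* dest)
{
    utf32char c;
    do {
        c = *src++;
        // High surrogate (D800-DBFF) followed by low surrogate (DC00-DFFF)?
        if ((c & 0xFC00) == 0xD800 && (*src & 0xFC00) == 0xDC00) {
            c = ((c & 0x03FF) << 10) + (*src++ & 0x03FF) + 0x10000;
        }
        *dest++ = c;
    } while (c);
}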
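
For the fixed-size name[32] buffer from the original question, which may not be terminated, a bounded variant may be safer. The following is only a sketch under assumptions not found in the thread: it uses the C++11 types std::uint16_t, char32_t and std::u32string, the helper name utf16_buffer_to_utf32 is hypothetical, and replacing an unpaired surrogate with U+FFFD is a choice made here for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: decodes at most `max_units` UTF-16 code units
// (stopping early at a terminator) into UTF-32.
std::u32string utf16_buffer_to_utf32(const std::uint16_t* src, std::size_t max_units)
{
    assert(src != nullptr || max_units == 0);
    std::u32string out;
    for (std::size_t i = 0; i < max_units && src[i] != 0; ++i) {
        char32_t c = src[i];
        if ((c & 0xFC00) == 0xD800 && i + 1 < max_units
            && (src[i + 1] & 0xFC00) == 0xDC00) {
            // Valid pair: 10 bits from each half, offset by 0x10000.
            c = ((c & 0x03FF) << 10) + (src[i + 1] & 0x03FF) + 0x10000;
            ++i;  // the low surrogate has been consumed
        } else if ((c & 0xF800) == 0xD800) {
            c = 0xFFFD;  // unpaired surrogate: substitute U+FFFD
        }
        out.push_back(c);
    }
    return out;
}
```

As a worked case, the pair D83D DE00 decodes to (0x03D << 10) + 0x200 + 0x10000 = 0x1F600.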

Aug 15 '08 #5
On Aug 15, 9:23 am, "Chris Becke" <chris.be...@gmail.com> wrote:
> > My question is: is there a simple way to convert a 2-byte wchar_t
> > (Windows version) to a 4-byte wchar_t (Linux version)?
>
> wchar_t is a particularly useless type: because it's
> implementation defined, it doesn't give you (in portable code) any
> kind of assurance about what character encoding it is
> using or is capable of using.

That's partially true of char as well; in addition, the
character encoding can depend on the source of the data. But at
least, char is guaranteed to be at least 8 bits, so you know
that it can hold all useful external encodings. (For better or
for worse, the external world is 8 bits, and any attempt to do
otherwise is bound to fail in the long run.)

> The next point is that *Unicode* characters are unsigned.

I'm not sure what that's supposed to mean. ALL character
encodings I've ever seen use only non-negative values: ASCII
doesn't define any negative encodings, nor do any of the ISO
8859 encodings. The fact that char can be (and often is) a
signed 8 bit value causes no end of problems because of this.
The character value isn't really signed or unsigned: it's just a
value (that happens never to be negative).

What is true is that the Unicode encoding formats UTF-16 and
UTF-8 require values in the range of 0-0xFFFF and 0-0xFF,
respectively, and that if your short is 16 bits or your char 8
(both relatively frequent cases), those values won't fit in the
corresponding signed types. (For historical reasons, we still
manage to make do putting UTF-8, and other 8 bit encodings, in
an 8 bit signed char. It's a hack, and it's not, at least in
theory, guaranteed to work, but in practice, it's often the
least bad choice available.)
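
A small sketch of what that means in practice when UTF-8 is kept in plain char: convert each byte to unsigned char before looking at its bits. The helper name utf8_length is hypothetical, and the sketch assumes the input is already valid UTF-8.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: counts the code points in `len` bytes of UTF-8
// held in plain char.
std::size_t utf8_length(const char* s, std::size_t len)
{
    assert(s != nullptr || len == 0);
    std::size_t count = 0;
    for (std::size_t i = 0; i < len; ++i) {
        // Where char is signed, bytes >= 0x80 read back as negative values,
        // so convert to unsigned char before inspecting the bits.
        const unsigned char byte = static_cast<unsigned char>(s[i]);
        if ((byte & 0xC0) != 0x80)  // skip continuation bytes (10xxxxxx)
            ++count;
    }
    return count;
}
```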

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Aug 15 '08 #6
