Unicode Initialization.

Me
I am trying to compile some code I've gotten from someone else. I know I
need a 16-bit Unicode string, because he passes the pointer to functions
that take a (uint16 *). However, there are initializations that look like
this:

typedef unsigned short int ucs2_char;

....
....
....

static const ucs2_char form_feed[] = L"\f";

The above line gives me this compiler error in gcc: 'invalid initializer'

When I change it to the following, everything works fine.

static const ucs2_char *form_feed = L"\f";

What is up with this error?


--
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/
Nov 14 '05 #1
"Me" <bo***@bogus.com> wrote in message
news:op**************@danocdhcp011136.americas.nokia.com...
I am trying to compile some code Ive gotten from another and
I know I need a 16 bit unicode string, for he passes the pointer to
functions
that take a (uint16 *), however there are initializations that look like
this.

typedef unsigned short int ucs2_char;

The correct type for UCS2 characters is wchar_t. Fix the code to use the
correct type.

static const ucs2_char form_feed[] = L"\f";

The above like in gcc give me the compiler error: 'invalid initializer'

When I change it to the following, everything works fine.

static const ucs2_char *form_feed = L"\f";

What is up with this error?


What's up is you're using the wrong type; L"\f" is a wide string literal
(an array of wchar_t), not an array of unsigned short ints. The latter
should give you a warning as well, since you're doing an implicit conversion
between wchar_t[] and unsigned short*, but your compiler may not be smart
enough to catch that.

typedef unsigned short int ucs2_char;
static const ucs2_char *form_feed = L"\f";
foo.c:2: warning: initialization from incompatible pointer type

typedef unsigned short int ucs2_char;
static const ucs2_char form_feed[] = L"\f";
foo.c:2: invalid initializer

#include <wchar.h>
static const wchar_t *form_feed = L"\f";
static const wchar_t form_feed[] = L"\f";
[ no compile warnings or errors ]
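
To illustrate the rule behind those three results: an array can only be initialized from a string literal whose element type matches the array's element type, while the pointer form is just a conversion between incompatible pointer types (hence a warning rather than an error). Below is a minimal sketch, assuming the `ucs2_char` typedef from the question. The names `wide_form_feed` and `ucs2_len` are hypothetical, and the brace initializer is only one possible way to get a 16-bit string without relying on `L""`.

```c
#include <assert.h>   /* static_assert */
#include <stddef.h>
#include <wchar.h>

typedef unsigned short int ucs2_char;

/* Matching element type: a wchar_t array may be initialized from L"...". */
static const wchar_t wide_form_feed[] = L"\f";

/* unsigned short is not, in general, the same type as wchar_t, so the
   elements are spelled out instead. A brace initializer works no matter
   how wide wchar_t is. 0x000C is the form feed code point. */
static const ucs2_char form_feed[] = { 0x000C, 0x0000 };

/* One character plus the terminating zero. */
static_assert(sizeof form_feed == 2 * sizeof(ucs2_char),
              "form_feed should hold exactly two 16-bit units");

/* Hypothetical helper: count the units before the terminating zero. */
static size_t ucs2_len(const ucs2_char *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

Here `ucs2_len(form_feed)` and `wcslen(wide_form_feed)` both report one character, but only `form_feed` is guaranteed to use `unsigned short` elements.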

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin

Nov 14 '05 #2
in comp.lang.c i read:
"Me" <bo***@bogus.com> wrote in message
news:op**************@danocdhcp011136.americas.nokia.com...

I am trying to compile some code Ive gotten from another and I know I
need a 16 bit unicode string, for he passes the pointer to functions
that take a (uint16 *), however there are initializations that look like
this.

typedef unsigned short int ucs2_char;


The correct type for UCS2 characters is wchar_t.


wchar_t is something -- perhaps ucs-2 or utf-16, or something else entirely.
i agree wchar_t should be used, but if each character must be a ucs-2 code-
point then wchar_t is not appropriate, and neither should L"" be used for a
literal string.
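
A quick way to see what wchar_t is on a given implementation is to ask the compiler. The following is a rough sketch, not a portability test suite; the function names `wchar_bits` and `wchar_is_iso10646` are made up for illustration. `__STDC_ISO_10646__` is the standard C99 macro an implementation defines when wchar_t values are ISO 10646 (Unicode) code points.

```c
#include <assert.h>   /* static_assert */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>
#include <stdint.h>   /* WCHAR_MAX */
#include <wchar.h>

/* The C standard promises very little about the range of wchar_t;
   WCHAR_MAX >= 127 is the minimum every implementation must meet. */
static_assert(WCHAR_MAX >= 127, "minimum range required by the C standard");

/* Width of wchar_t in bits on this implementation. */
static size_t wchar_bits(void)
{
    return sizeof(wchar_t) * CHAR_BIT;
}

/* Returns 1 if the implementation states that wchar_t holds
   ISO 10646 code points, 0 if it makes no such promise. */
static int wchar_is_iso10646(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}
```

If `wchar_bits()` does not return 16, wide string literals cannot be used directly where 16-bit units are required.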

--
a signature
Nov 14 '05 #3
"those who know me have no need of my name" <no****************@usa.net>
wrote in message news:m1*************@usa.net...
in comp.lang.c i read:
"Me" <bo***@bogus.com> wrote in message
news:op**************@danocdhcp011136.americas.nokia.com...
I am trying to compile some code Ive gotten from another and I know I
need a 16 bit unicode string, for he passes the pointer to functions
that take a (uint16 *), however there are initializations that look like this.

typedef unsigned short int ucs2_char;
The correct type for UCS2 characters is wchar_t.


wchar_t is something -- perhaps ucs-2 or utf-16, or something else entirely.
i agree wchar_t should be used, but if each character must be a ucs-2 code
point then wchar_t is not appropriate, and neither should L"" be used for a
literal string.


Good point; wchar_t is UCS-2 on every platform I've used so I didn't
consider it might differ on another platform. Either way, I think it's what
the original author (and our poster) intended to use, and it's the simplest
and most portable solution for dealing with Unicode.

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin

Nov 14 '05 #4
Stephen Sprunk wrote:
"those who know me have no need of my name" <no****************@usa.net>
wrote in message news:m1*************@usa.net...
in comp.lang.c i read:
"Me" <bo***@bogus.com> wrote in message
news:op**************@danocdhcp011136.americas.nokia.com...

I am trying to compile some code Ive gotten from another and I know I
need a 16 bit unicode string, for he passes the pointer to functions
that take a (uint16 *), however there are initializations that look like
this.

typedef unsigned short int ucs2_char;

The correct type for UCS2 characters is wchar_t.

wchar_t is something -- perhaps ucs-2 or utf-16, or something else entirely.
i agree wchar_t should be used, but if each character must be a ucs-2 code
point then wchar_t is not appropriate, and neither should L"" be used for a
literal string.

Good point; wchar_t is UCS-2 on every platform I've used so I didn't
consider it might differ on another platform.


FWIW, I believe that wchar_t can refer to one of the IBM
double-byte-character-set (DBCS) EBCDICs when used in IBM's C compiler
on the mainframe.
--

Lew Pitcher, IT Consultant, Enterprise Application Architecture
Enterprise Technology Solutions, TD Bank Financial Group

(Opinions expressed here are my own, not my employer's)
Nov 14 '05 #5
Me

Thanks for your input, but here is the problem.
First, I work for a big telecom company (you are probably using their
phone right now).
In my project I am porting the phone code to run on Linux so developers
can debug it.
The CDMA specification uses two-byte Unicode characters, and much of the
code uses the L"" initializer.

They create a type called ucs2_char that is an unsigned short.

At first I made ucs2_char a wchar_t, but I found out that on Linux
wchar_t is 4 bytes in size (4-byte Unicode, UTF-32).

What do I do? Any suggestions?

Also, is there a type on Linux for 2-byte Unicode (UTF-16)?

And is the L"" initializer on Linux only for 4-byte Unicode, or can I
configure this in gcc or Linux?
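
One possible way to keep the L"" literals and still hand 16-bit units to the phone code is to convert at run time. The sketch below assumes wchar_t holds Unicode code points (as with the 4-byte wchar_t described above); `wcs_to_ucs2` is a hypothetical helper, not something from the original code base. It refuses any character that does not fit in one 16-bit unit.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

typedef unsigned short int ucs2_char;

/* Hypothetical helper: copy a wide string into a buffer of 16-bit units.
   cap is the buffer size in units, including room for the terminator.
   Returns the number of units written (not counting the terminator),
   or (size_t)-1 if the buffer is too small or a character needs more
   than 16 bits. */
static size_t wcs_to_ucs2(ucs2_char *dst, size_t cap, const wchar_t *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (cap == 0)
        return (size_t)-1;

    for (i = 0; src[i] != L'\0'; i++) {
        /* A negative wchar_t becomes a huge value here and is rejected. */
        unsigned long c = (unsigned long)src[i];

        if (i + 1 >= cap || c > 0xFFFFUL)
            return (size_t)-1;
        dst[i] = (ucs2_char)c;
    }
    dst[i] = 0;   /* terminating zero unit */
    return i;
}
```

A call such as `wcs_to_ucs2(buf, 8, L"\f")` fills `buf` with the form feed and a terminator. The cost is a copy per string, so this suits a debugging port better than a hot path.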

On Tue, 08 Jun 2004 00:01:59 GMT, Stephen Sprunk <st*****@sprunk.org>
wrote:
"Me" <bo***@bogus.com> wrote in message
news:op**************@danocdhcp011136.americas.nokia.com...
I am trying to compile some code Ive gotten from another and
I know I need a 16 bit unicode string, for he passes the pointer to
functions
that take a (uint16 *), however there are initializations that look like
this.

typedef unsigned short int ucs2_char;


The correct type for UCS2 characters is wchar_t. Fix the code to use the
correct type.
static const ucs2_char form_feed[] = L"\f";

The above like in gcc give me the compiler error: 'invalid initializer'

When I change it to the following, everything works fine.

static const ucs2_char *form_feed = L"\f";

What is up with this error?


What's up is you're using the wrong type; L"\f" is a wide character
literal,
not an array of unsigned short ints. The latter should give you a
warning
as well, since you're doing an implicit conversion between wchar_t[] and
unsigned short*, but your compiler may not be smart enough to catch that.
typedef unsigned short int ucs2_char;
static const ucs2_char *form_feed = L"\f";
foo.c:2: warning: initialization from incompatible pointer type

typedef unsigned short int ucs2_char;
static const ucs2_char form_feed[] = L"\f";
foo.c:2: invalid initializer

#include <wchar.h>
static const wchar_t *form_feed = L"\f";
static const wchar_t form_feed[] = L"\f";
[ no compile warnings or errors ]

S


--
Using Opera's revolutionary e-mail client: http://www.opera.com/m2/
Nov 14 '05 #6
"Me" <bo***@bogus.com> wrote in message
news:op**************@danocdhcp011136.americas.nokia.com...
I at first made the ucs2_char to be wchar_t but I found out that in linux
wchar_t is 4 bytes in size (4 byte unicode UTF-32).

What do I do....?.... any suggestions?

Also, is there a type in linux for a 2 byte unicode (UTF-16)?

And....is the L"" initializer, in Linux, only for 4 byte unicode or can I
configure this in gcc or linux?


-fshort-wchar will give you a 2-byte wchar_t (UTF-16, not UCS-2) with gcc
2.97 and later. I haven't tested whether this makes wide string literals
compatible with unsigned short *, but it seems likely.
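
Note that the GCC manual warns that code built with -fshort-wchar is not binary compatible with code built without it, so the wide-character functions of a C library built for 4-byte wchar_t should not be mixed in. For compilers that implement C11 (which postdates this thread), another option is the `u""` literal, whose elements have type char16_t. The sketch below assumes such a compiler and redefines `ucs2_char` as `uint_least16_t`, which C11 defines as the same type as char16_t; on typical Linux targets that is unsigned short. `ucs2_len` is a hypothetical helper.

```c
#include <assert.h>   /* static_assert */
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>
#include <stdint.h>   /* uint_least16_t */

/* char16_t (declared in <uchar.h>) is the same type as uint_least16_t,
   so an array of this type can be initialized from a u"..." literal. */
typedef uint_least16_t ucs2_char;

static_assert(sizeof(ucs2_char) * CHAR_BIT >= 16,
              "need at least 16 bits per unit");

/* 16-bit literal, independent of the width of wchar_t and of
   -fshort-wchar. */
static const ucs2_char form_feed[] = u"\f";

/* Hypothetical helper: count the units before the terminating zero. */
static size_t ucs2_len(const ucs2_char *s)
{
    size_t n = 0;
    while (s[n] != 0)
        n++;
    return n;
}
```

This requires changing every `L""` in the ported code to `u""`, which may or may not be acceptable for a shared code base.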

Any further questions on gcc should be directed to gnu.gcc.help, but this
should get you started:
http://gcc.gnu.org/onlinedocs/gcc-3....0Gen%20Options

S

--
Stephen Sprunk "Stupid people surround themselves with smart
CCIE #3723 people. Smart people surround themselves with
K5SSS smart people who disagree with them." --Aaron Sorkin

Nov 14 '05 #7
kal
Lew Pitcher <Le*********@td.com> wrote in message news:<uw********************@news20.bellglobal.com >...
FWIW, I believe that wchar_t can refer to one of the IBM
double-byte-character-set (DBCS) EBCDICs when used in IBM's C compiler
on the mainframe.


Perhaps so, insofar as the size of characters (in bits) is concerned.
Even in this regard, what are called DBCS are sometimes actually MBCS
(MultiByte Character Sets).

EBCDIC descended from punched cards. It went from 6-bit BCD to 8-bit
extended BCD (EBCDIC). But ASCII descended from telegraph. It went
from 5-bit telegraph codes to 7-bit ASCII to 8-bit ASCII etc. These
two schemes implement entirely different code points.

Now, wchar_t almost always refers to UCS-2 or UTF-16. The differences
between UCS-2 and UTF-16 were worked out a few years ago, and as
far as code values are concerned they are both the same at present.
The first 128 characters of these are the same as 7-bit ASCII.
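
One caveat worth spelling out: a UCS-2 unit and a UTF-16 unit carry the same value only for code points up to U+FFFF. Above that, UTF-16 uses a pair of surrogate units, which UCS-2 cannot represent. A small sketch of the standard UTF-16 encoding rule follows; the function name `utf16_encode` is chosen here for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Encode one Unicode code point as UTF-16.
   Writes 1 or 2 units to out and returns how many were written,
   or 0 if cp is a surrogate value or lies beyond U+10FFFF. */
static size_t utf16_encode(uint_least32_t cp, uint_least16_t out[2])
{
    assert(out != NULL);

    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                      /* reserved for surrogates */
    if (cp > 0x10FFFF)
        return 0;                      /* outside the Unicode range */

    if (cp < 0x10000) {
        out[0] = (uint_least16_t)cp;   /* same value as in UCS-2 */
        return 1;
    }

    cp -= 0x10000;                     /* 20 bits remain */
    out[0] = (uint_least16_t)(0xD800 + (cp >> 10));    /* high surrogate */
    out[1] = (uint_least16_t)(0xDC00 + (cp & 0x3FF));  /* low surrogate */
    return 2;
}
```

Form feed (U+000C) encodes to the single unit 0x000C in both schemes, while U+1F600 needs the two units 0xD83D 0xDE00 and has no UCS-2 form.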
Nov 14 '05 #8
