The need of Unicode types in C++0x

Ioannis Vranos

Hi, I am currently learning QT, a portable C++ framework which comes
with both a commercial and GPL license, and which provides conversion
operations to its various types to/from standard C++ types.

For example its QString type provides a toWString() that returns a
std::wstring with its Unicode contents.

So, since wstring supports the largest character set, why do we need
explicit Unicode types in C++?

I think what is needed is a "unicode" locale or at the most, some
unicode locales.
I don't consider being compatible with C99 as an excuse.

Oct 1 '08 #1


Ioannis Vranos

Correction:
Ioannis Vranos wrote:

Hi, I am currently learning QT, a portable C++ framework which comes
with both a commercial and GPL license, and which provides conversion
operations to its various types to/from standard C++ types.

==> For example its QString type provides a toStdWString() that returns a
std::wstring with its Unicode contents.

So, since wstring supports the largest character set, why do we need
explicit Unicode types in C++?

I think what is needed is a "unicode" locale or at the most, some
unicode locales.
I don't consider being compatible with C99 as an excuse.

Oct 1 '08 #2

REH

On Oct 1, 5:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
wrote:

> Hi, I am currently learning QT, a portable C++ framework which comes
> with both a commercial and GPL license, and which provides conversion
> operations to its various types to/from standard C++ types.
>
> For example its QString type provides a toWString() that returns a
> std::wstring with its Unicode contents.
>
> So, since wstring supports the largest character set, why do we need
> explicit Unicode types in C++?
>
> I think what is needed is a "unicode" locale or at the most, some
> unicode locales.
>
> I don't consider being compatible with C99 as an excuse.

If I understand what you are asking...

wstring in the standard defines neither the character set nor the
encoding. Given that Unicode is currently a 21-bit standard, how can
wstring support the largest character set on a system where wchar_t is
16 bits (assuming a one-character-per-element encoding)? You could
only support the BMP (which is exactly what most systems and languages
that "claim" Unicode support are really capable of).
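A minimal sketch of the size argument, assuming a C++11 compiler. The helper names `bits_needed` and `wchar_t_holds_any_code_point` are made up for illustration and do not come from any library; the only facts used are that the highest Unicode code point is U+10FFFF and that the BMP ends at U+FFFF.

```cpp
#include <limits>

// Hypothetical helper: number of bits needed to represent the value v.
constexpr unsigned bits_needed(unsigned long v)
{
    return v <= 1 ? 1 : 1 + bits_needed(v >> 1);
}

// The highest Unicode code point is U+10FFFF; the BMP ends at U+FFFF.
constexpr unsigned long max_code_point = 0x10FFFFul;
constexpr unsigned long max_bmp_code_point = 0xFFFFul;

// Hypothetical helper: true if a single wchar_t can hold every code point
// on this implementation. The standard does not promise either answer.
constexpr bool wchar_t_holds_any_code_point()
{
    return static_cast<unsigned long>(std::numeric_limits<wchar_t>::max())
           >= max_code_point;
}

static_assert(bits_needed(max_code_point) == 21, "Unicode needs 21 bits");
static_assert(bits_needed(max_bmp_code_point) == 16, "the BMP fits in 16 bits");
```

Where wchar_t is 16 bits, `wchar_t_holds_any_code_point()` is false, which is the case described above; where it is 32 bits, it is true.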

REH

Oct 1 '08 #3

Ioannis Vranos

REH wrote:

> On Oct 1, 5:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
> wrote:
>> Hi, I am currently learning QT, a portable C++ framework which comes
>> with both a commercial and GPL license, and which provides conversion
>> operations to its various types to/from standard C++ types.
>>
>> For example its QString type provides a toWString() that returns a
>> std::wstring with its Unicode contents.
>>
>> So, since wstring supports the largest character set, why do we need
>> explicit Unicode types in C++?
>>
>> I think what is needed is a "unicode" locale or at the most, some
>> unicode locales.
>>
>> I don't consider being compatible with C99 as an excuse.
>
> If I understand what you are asking...
>
> wstring in the standard defines neither the character set nor the
> encoding. Given that Unicode is currently a 21-bit standard, how can
> wstring support the largest character set on a system where wchar_t is
> 16 bits (assuming a one-character-per-element encoding)? You could
> only support the BMP (which is exactly what most systems and languages
> that "claim" Unicode support are really capable of).

I do not know much about encodings, only what I have needed so far, but
the question does not sound reasonable to me.

If that system supports Unicode as a system-specific type, why can't
wchar_t be made as wide as that system-specific Unicode type, in
that system?

Oct 1 '08 #4

Erik Wikström

On 2008-10-01 18:57, Ioannis Vranos wrote:

> REH wrote:
>> On Oct 1, 5:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
>> wrote:
>>> Hi, I am currently learning QT, a portable C++ framework which comes
>>> with both a commercial and GPL license, and which provides conversion
>>> operations to its various types to/from standard C++ types.
>>>
>>> For example its QString type provides a toWString() that returns a
>>> std::wstring with its Unicode contents.
>>>
>>> So, since wstring supports the largest character set, why do we need
>>> explicit Unicode types in C++?
>>>
>>> I think what is needed is a "unicode" locale or at the most, some
>>> unicode locales.
>>>
>>> I don't consider being compatible with C99 as an excuse.
>>
>> If I understand what you are asking...
>>
>> wstring in the standard defines neither the character set nor the
>> encoding. Given that Unicode is currently a 21-bit standard, how can
>> wstring support the largest character set on a system where wchar_t is
>> 16 bits (assuming a one-character-per-element encoding)? You could
>> only support the BMP (which is exactly what most systems and languages
>> that "claim" Unicode support are really capable of).
>
> I do not know much about encodings, only what I have needed so far, but
> the question does not sound reasonable to me.
>
> If that system supports Unicode as a system-specific type, why can't
> wchar_t be made as wide as that system-specific Unicode type, in
> that system?

Because it has been too narrow for 5 to 10 years and the compiler vendor
does not want to take any chances with backward compatibility, and since
we will get Unicode types it is a good idea to use wchar_t for encodings
not the same size as the Unicode types.

--
Erik Wikström

Oct 1 '08 #5

Pete Becker

On 2008-10-01 12:57:27 -0400, Ioannis Vranos
<iv*****@no.spam.nospamfreemail.gr> said:

> If that system supports Unicode as a system-specific type, why can't
> wchar_t be made as wide as that system-specific Unicode type, in
> that system?

It can be. But the language definition doesn't require it to be, and
with many implementations it's not. So if you want to traffic in
Unicode you have basically three options: ensure that your character
type can handle 21 bits, drop down to a subset of Unicode (as REH
mentioned, the BMP fits in 16-bit code points), or use a variable-width
encoding like UTF-8 or UTF-16.

Or you can wait for C++0x, which will provide char16_t and char32_t.
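As a sketch of what those two types look like in use, assuming a compiler that implements the final C++11 spelling (`char16_t`, `char32_t`, the `u""` and `U""` literal prefixes, and `std::u16string` / `std::u32string`). It also assumes the implementation encodes `u""` literals as UTF-16 and `U""` literals as UTF-32, which compilers do in practice. The function names are made up for illustration.

```cpp
#include <cstddef>
#include <string>

// U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.

// Code units needed for the clef in a char16_t string:
// a surrogate pair when the literal is encoded as UTF-16.
std::size_t clef_units_utf16()
{
    const std::u16string s = u"\U0001D11E";
    return s.size();
}

// Code units needed for the clef in a char32_t string:
// a single element when the literal is encoded as UTF-32.
std::size_t clef_units_utf32()
{
    const std::u32string s = U"\U0001D11E";
    return s.size();
}
```

Unlike wchar_t, the element width here does not vary from one implementation to the next, so the two counts are the same everywhere the encoding assumption holds.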

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Oct 1 '08 #6

James Kanze

On Oct 1, 11:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
wrote:

> Hi, I am currently learning QT, a portable C++ framework which
> comes with both a commercial and GPL license, and which
> provides conversion operations to its various types to/from
> standard C++ types.
>
> For example its QString type provides a toWString() that
> returns a std::wstring with its Unicode contents.

In what encoding format? And what if the "usual" encoding for
wstring isn't Unicode (the case on many Unix platforms)?

> So, since wstring supports the largest character set, why do
> we need explicit Unicode types in C++?

Because wstring doesn't guarantee Unicode, and implementers
can't change what it does guarantee in their particular
implementation.

> I think what is needed is a "unicode" locale or at the most,
> some unicode locales.

Well, to begin with, there are only two sizes of character
types; the various Unicode encoding forms come in three sizes,
so you already have a size mismatch. And since wchar_t already
has a meaning, we can't just arbitrarily change it.
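To illustrate the three sizes, here is a rough sketch that counts how many code units one code point occupies in UTF-8, UTF-16 and UTF-32. The struct `UnitCounts` and the function `code_units_for` are hypothetical names introduced only for this example; `<cstdint>` from C++11 is assumed.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: code units needed in each encoding form.
struct UnitCounts {
    int utf8;   // 8-bit code units
    int utf16;  // 16-bit code units
    int utf32;  // 32-bit code units
};

// Hypothetical helper: count the code units for one code point.
UnitCounts code_units_for(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);              // outside the Unicode range
    assert(cp < 0xD800 || cp > 0xDFFF);  // surrogates are not characters

    UnitCounts n;
    n.utf8  = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
    n.utf16 = cp < 0x10000 ? 1 : 2;
    n.utf32 = 1;
    return n;
}
```

For example, U+20AC (the euro sign) takes 3, 1 and 1 units, while U+1D11E takes 4, 2 and 1. Two character types (char and wchar_t) cannot give each of the three unit sizes a type of its own.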

> I don't consider being compatible with C99 as an excuse.

How about being compatible with C++03?

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 2 '08 #7

James Kanze

On Oct 1, 6:28 pm, REH <spamj...@stny.rr.com> wrote:

[...]

> wstring in the standard defines neither the character set nor the
> encoding. Given that Unicode is currently a 21-bit standard, how can
> wstring support the largest character set on a system where wchar_t is
> 16 bits (assuming a one-character-per-element encoding)? You could
> only support the BMP (which is exactly what most systems and languages
> that "claim" Unicode support are really capable of).

No. Most systems that claim Unicode support on 16 bits use
UTF-16. Granted, it's a multi-element encoding, but if you're
doing anything serious, effectively, so is UTF-32. (In
practice, I find that UTF-8 works fine for a lot of things.)
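A minimal sketch of both halves of that point, assuming C++11 `<cstdint>`. The function `to_utf16` is a hypothetical helper written for this example, not a library call. It shows a code point above the BMP becoming two 16-bit elements, and the array shows a single visible character ("é" written as a base letter plus a combining accent) that takes two elements even in UTF-32.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper: encode one code point as UTF-16 code units.
std::vector<std::uint16_t> to_utf16(std::uint32_t cp)
{
    assert(cp <= 0x10FFFF);              // outside the Unicode range
    assert(cp < 0xD800 || cp > 0xDFFF);  // surrogates cannot be encoded

    std::vector<std::uint16_t> out;
    if (cp < 0x10000) {
        // BMP: one code unit, same value as the code point.
        out.push_back(static_cast<std::uint16_t>(cp));
    } else {
        // Above the BMP: split the 20 remaining bits into a surrogate pair.
        cp -= 0x10000;
        out.push_back(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF)));
    }
    return out;
}

// "é" as U+0065 (e) followed by U+0301 (combining acute accent):
// two UTF-32 elements, one character to the reader.
const std::uint32_t e_acute_decomposed[] = { 0x0065, 0x0301 };
```

Code that treats one array element as one character is therefore wrong for UTF-16 above the BMP, and wrong for UTF-32 as soon as combining sequences appear.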

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 2 '08 #8

Hendrik Schober

James Kanze wrote:

> On Oct 1, 11:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
> wrote:
>> Hi, I am currently learning QT, a portable C++ framework which
>> comes with both a commercial and GPL license, and which
>> provides conversion operations to its various types to/from
>> standard C++ types.
>>
>> For example its QString type provides a toWString() that
>> returns a std::wstring with its Unicode contents.
>
> In what encoding format? And what if the "usual" encoding for
> wstring isn't Unicode (the case on many Unix platforms)?

<curious>
What are those implementations using for 'wchar_t'?
</curious>

Schobi

Oct 2 '08 #9

Ioannis Vranos

Erik Wikström wrote:

> Because it has been too narrow for 5 to 10 years and the compiler vendor
> does not want to take any chances with backward compatibility,

How will it break backward compatibility if the size of wchar_t changes?

> and since
> we will get Unicode types it is a good idea to use wchar_t for encodings
> not the same size as the Unicode types.

I am talking about not needing those Unicode types since we have wchar_t
and locales.

Oct 2 '08 #10
