The need of Unicode types in C++0x

Hi, I am currently learning QT, a portable C++ framework which comes
with both a commercial and GPL license, and which provides conversion
operations to its various types to/from standard C++ types.

For example its QString type provides a toWString() that returns a
std::wstring with its Unicode contents.

So, since wstring supports the largest character set, why do we need
explicit Unicode types in C++?

I think what is needed is a "unicode" locale or at the most, some
unicode locales.
I don't consider being compatible with C99 as an excuse.
Oct 1 '08 #1
Correction:
Ioannis Vranos wrote:
Hi, I am currently learning QT, a portable C++ framework which comes
with both a commercial and GPL license, and which provides conversion
operations to its various types to/from standard C++ types.
== For example its QString type provides a toStdWString() that returns a
std::wstring with its Unicode contents.

So, since wstring supports the largest character set, why do we need
explicit Unicode types in C++?

I think what is needed is a "unicode" locale or at the most, some
unicode locales.
I don't consider being compatible with C99 as an excuse.
Oct 1 '08 #2
REH
On Oct 1, 5:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
wrote:
> Hi, I am currently learning QT, a portable C++ framework which comes
> with both a commercial and GPL license, and which provides conversion
> operations to its various types to/from standard C++ types.
>
> For example its QString type provides a toWString() that returns a
> std::wstring with its Unicode contents.
>
> So, since wstring supports the largest character set, why do we need
> explicit Unicode types in C++?
>
> I think what is needed is a "unicode" locale or at the most, some
> unicode locales.
>
> I don't consider being compatible with C99 as an excuse.

If I understand what you are asking...

wstring in the standard defines neither the character set, nor the
encoding. Given that Unicode is currently a 21-bit standard, how can
wstring support the largest character set on a system where wchar_t is
16 bits (assuming a one-character-per-element encoding)? You could
only support the BMP (which is exactly what most systems and languages
that "claim" Unicode support are really capable of).
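
To make the 16-bit limit concrete, here is a minimal sketch of the UTF-16 arithmetic, assuming a 16-bit element type. The helper names (`in_bmp`, `utf16_units`, `high_surrogate`, `low_surrogate`) are made up for illustration and are not part of any standard library.

```cpp
#include <cstdint>

// Hypothetical helpers, for illustration only.

// True if the code point lies in the Basic Multilingual Plane (U+0000..U+FFFF).
constexpr bool in_bmp(std::uint32_t cp) { return cp <= 0xFFFF; }

// Number of 16-bit elements UTF-16 needs for one code point.
constexpr int utf16_units(std::uint32_t cp) { return in_bmp(cp) ? 1 : 2; }

// Lead (high) surrogate for a code point above the BMP.
constexpr std::uint16_t high_surrogate(std::uint32_t cp) {
    return static_cast<std::uint16_t>(0xD800 + ((cp - 0x10000) >> 10));
}

// Trail (low) surrogate for a code point above the BMP.
constexpr std::uint16_t low_surrogate(std::uint32_t cp) {
    return static_cast<std::uint16_t>(0xDC00 + ((cp - 0x10000) & 0x3FF));
}
```

For U+1D11E (the musical G clef), `utf16_units` gives 2 and the surrogate pair is 0xD834, 0xDD1E, so a one-character-per-element wstring with a 16-bit wchar_t cannot hold it in a single element.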

REH
Oct 1 '08 #3
REH wrote:
> If I understand what you are asking...
>
> wstring in the standard defines neither the character set, nor the
> encoding. Given that Unicode is currently a 21-bit standard, how can
> wstring support the largest character set on a system where wchar_t is
> 16 bits (assuming a one-character-per-element encoding)? You could
> only support the BMP (which is exactly what most systems and languages
> that "claim" Unicode support are really capable of).

I do not know much about encodings, only what I have needed so far, but
the question does not sound reasonable to me.

If that system supports Unicode as a system-specific type, why can't
wchar_t be made as wide as that system-specific Unicode type, on
that system?
Oct 1 '08 #4
On 2008-10-01 18:57, Ioannis Vranos wrote:
> I do not know much about encodings, only what I have needed so far, but
> the question does not sound reasonable to me.
>
> If that system supports Unicode as a system-specific type, why can't
> wchar_t be made as wide as that system-specific Unicode type, on
> that system?

Because it has been too narrow for 5 to 10 years and the compiler vendor
does not want to take any chances with backward compatibility. And since
we will get Unicode types, it is a good idea to use wchar_t for encodings
that are not the same size as the Unicode types.

--
Erik Wikström
Oct 1 '08 #5
On 2008-10-01 12:57:27 -0400, Ioannis Vranos
<iv*****@no.spam.nospamfreemail.gr> said:
>
> If that system supports Unicode as a system-specific type, why can't
> wchar_t be made wide enough as that system-specific Unicode type, in
> that system?

It can be. But the language definition doesn't require it to be, and
with many implementations it's not. So if you want to traffic in
Unicode you have basically three options: ensure that your character
type can handle 21 bits, drop down to a subset of Unicode (as REH
mentioned, the BMP fits in 16 bit code points), or use a variable-width
encoding like UTF-8 or UTF-16.

Or you can wait for C++0x, which will provide char16_t and char32_t.
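
As a rough illustration of what the new types and literal prefixes give you, here is a sketch that counts the elements each encoding form needs for one character outside the BMP. It assumes a compiler that implements the C++0x `u8`, `u` and `U` string literal prefixes; `code_units` is a hypothetical helper, not a library function.

```cpp
#include <cstddef>

// Hypothetical helper: number of elements in a string literal,
// not counting the terminating null element.
template <typename CharT, std::size_t N>
constexpr std::size_t code_units(const CharT (&)[N]) {
    return N - 1;
}

// U+1D11E (MUSICAL SYMBOL G CLEF) lies outside the BMP.
constexpr std::size_t clef_utf8  = code_units(u8"\U0001D11E"); // UTF-8 elements
constexpr std::size_t clef_utf16 = code_units(u"\U0001D11E");  // char16_t elements
constexpr std::size_t clef_utf32 = code_units(U"\U0001D11E");  // char32_t elements
```

The same character takes 4, 2 and 1 elements respectively, so only the char32_t form is one element per code point.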

--
Pete
Roundhouse Consulting, Ltd. (www.versatilecoding.com) Author of "The
Standard C++ Library Extensions: a Tutorial and Reference"
(www.petebecker.com/tr1book)

Oct 1 '08 #6
On Oct 1, 11:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
wrote:
> Hi, I am currently learning QT, a portable C++ framework which
> comes with both a commercial and GPL license, and which
> provides conversion operations to its various types to/from
> standard C++ types.
> For example its QString type provides a toWString() that
> returns a std::wstring with its Unicode contents.

In what encoding format? And what if the "usual" encoding for
wstring isn't Unicode (the case on many Unix platforms)?

> So, since wstring supports the largest character set, why do
> we need explicit Unicode types in C++?

Because wstring doesn't guarantee Unicode, and implementers
can't change what it does guarantee in their particular
implementation.

> I think what is needed is a "unicode" locale or at the most,
> some unicode locales.

Well, to begin with, there are only two sizes of character
types; the various Unicode encoding forms come in three sizes,
so you already have a size mismatch. And since wchar_t already
has a meaning, we can't just arbitrarily change it.

> I don't consider being compatible with C99 as an excuse.

How about being compatible with C++03?
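
What a given implementation actually promises for wchar_t can be inspected at compile time. The following is only a sketch: the constant names are invented here, and the `__STDC_ISO_10646__` check reflects the macro's documented meaning (wchar_t values are ISO 10646 code points) rather than anything specific to a particular platform.

```cpp
#include <climits>   // CHAR_BIT
#include <cstddef>   // std::size_t
#include <cwchar>    // WCHAR_MAX

// Width of wchar_t in bits on this implementation.
constexpr std::size_t wchar_bits = sizeof(wchar_t) * CHAR_BIT;

// True if a single wchar_t can hold every Unicode code point (up to U+10FFFF).
constexpr bool wchar_holds_any_code_point = WCHAR_MAX >= 0x10FFFF;

// __STDC_ISO_10646__ is defined by implementations whose wchar_t values
// are ISO 10646 code points; if it is absent, nothing is promised.
#ifdef __STDC_ISO_10646__
constexpr bool wchar_is_iso10646 = true;
#else
constexpr bool wchar_is_iso10646 = false;
#endif
```

Code that needs Unicode in a wstring would have to check both properties, and neither is guaranteed by the language.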

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Oct 2 '08 #7
On Oct 1, 6:28 pm, REH <spamj...@stny.rr.com> wrote:

[...]
> wstring in the standard defines neither the character set, nor the
> encoding. Given that Unicode is currently a 21-bit standard, how can
> wstring support the largest character set on a system where wchar_t is
> 16 bits (assuming a one-character-per-element encoding)? You could
> only support the BMP (which is exactly what most systems and languages
> that "claim" Unicode support are really capable of).

No. Most systems that claim Unicode support on 16 bits use
UTF-16. Granted, it's a multi-element encoding, but if you're
doing anything serious, effectively, so is UTF-32. (In
practice, I find that UTF-8 works fine for a lot of things.)
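
One way to see why UTF-32 is still "effectively" multi-element is combining sequences: a single visible character can be several code points. The sketch below is deliberately simplified. It only treats the Combining Diacritical Marks block (U+0300..U+036F) as combining, whereas real text segmentation follows much more involved Unicode rules, and all function names are hypothetical.

```cpp
#include <cstddef>

// Simplified test: only the Combining Diacritical Marks block is recognised.
constexpr bool is_combining_diacritic(char32_t cp) {
    return cp >= 0x0300 && cp <= 0x036F;
}

// Number of code points (char32_t elements) in a null-terminated string.
constexpr std::size_t code_points(const char32_t* s) {
    return *s == 0 ? 0 : 1 + code_points(s + 1);
}

// Number of code points that are not combining diacritics, a crude
// stand-in for "characters as the reader sees them".
constexpr std::size_t base_characters(const char32_t* s) {
    return *s == 0 ? 0
                   : (is_combining_diacritic(*s) ? 0 : 1) + base_characters(s + 1);
}
```

With these, `U"\u00E9"` (precomposed e-acute) is one code point, while `U"e\u0301"` (e followed by a combining acute accent) is two code points but still one base character, so indexing a char32_t string by element does not index by visible character.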

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Oct 2 '08 #8
James Kanze wrote:
> On Oct 1, 11:59 am, Ioannis Vranos <ivra...@no.spam.nospamfreemail.gr>
> wrote:
>> Hi, I am currently learning QT, a portable C++ framework which
>> comes with both a commercial and GPL license, and which
>> provides conversion operations to its various types to/from
>> standard C++ types.
>> For example its QString type provides a toWString() that
>> returns a std::wstring with its Unicode contents.
>
> In what encoding format? And what if the "usual" encoding for
> wstring isn't Unicode (the case on many Unix platforms)?

<curious>
What are those implementations using for 'wchar_t'?
</curious>

Schobi
Oct 2 '08 #9
Erik Wikström wrote:
> Because it has been too narrow for 5 to 10 years and the compiler vendor
> does not want to take any chances with backward compatibility,

How will it break backward compatibility if the size of wchar_t changes?

> and since
> we will get Unicode types it is a good idea to use wchar_t for encodings
> not the same size as the Unicode types.

I am talking about not needing those Unicode types since we have wchar_t
and locales.

Oct 2 '08 #10
