System functions + wchar_t

Hi all,

I've been thinking about all the system functions which accept wchar_t.
The point is that they don't define what encoding the wchar_t has to
be. Let us assume that all the external input is UTF-8 and all the
output is also UTF-8 and your internal representation is using wchar_t
encoded using UTF-16. So when you call wcout or other system functions
which accept wide characters, what encoding do they assume?

Regards

Dec 23 '05 #1
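
The question assumes that wchar_t holds UTF-16, but the standard fixes neither the width nor the encoding of wchar_t (it is commonly 16 bits on Windows and 32 bits on most Unix-like systems). As a rough sketch of what "UTF-16 internally" implies, the hypothetical helper below (`to_utf16` is not a standard function, and a C++11 compiler is assumed for `char16_t`) encodes one code point as UTF-16 code units:

```cpp
#include <string>

// Hypothetical helper: encodes one Unicode code point as UTF-16 code units.
// Assumes cp is a valid scalar value (at most 0x10FFFF and not a surrogate).
std::u16string to_utf16(char32_t cp) {
    std::u16string out;
    if (cp < 0x10000) {
        // Code points in the Basic Multilingual Plane take one code unit.
        out.push_back(static_cast<char16_t>(cp));
    } else {
        // Everything else is split into a high and a low surrogate.
        cp -= 0x10000;
        out.push_back(static_cast<char16_t>(0xD800 + (cp >> 10)));
        out.push_back(static_cast<char16_t>(0xDC00 + (cp & 0x3FF)));
    }
    return out;
}
```

A character outside the BMP therefore occupies two elements of a 16-bit wchar_t string, which is one reason a function receiving wchar_t cannot simply assume "one element, one character".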
gamehack wrote:
> I've been thinking about all the system functions which accept wchar_t.

What system functions are those? Do you mean platform-specific ones?

> The point is that they don't define what encoding the wchar_t has to
> be.

It's probably implementation-defined or platform-defined. Have you tried
reading the documentation?

> Let us assume that all the external input is UTF-8 and all the
> output is also UTF-8 and your internal representation is using wchar_t
> encoded using UTF-16. So when you call wcout or other system functions
> which accept wide characters, what encoding do they assume?

I would venture a guess that _locales_ have something to do with it.

V
Dec 23 '05 #2
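
To make the locale guess concrete: a wide file stream such as std::wofstream converts wchar_t to bytes through the codecvt facet of the locale imbued on it. How std::wcout does this is more implementation-dependent. The sketch below calls that facet directly. `narrow_with_locale` is a hypothetical name, and which locales exist beyond the classic "C" locale is an assumption about the platform:

```cpp
#include <cwchar>
#include <locale>
#include <string>

// Hypothetical helper: converts `wide` to bytes with the codecvt facet of `loc`.
// Returns false if the facet reports anything other than a complete conversion.
bool narrow_with_locale(const std::wstring& wide, const std::locale& loc,
                        std::string& out) {
    using Facet = std::codecvt<wchar_t, char, std::mbstate_t>;
    const Facet& facet = std::use_facet<Facet>(loc);

    std::mbstate_t state = std::mbstate_t();
    // Worst case: every wide character expands to max_length() bytes.
    // The extra byte keeps the buffer non-empty for an empty input.
    std::string buffer(
        wide.size() * static_cast<std::size_t>(facet.max_length()) + 1, '\0');

    const wchar_t* from_next = nullptr;
    char* to_next = nullptr;
    Facet::result r = facet.out(
        state,
        wide.data(), wide.data() + wide.size(), from_next,
        &buffer[0], &buffer[0] + buffer.size(), to_next);

    if (r != std::codecvt_base::ok) return false;
    out.assign(&buffer[0], to_next);  // keep only the bytes actually written
    return true;
}
```

Passing a different std::locale changes the bytes produced for the same wide string, so the encoding is a property of the locale, not of wchar_t itself.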
gamehack wrote:
> Hi all,
>
> I've been thinking about all the system functions which accept wchar_t.
> The point is that they don't define what encoding the wchar_t has to
> be. Let us assume that all the external input is UTF-8 and all the
> output is also UTF-8 and your internal representation is using wchar_t
> encoded using UTF-16. So when you call wcout or other system functions
> which accept wide characters, what encoding do they assume?
>
> Regards

Welcome to the piss poor implementation of internationalization in C++.
The implementation punts and assumes that you can always uniquely
convert from a wide stream to multibyte using the woefully inadequate
C library functions.
Dec 23 '05 #3
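
A small sketch of where that assumption breaks down, using std::wcstombs as one of the C library conversion functions in question. `to_multibyte` is a hypothetical wrapper. The failure shown in the usage note is what common implementations are expected to do in the "C" locale, not something the standard spells out character by character:

```cpp
#include <clocale>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper: converts `wide` with std::wcstombs under the current
// C locale. Returns false when some character has no multibyte form there.
bool to_multibyte(const std::wstring& wide, std::string& out) {
    // Generous upper bound: MB_CUR_MAX bytes per wide character, plus '\0'.
    std::vector<char> buffer(wide.size() * MB_CUR_MAX + 1);
    std::size_t n = std::wcstombs(&buffer[0], wide.c_str(), buffer.size());
    if (n == static_cast<std::size_t>(-1)) return false;  // conversion failed
    out.assign(&buffer[0], n);
    return true;
}
```

After `std::setlocale(LC_ALL, "C")`, a call with `L"abc"` should succeed, while a call with `L"\u0141"` (the letter Ł) is expected to return false, because the "C" locale has no multibyte form for that character.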
Victor Bazarov wrote:
> gamehack wrote:
>> I've been thinking about all the system functions which accept wchar_t.
>
> What system functions are those? Do you mean platform-specific ones?

Anything in the standard that takes a filename, for one (fstreams,
etc.). The main args are another.
Dec 23 '05 #4
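
Both of those interfaces are narrow: the fstream constructors and the arguments of main deal in char strings, so a program that works in wchar_t internally has to convert at that boundary. A minimal sketch of the inbound direction, assuming the bytes are in the encoding of the current C locale (`widen_argument` is a hypothetical name):

```cpp
#include <cstdlib>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: widens a narrow string (such as an element of argv)
// with std::mbstowcs under the current C locale.
// Returns false if the bytes are not valid in that locale's encoding.
bool widen_argument(const char* narrow, std::wstring& out) {
    // Each wide character consumes at least one byte, so strlen is enough room.
    std::vector<wchar_t> buffer(std::strlen(narrow) + 1);
    std::size_t n = std::mbstowcs(&buffer[0], narrow, buffer.size());
    if (n == static_cast<std::size_t>(-1)) return false;  // invalid sequence
    out.assign(&buffer[0], n);
    return true;
}
```

The result depends on the locale selected with std::setlocale, which is exactly the ambiguity the thread is about: the same argv bytes can widen to different characters under different locales.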
Ron Natalie wrote:
> Victor Bazarov wrote:
>> gamehack wrote:
>>> I've been thinking about all the system functions which accept wchar_t.
>>
>> What system functions are those? Do you mean platform-specific ones?
>
> Anything in the standard that takes a filename, for one (fstreams,
> etc.). The main args are another.

I guess all those functions that gamehack has in mind interpret
strings of chars according to the locale of the particular operating system.

Cheers
--
Mateusz Łoskot
http://mateusz.loskot.net
Dec 23 '05 #5
That's what I suspected :)

Dec 23 '05 #6
Mateusz Łoskot wrote:
> Ron Natalie wrote:
>> Victor Bazarov wrote:
>>> gamehack wrote:
>>>> I've been thinking about all the system functions which accept wchar_t.
>>>
>>> What system functions are those? Do you mean platform-specific ones?
>>
>> Anything in the standard that takes a filename, for one (fstreams,
>> etc.). The main args are another.
>
> I guess all those functions that gamehack has in mind interpret
> strings of chars according to the locale of the particular operating system.

That is a nonsensical statement. There is no guarantee that there
exists a way to map wchar_t-based strings into a string of chars
in any locale.

Dec 23 '05 #7
Ron Natalie wrote:
> Mateusz Łoskot wrote:
>> Ron Natalie wrote:
>>> Victor Bazarov wrote:
>>>> gamehack wrote:
>>>>> I've been thinking about all the system functions which accept
>>>>> wchar_t.
>>>>
>>>> What system functions are those? Do you mean platform-specific ones?
>>>
>>> Anything in the standard that takes a filename, for one (fstreams,
>>> etc.). The main args are another.
>>
>> I guess all those functions that gamehack has in mind interpret
>> strings of chars according to the locale of the particular operating system.
>
> That is a nonsensical statement. There is no guarantee that there
> exists a way to map wchar_t-based strings into a string of chars
> in any locale.

I said I guess. So, please explain to me how a function like fopen knows
what the codepage of the ASCII string passed to it is.
I think there must be some trick, because fopen is able to find paths
given in many charsets.

Cheers
--
Mateusz Łoskot
http://mateusz.loskot.net
Dec 23 '05 #8
Mateusz Łoskot wrote:
> I said I guess. So, please explain to me how a function like fopen knows
> what the codepage of the ASCII string passed to it is.
> I think there must be some trick, because fopen is able to find paths
> given in many charsets.

It works on UNIX because you effectively have an 8-bit clean path.
Any character other than / and \0 is legitimate.
Dec 23 '05 #9
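
The byte-level rule can be written down directly. This is only a sketch of the rule as stated in the post: `is_valid_unix_component` is a hypothetical helper, and it deliberately ignores length limits and the special names "." and "..":

```cpp
#include <string>

// Hypothetical helper: checks one path component against the byte-level rule
// "anything except '/' and '\0'". It does not look at the encoding at all,
// which is the point: the name is treated as a sequence of opaque bytes.
bool is_valid_unix_component(const std::string& name) {
    if (name.empty()) return false;
    for (std::string::size_type i = 0; i < name.size(); ++i) {
        if (name[i] == '/' || name[i] == '\0') return false;
    }
    return true;
}
```

Under this rule the UTF-8 bytes `"\xC5\x81oskot.txt"` and the ISO-8859-2 bytes `"\xA3oskot.txt"` are both acceptable names. Nothing in the check, or in fopen, needs to know which charset was intended.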
Ron Natalie wrote:
> Mateusz Łoskot wrote:
>> I said I guess. So, please explain to me how a function like fopen knows
>> what the codepage of the ASCII string passed to it is.
>> I think there must be some trick, because fopen is able to find paths
>> given in many charsets.
>
> It works on UNIX because you effectively have an 8-bit clean path.
> Any character other than / and \0 is legitimate.

I'm not sure. There is still a possibility that the filesystem is
"incompatible", in terms of charset, with the given path, so that the
file cannot be found.

Cheers
--
Mateusz Łoskot
http://mateusz.loskot.net
Dec 24 '05 #10
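
One way this mismatch can arise, sketched under the assumption that the file system compares names byte by byte: the same letter has different bytes in different encodings, so a name created under one charset is not found when it is looked up under another. `utf8_small` and `same_name_bytes` are hypothetical helpers written for this illustration:

```cpp
#include <string>

// Hypothetical helper: encodes one code point below U+0800 as UTF-8.
// Larger code points are out of scope for this sketch and yield an empty string.
std::string utf8_small(unsigned cp) {
    std::string out;
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));    // lead byte
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));  // continuation
    }
    return out;
}

// A byte-wise comparison, which is all an 8-bit clean file system performs.
bool same_name_bytes(const std::string& a, const std::string& b) {
    return a == b;
}
```

For U+0141 (Ł), `utf8_small(0x0141)` gives the two bytes C5 81, while ISO-8859-2 stores the same letter as the single byte A3. `same_name_bytes` therefore reports the two spellings of the name as different.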
