wchar_t string literal depends on source file type?

I need to allow users to enter Unicode string literals, such as the
following:

const wchar_t* wfileName = L"aΈc粥粳籵ӑb";

I'm using Visual C++ 7.1 and have defined the _UNICODE preprocessor symbol.

If the source file is saved as UTF-16, then the result is:

0x0061 0x0388 0x0063 0x7ca5 0x7cb3 0x7c75 0x04d1 0x0062

This is the correct UTF-16 representation of the string.

However, if I save the file in UTF-8, I get the following:

0x0061 0x00ce 0x02c6 0x0063 0x00e7 0x00b2 0x00a5 0x00e7 0x00b2 0x00b3
0x00e7 0x00b1 0x00b5 0x00d3 0x2018 0x0062

This is close to, but not quite, the UTF-8 byte sequence for this
string with each byte widened to a 16-bit value (the 0x02c6 should be
0x0088 and the 0x2018 should be 0x0091).
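The two odd values match what Windows-1252 does with those bytes: in that code page, byte 0x88 maps to U+02C6 and byte 0x91 maps to U+2018. So a plausible reading, assuming the system code page was 1252, is that the compiler treated the UTF-8 file as Windows-1252 text and widened each byte. The sketch below reverses those two steps for the values listed above. The function names are made up for illustration, only the two Windows-1252 mappings seen here are handled, and the decoder covers 1- to 3-byte UTF-8 sequences only.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Undo the assumed Windows-1252 step for one code unit. Only the two
// mappings that occur in this example are listed; a complete table would
// cover every byte in the 0x80-0x9F range.
std::uint8_t cp1252_byte(std::uint16_t unit) {
    switch (unit) {
        case 0x02C6: return 0x88;
        case 0x2018: return 0x91;
        default:     return static_cast<std::uint8_t>(unit); // assumes unit < 0x100
    }
}

// Decode UTF-8 into 16-bit code units. Handles 1- to 3-byte sequences (the
// Basic Multilingual Plane) and stops at anything else; no validation.
std::vector<std::uint16_t> decode_utf8_bmp(const std::vector<std::uint8_t>& bytes) {
    std::vector<std::uint16_t> out;
    std::size_t i = 0;
    while (i < bytes.size()) {
        const std::uint8_t b = bytes[i];
        if (b < 0x80) {
            out.push_back(b);
            i += 1;
        } else if ((b & 0xE0) == 0xC0 && i + 1 < bytes.size()) {
            out.push_back(static_cast<std::uint16_t>(
                ((b & 0x1F) << 6) | (bytes[i + 1] & 0x3F)));
            i += 2;
        } else if ((b & 0xF0) == 0xE0 && i + 2 < bytes.size()) {
            out.push_back(static_cast<std::uint16_t>(
                ((b & 0x0F) << 12) | ((bytes[i + 1] & 0x3F) << 6) |
                (bytes[i + 2] & 0x3F)));
            i += 3;
        } else {
            break; // 4-byte or truncated sequence: outside this sketch
        }
    }
    return out;
}

// Turn the mis-decoded code units back into bytes, then decode them as UTF-8.
std::vector<std::uint16_t> repair_units(const std::vector<std::uint16_t>& units) {
    std::vector<std::uint8_t> bytes;
    for (std::uint16_t u : units) {
        bytes.push_back(cp1252_byte(u));
    }
    return decode_utf8_bmp(bytes);
}
```

Feeding the sixteen values from the UTF-8 build into `repair_units` gives back the eight UTF-16 values from the UTF-16 build, which supports the code page explanation. It is a diagnostic aid, not a general repair routine.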

When running the same program on gcc 4.0, the string above is converted
to UTF-32 regardless of the input format of the source file.
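Since the internal names are UTF-16 and `wchar_t` is 32 bits wide with gcc, a conversion step is needed on that platform either way. The following is a minimal sketch of one, assuming the wide string holds valid code points. `wide_to_utf16` is a hypothetical helper name, and it does no validation of surrogates or out-of-range values.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Convert a wide string to UTF-16 code units.
// Where wchar_t is 16 bits (Visual C++), the units are copied unchanged.
// Where wchar_t is 32 bits (gcc on Linux), code points above U+FFFF are
// split into a surrogate pair. Input is assumed to be valid.
std::vector<std::uint16_t> wide_to_utf16(const std::wstring& w) {
    std::vector<std::uint16_t> out;
    out.reserve(w.size());
    for (wchar_t wc : w) {
        std::uint32_t cp = static_cast<std::uint32_t>(wc);
        if (sizeof(wchar_t) == 2 || cp < 0x10000) {
            out.push_back(static_cast<std::uint16_t>(cp));
        } else {
            cp -= 0x10000;
            out.push_back(static_cast<std::uint16_t>(0xD800 + (cp >> 10)));   // high surrogate
            out.push_back(static_cast<std::uint16_t>(0xDC00 + (cp & 0x3FF))); // low surrogate
        }
    }
    return out;
}
```

With this in place, lookups can always compare UTF-16 against UTF-16, whichever width the compiler picks for `wchar_t`.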

The issue is that I store the names of certain objects internally in
UTF-16. I want to be able to enter the name as a string literal and
look up the appropriate entity, but it seems that the encoding of the
string literal depends on how the source file was saved, and I'm not
aware of anything I can do in C++ to detect the encoding of the
literal and convert it if needed.

Does anyone know if there is a way around this?
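One approach that avoids the source-file encoding entirely is to keep the file plain ASCII and write the non-ASCII characters as universal character names (`\uXXXX`), which are part of standard C++. The sketch below spells the same string that way; `kFileName` and `kFileNameLength` are illustrative names, and whether a given older compiler accepts every escape is an assumption worth checking.

```cpp
#include <cstddef>

// The literal from the question, written with universal character names so
// the source file contains only ASCII. All eight characters are in the Basic
// Multilingual Plane, so each occupies a single wchar_t whether wchar_t is
// 16 or 32 bits wide.
const wchar_t kFileName[] = L"a\u0388c\u7ca5\u7cb3\u7c75\u04d1b";

// Number of characters, excluding the terminating null.
const std::size_t kFileNameLength = sizeof(kFileName) / sizeof(kFileName[0]) - 1;
```

The trade-off is readability: the literal is no longer legible as text, so a comment showing the intended string next to it helps.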

Mar 3 '06 #1