Hi
I created a simple Win32 project (VC2003, Windows 2003 (English)) and do
some simple painting in the WM_PAINT handler. When the project uses the
multi-byte character set, it is OK.
But when I change to UNICODE, some Chinese characters are illegible (I do see
sizeof(TCHAR)=2 being displayed). Your ideas are welcome.
case WM_PAINT:
    hdc = BeginPaint(hWnd, &ps);
    {
        LPCTSTR smsg = _T("pringÖÐÎÄ");
        TextOut(hdc, 0, 0, smsg, _tcslen(smsg));
        TCHAR buf[256];
        wsprintf(buf, _T("sizeof(TCHAR)=%d"), sizeof(TCHAR));
        TextOut(hdc, 0, 20, buf, _tcslen(buf));
    }
    EndPaint(hWnd, &ps);
    break;
Best Regards
Onega
>>Your idea is welcome.
Why still use Win32 for a new project? Do you need to modify an existing
application?
For new projects I would recommend using a .NET Windows application project.
These are MUCH simpler to use.
Kind regards,
Bruno.
I would assume that when compiled as Unicode, the characters in your string
literal will each be interpreted as one Unicode character. You might want to
look at using the \x escape sequence.
Thank you, Ted Miller.
The \x escape sequence is not friendly.
My code snippet works well under Windows XP. I'd like to know if it is a bug
in Windows 2003 or VC 2003.
Best Regards
Onega
> Thank you, Ted Miller \x escape sequence is not friendly. My code snippet works well under Windows XP. I'd like to know if it is a bug of Windows 2003 or VC 2003?
None of the above.
It is a bug in your code.
Your string is _T("pringÖÐÎÄ");
Because of the _T, the string will be left as is if the application is
ANSI, or will be converted to Unicode if the application is Unicode.
When left as is (ANSI), you will get the byte sequence:
D6 D0 CE C4
When you run this on a Chinese Simplified system,
D6 D0 => will be interpreted as center/middle (Unicode 4E2D)
CE C4 => will be interpreted as literature/culture/writing (Unicode 6587)
(I guess this is what you want)
When you run this on a Chinese Traditional system,
D6 D0 => will be Unicode 7B22 (no clue about meaning)
CE C4 => will be Unicode 6045 (no clue about meaning)
(I guess this is not what you want)
When run on a Russian system you will get Russian characters, and so on.
This is the problem with code pages: the same sequence of bytes can represent
different characters in different code pages.
For a Unicode application, when you compile, the string is converted to
Unicode from the code page of your source code, which is assumed to be the
system code page.
If you compile on a US system, the result is the byte sequence
D6 00 D0 00 CE 00 C4 00
representing the Unicode characters U+00D6 U+00D0 U+00CE U+00C4
This will display identically on any system supporting Unicode:
LATIN CAPITAL LETTER O WITH DIAERESIS
LATIN CAPITAL LETTER ETH
LATIN CAPITAL LETTER I WITH CIRCUMFLEX
LATIN CAPITAL LETTER A WITH DIAERESIS
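To make the US-system case concrete, here is a small portable sketch (C++11) of what the conversion amounts to for these four bytes. The helper names widenByteForByte and sourceBytes are made up for illustration. The byte-for-byte widening only mirrors code page 1252 for bytes in 0x00-0x7F and 0xA0-0xFF, which covers D6 D0 CE C4; it is not a general code page 1252 decoder and not what the compiler literally calls.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: widen each byte to a 16-bit code unit of the same value.
// This agrees with code page 1252 only for bytes 0x00-0x7F and 0xA0-0xFF.
std::vector<std::uint16_t> widenByteForByte(const std::string& bytes) {
    std::vector<std::uint16_t> units;
    units.reserve(bytes.size());
    for (unsigned char b : bytes) {
        units.push_back(static_cast<std::uint16_t>(b));
    }
    return units;
}

// The four non-ASCII bytes of the literal as they sit in the source file.
std::string sourceBytes() {
    const char raw[] = { '\xD6', '\xD0', '\xCE', '\xC4' };
    return std::string(raw, sizeof raw);
}
```

Calling widenByteForByte(sourceBytes()) gives four code units, 00D6 00D0 00CE 00C4, which are the four Latin letters listed above rather than the two Chinese characters.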
If you compile this on a Simplified Chinese system you get what you want.
The \x escape sequence is not friendly, but it behaves identically on all systems.
This is leaving aside that it is a very bad practice to hard-code UI strings in
your code (you already discovered one of the reasons).
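As a rough illustration of the escape approach, the two characters can be written using only basic source characters. The names kChineseHex, kChineseUcn and kChineseLength are invented for this sketch, and it assumes a compiler that accepts universal-character-names (\u) in wide literals; the \x form is the one suggested earlier in the thread.

```cpp
#include <cstddef>

// U+4E2D U+6587 written with hex escapes. A \x escape consumes every hex
// digit that follows it, so it must not be followed directly by a character
// such as 'A' or '1' that is meant to be separate.
const wchar_t kChineseHex[] = L"\x4E2D\x6587";

// The same two characters written as universal-character-names.
const wchar_t kChineseUcn[] = L"\u4E2D\u6587";

// Number of wchar_t units before the terminating null.
const std::size_t kChineseLength =
    sizeof(kChineseHex) / sizeof(kChineseHex[0]) - 1;
```

Neither literal contains a byte above 0x7F, so the code page of the build machine should not affect what the compiler stores.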
Mihai
-------------------------
Replace _year_ with _ to get the real email
Hi Mihai N,
Thanks a lot for your informative explanation. I learned a lot from it.
However, I still have some doubts about this issue.
According to your theory, it seems that my code snippet should fail on both
Windows XP (English, SP1) and Windows 2003 (English). But it works fine on
Windows XP (English version, default code page: Chinese, region: Chinese),
although I set the default code page and region to Chinese under Windows
2003 too.
I appreciate your help!
TCHAR buf[256];
ZeroMemory(buf, sizeof(buf));
int n = GetLocaleInfo(LOCALE_SYSTEM_DEFAULT, LOCALE_ILANGUAGE, buf, ARRAY_SIZE(buf));
buf contains the text "0804" under both Windows XP and Windows 2003.
Best Regards
Onega
> According to your theory, it seems that my code snippet should fail on both Windows XP(English, SP1) and Windows 2003(English) . But it is fine on Windows XP( English version , default codepage: Chinese, Region : Chinese), althrough I set default codepage and Region to Chinese too under Windows 2003.
OK, maybe this is not the explanation.
Can you please answer some questions, so that maybe I can figure it out?
Is the code compiled already, and you test the same executable on the two
systems?
Or do you recompile?
The conversion of the string in the source happens at compile time.
What characters do you see when you run your code on Windows 2003?
Mihai
Glad to see your post again.
Your tips are valuable.
I built the ANSI and UNICODE executables under Windows XP, and both work
well under Windows 2003. Then I rebuilt under Windows 2003, and only the ANSI
version works well.
My code looks like this:
case WM_PAINT:
    hdc = BeginPaint(hWnd, &ps);
    {
        LPCTSTR smsg = _T("AÖÐÎÄ");
        int nlen = (int)_tcslen(smsg);
        TextOut(hdc, 0, 0, smsg, nlen);
        TCHAR buf[512];
        wsprintf(buf, _T("sizeof(TCHAR)=%d, strlen = %d,"), (int)sizeof(TCHAR), nlen);
        TextOut(hdc, 0, 20, buf, (int)_tcslen(buf));
        // Dump each character of smsg as hex: 2 digits per char in ANSI, 4 in Unicode.
        ZeroMemory(buf, sizeof(buf));
        TCHAR nbuf[16];
        for (int ci = 0; ci < nlen; ci++)
        {
            ZeroMemory(nbuf, sizeof(nbuf));
            TCHAR tci = smsg[ci];
            if (sizeof(TCHAR) == 1)
                wsprintf(nbuf, _T("%02X"), tci & 0xff);
            else
                wsprintf(nbuf, _T("%04X"), tci & 0xffff);
            _tcscat(buf, nbuf);
        }
        TextOut(hdc, 0, 40, buf, (int)_tcslen(buf));
    }
    EndPaint(hWnd, &ps);
    break;
The version built under Win2003 gives the following output (I have only run it
under 2003):
ANSI: Chinese is fine, sizeof(TCHAR)=1, strlen=5, 41D6D0CEC4
UNICODE: Chinese isn't fine, sizeof(TCHAR)=2, strlen=5, 004100D600D000CE00C4
The version built under Windows XP gives the following output (run on both XP
and 2003):
UNICODE: Chinese is fine, sizeof(TCHAR)=2, strlen=3, 00414E2D6587
ANSI: Chinese is fine, sizeof(TCHAR)=1, strlen=5, 41D6D0CEC4
I think there is something wrong with Windows 2003 or VS.NET 2003.
Best Regards
Onega
> I build ANSI and UNICODE version executable under windows XP, both works well under windows 2003, then I rebuild under windows 2003, only ANSI version works well.
My guess: the XP system you are using for building is Chinese Simplified, and
the 2003 system is English (or something else using code page 1252).
> version build under win2003 gives the following output (I have only run it under 2003):
> ANSI: Chinese is fine, sizeof(TCHAR)=1, strlen=5, 41D6D0CEC4
> UNICODE: Chinese isn't fine, sizeof(TCHAR)=2, strlen=5, 004100D600D000CE00C4
This matches what I was saying in a previous message:
> If you compile on a US system, the result is the byte sequence D6 00 D0 00 CE 00 C4 00 representing the Unicode characters U+00D6 U+00D0 U+00CE U+00C4
Note: COMPILE on a US system, not RUN on a US system.
_T is resolved at compile time.
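A simplified sketch of why this is a compile-time matter. DEMO_UNICODE, DemoChar and DEMO_T are hypothetical stand-ins for _UNICODE, TCHAR and _T; the real definitions in <tchar.h> are more involved, so treat this only as an illustration of the mechanism.

```cpp
// Hypothetical stand-ins for TCHAR and _T. DEMO_UNICODE plays the role of the
// _UNICODE define that the project's character set setting controls.
#ifdef DEMO_UNICODE
typedef wchar_t DemoChar;
#define DEMO_T(x) L##x
#else
typedef char DemoChar;
#define DEMO_T(x) x
#endif

// The preprocessor picks one branch and the compiler then encodes the
// literal. Nothing about the literal is decided when the program runs.
const DemoChar kLabel[] = DEMO_T("sizeof(TCHAR)");

// Size of one character unit in this build: 1 when DEMO_UNICODE is not defined.
const unsigned kUnitSize = sizeof(DemoChar);
```

Whatever machine later runs the executable only sees the code units that were already stored at build time.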
This also points to the conclusion that you do compile on an English system.
Try to compile it on a Chinese Simplified system.
You can do it on your 2003 system, but you should set both the user
and the system locale to Chinese (PRC), then reboot.
There is nothing wrong with Windows 2003 or Dev. Studio (2003 or older).
But even if this solves the problem, please move the string into the resources.
This is "the right thing" to do.
Quoting Microsoft:
"In fact, the C/C++ Language specification says that the source files are to be written in 7-bit ANSI."
Quoting the standard:
1. The basic source character set consists of 96 characters: the space character, the control characters representing horizontal tab, vertical tab, form feed, and newline, plus the following 91 graphical characters:
a b c d e f g h i j k l m n o p q r s t u v w x y z
A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
0 1 2 3 4 5 6 7 8 9
_ { } [ ] # ( ) < > % : ; . ? * + - / ^ & | ~ ! = , \ " '
2. The universal-character-name construct provides a way to name other characters.
hex-quad:
    hexadecimal-digit hexadecimal-digit hexadecimal-digit hexadecimal-digit
universal-character-name:
    \u hex-quad
    \U hex-quad hex-quad
Mihai
Hi Mihai N,
Both the Windows XP and the Windows 2003 systems I worked with are English versions.
At last I got a solution, by putting #pragma setlocale("chs") in the .cpp file.
The idea is from Alexander Grigoriev. My respect to you for your
patience with this. I'll take your advice in future projects. Thanks again!
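For anyone reading later, a minimal sketch of how the pragma might sit in a source file. It assumes the .cpp file is saved in code page 936 (GBK) when built with Visual C++; kMessage and messageLength are names invented for this example, and the two characters after 'A' are the intended U+4E2D U+6587. The pragma is specific to the Microsoft compiler, so it is guarded here.

```cpp
#include <cstddef>
#include <cwchar>

// MSVC-specific: selects the locale the compiler uses when it converts
// wide-character literals. Assumption: this file is saved in code page 936.
#ifdef _MSC_VER
#pragma setlocale("chs")
#endif

// 'A' followed by U+4E2D U+6587 when the conversion works as intended.
const wchar_t kMessage[] = L"A中文";

// Length in wchar_t units; 3 when both Chinese characters survive intact.
std::size_t messageLength() { return std::wcslen(kMessage); }
```

With the hex dump from my earlier post this should read 0041 4E2D 6587 instead of 0041 00D6 00D0 00CE 00C4. Moving the string into the resources, as Mihai advises, avoids depending on the source file encoding at all.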
Best Regards
Onega