_vsnwprintf_s seems to be broken

Norman Diamond

I think the current version of _vsnwprintf_s is broken, in ordinary Windows.

I'm not completely sure yet but it looks like this breakage is worse than
previously known Windows CE breakage of StringCchPrintf. For Windows CE
breakage of StringCchPrintf, since the %S format died instead of converting
ANSI to Unicode, a workaround was to call MultiByteToWideChar and then use
the %s format.

For ordinary Windows breakage of _vsnwprintf_s, the %s format is broken, as
far as I can tell.

The compilation environment is not internationalized. It's Visual Studio
2005 SP1 + hotfix for Vista, and SDK for Vista, all running on Vista, all in
Japanese, no foreign software involved in this environment. The project
setting for character set says to use Unicode not ANSI. Function name
_vsntprintf_s maps to _vsnwprintf_s, _T("") maps to L"", etc., and
everything except _vsnwprintf_s seems to perform properly at execution time.
MFC and ATL are not used. The CRT is used as a DLL.

The runtime environment where failure was observed is internationalized.
The Chinese MUI pack was downloaded. The user's locale (viewable format or
something like that), the user's display language, and the system locale
(viewable format for non-Unicode programs) are all set to Chinese
traditional Hong Kong. The settings were copied to all reserved and default
accounts. The execution PC was rebooted several times. The logon screen
and nearly everything else are displayed properly in Chinese. However, the
CRT DLL is from Vista RTM, not from Visual Studio 2005 SP1.

The user's username is "中文2" (without the quotes). The user can log on
perfectly. The Start menu shows the user's name at the top. Windows
Explorer shows the user's name correctly. No renaming or anything else has
been done with this user. Ordinary Windows operations work. Execution of
my program works, except for calls to _vsnwprintf_s.

Code:
static TCHAR szBuf[2048];
_vsntprintf_s(szBuf, sizeof(szBuf) / sizeof (TCHAR),
_T("Username=\"%s\"\n"), userName);

Result:
Username="

_vsnwprintf_s dies as soon as the %s format hits a perfectly valid simple
Unicode character.

Jul 31 '07 #1

Cezary Noweta

Hello,

> Code:
> static TCHAR szBuf[2048];
> _vsntprintf_s(szBuf, sizeof(szBuf) / sizeof (TCHAR),
> _T("Username=\"%s\"\n"), userName);
>
> Result:
> Username="
>
> _vsnwprintf_s dies as soon as the %s format hits a perfectly valid simple
> Unicode character.

What is the type of userName? Is it va_list or TCHAR *? The v* functions take a
va_list parameter, not a TCHAR * one. Maybe you should use _sntprintf_s in place of
_vsntprintf_s?
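
A minimal sketch of the distinction, using the portable standard functions swprintf and vswprintf as stand-ins for _sntprintf_s and _vsntprintf_s. FormatWide and FormatDirect are hypothetical names for illustration only. Note that standard C spells a wide-string argument as %ls, whereas the MSVC wide printf family also accepts %s for it.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cwchar>

// Variadic front end: collects its "..." arguments into a va_list and hands
// that va_list to the v* function, which is the only thing v* accepts.
int FormatWide(wchar_t* buf, std::size_t count, const wchar_t* fmt, ...)
{
    assert(buf != nullptr && fmt != nullptr);
    va_list args;
    va_start(args, fmt);
    int written = std::vswprintf(buf, count, fmt, args);
    va_end(args);
    return written;
}

// Direct call: the non-v function takes the string pointer itself.
int FormatDirect(wchar_t* buf, std::size_t count, const wchar_t* userName)
{
    assert(buf != nullptr && userName != nullptr);
    return std::swprintf(buf, count, L"Username=\"%ls\"\n", userName);
}
```

Passing a TCHAR * where a va_list is expected compiles on some platforms but reads garbage, which is why the question about the type of userName matters.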

-- best regards

Cezary Noweta

Jul 31 '07 #2

Norman Diamond

Ouch, I missummarized the source code when making this posting. No wonder it
looks like the source code was at fault. Here, I'll summarize it more
accurately.

_TCHAR userName[48];
DebugLog(_T("Other string=\"%s\"\n"), _T("Hello foreign language"));
DebugLog(_T("Username=\"%s\"\n"), userName);
[...]

void DebugLog(TCHAR* szForm, ...)
{
    va_list args;
    va_start(args, szForm); // init valiable length argument list
    static TCHAR szBuf[2048]; // same size for HexDump
    _vsntprintf_s(szBuf, sizeof(szBuf) / sizeof (TCHAR), szForm, args);
}

Result:
Other string="Hello foreign language"
Username="

_vsnwprintf_s dies as soon as the %s format hits a perfectly valid simple
Unicode character. In the Japanese version of Vista, in the Japanese
version of the CRT, the Japanese version of _vsnwprintf_s can't handle
Japanese characters (the Japanese user's username) in Unicode.

Aug 1 '07 #3

Norman Diamond

I have just determined that _vsnwprintf_s is broken in Chinese Vista too,
with no internationalization involved in the execution system.

As posted in my other message a few hours ago, here is a corrected summary
of the source code:

_TCHAR userName[48];
DebugLog(_T("Other string=\"%s\"\n"), _T("Hello foreign language"));
DebugLog(_T("Username=\"%s\"\n"), userName);
[...]

void DebugLog(TCHAR* szForm, ...)
{
    va_list args;
    va_start(args, szForm); // init valiable length argument list
    static TCHAR szBuf[2048]; // same size for HexDump
    _vsntprintf_s(szBuf, sizeof(szBuf) / sizeof (TCHAR), szForm, args);
}

Result:
Other string="Hello foreign language"
Username="

_vsnwprintf_s dies as soon as the %s format hits a perfectly valid simple
Unicode character. In the Chinese version of Vista, in the Chinese version
of the CRT, the Chinese version of _vsnwprintf_s can't handle Chinese
characters (the Chinese user's username) in Unicode.

The rest of the program works, all except the calls to _vsnwprintf_s.

(By the way the valiable spelling in comments was there in the original. I
don't know who the original coder was, only that it was coded in Japan.
Today I copied a bit too much source code when using the mouse, but I did
copy it correctly today.)

Aug 1 '07 #4

Jochen Kalmbach [MVP]

Hi Norman!

>I have just determined that _vsnwprintf_s is broken in Chinese Vista too,
with no internationalization involved in the execution system.

Can you please provide a *full* working example?

And please do not use non-ASCII chars in the source-code,
so that it can be compiled on other systems with the same result.

> _TCHAR userName[48];
> DebugLog(_T("Other string=\"%s\"\n"), _T("Hello foreign language"));
> DebugLog(_T("Username=\"%s\"\n"), userName);

"userName" is not initialized...

Greetings
Jochen

Aug 1 '07 #5

Kalle Olavi Niemitalo

"Norman Diamond" <nd******@community.nospamwrites:

> void DebugLog(TCHAR* szForm, ...)
> {
>     va_list args;
>     va_start(args, szForm); // init valiable length argument list
>     static TCHAR szBuf[2048]; // same size for HexDump
>     _vsntprintf_s(szBuf, sizeof(szBuf) / sizeof (TCHAR), szForm, args);
> }

It seems va_end and output of szBuf[] are missing from this function.

> _vsnwprintf_s dies as soon as the %s format hits a perfectly
> valid simple Unicode character.

Does _vsnwprintf_s crash or call the invalid parameter handler,
or does it return some value (which one)?
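
One way to answer that question is to keep the formatter's return value instead of discarding it. The following is only a sketch under assumptions: DebugFormat is a hypothetical name, and the portable vswprintf stands in for _vsntprintf_s, whose failure behaviour (invalid parameter handler, truncation rules) is not reproduced here.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cwchar>

// Formats into 'out' and returns what the formatter reported: the number of
// wide characters written, or a negative value if formatting failed or the
// result did not fit into 'count' characters.
int DebugFormat(wchar_t* out, std::size_t count, const wchar_t* fmt, ...)
{
    assert(out != nullptr && fmt != nullptr && count > 0);
    va_list args;
    va_start(args, fmt);
    int rc = std::vswprintf(out, count, fmt, args);
    va_end(args);          // every va_start needs a matching va_end
    if (rc < 0)
        out[0] = L'\0';    // leave a defined, empty string on failure
    return rc;
}
```

Logging rc next to the formatted text shows whether the formatting step or a later output step is the one that gave up.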

> In the Chinese version of Vista, in the Chinese version of the
> CRT, the Chinese version of _vsnwprintf_s can't handle Chinese
> characters (the Chinese user's username) in Unicode.

So presumably you are initializing userName[] in some way.
It would be interesting to know the wchar_t values therein.
(You posted a string earlier but please give the numbers too.)
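
A small helper along these lines would produce the numbers without putting any non-ASCII text into a posting. CodePoints is a hypothetical name; the sketch prints one value per wchar_t, so on Windows (16-bit wchar_t) a character outside the BMP would show up as two surrogate values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Renders each wchar_t of a NUL-terminated string as "U+XXXX", separated
// by single spaces, using only ASCII in the output.
std::string CodePoints(const wchar_t* s)
{
    assert(s != nullptr);
    std::string out;
    char tmp[16];
    for (; *s != L'\0'; ++s) {
        std::snprintf(tmp, sizeof tmp, "U+%04lX",
                      static_cast<unsigned long>(*s));
        if (!out.empty())
            out += ' ';
        out += tmp;
    }
    return out;
}
```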

Aug 1 '07 #6

Norman Diamond

> Can you please provide a *full* working example?

You mean that I should show the assignment of the value of userName? I
don't know if I can or not, because you proceed to say this:

> And please do not use non-ASCII chars in the source-code,

The user name was "中文2", without the quotes. I mentioned that part of it
correctly yesterday.

> "userName" is not initialized...

It was not. It was retrieved from some decryption code which I will not
quote. Before being encrypted, it was originally retrieved from an API
which I think is one of the NetWksta____ APIs. The userName value was
retrieved correctly. The userName value was passed to other APIs for
authentication and succeeded. To repeat again, everything worked except for
calls to _vsnwprintf_s.

> And please do not use non-ASCII chars in the source-code, so that it can
> be compiled on other systems with the same result.

Hahahaha. Did I not show enough times that the Japanese and Chinese
versions of _vsnwprintf_s worked OK on ASCII characters? They only fail
when presented with strings in their own languages.

Aug 1 '07 #7

Norman Diamond

> It seems va_end and output of szBuf[] are missing from this function.

FILE* pf;
_tfopen_s(&pf, LOG_FILE_NAME, _T("a"));
if (pf) _ftprintf_s(pf, szBuf);
if (pf) fclose(pf);
va_end(args);

Do you also need a transcript of actions in Windows Explorer to open the log
file in Notepad and show the contents which my previous messages
transcribed?

Do you think that maybe the CRT's _vsnwprintf_s could handle the language of
its own version of Windows but the CRT's _ftprintf_s failed because it had
harder work to do? I don't quite think so.

> Does _vsnwprintf_s crash or call the invalid parameter handler,
> or does it return some value (which one)?

If it called the invalid parameter handler then I think the rest of the code
(the caller of DebugLog) would not proceed to get everything else working
properly with other Windows APIs, I think the rest of the code would abort.

Your question about the return value is a good one. I will add a meta debug
log of that information. I probably won't have time this week though
because higher priority work has just come in.

> So presumably you are initializing userName[] in some way.
> It would be interesting to know the wchar_t values therein.
> (You posted a string earlier but please give the numbers too.)

The string is L"中文2" (without the quotes). If you really need the
numbers, you can look them up as easily as I can. (The third character is
number U+0032.)

Aug 1 '07 #8

Jochen Kalmbach [MVP]

Hi Norman!

>Can you please provide a *full* working example?

You mean that I should show the assignment of the value of userName? I
don't know if I can or not, because you proceed to say this:

>And please do not use non-ASCII chars in the source-code,

Maybe you can write:
TCHAR szUserName[] = {0x1234, 0x2345, 0x789A, 0x0000};
?????

>And please do not use non-ASCII chars in the source-code, so that it can
be compiled on other systems with the same result.

Hahahaha.

Maybe you can write:
TCHAR szUserName[] = {0x1234, 0x2345, 0x789A, 0x0000};
?????

Hahahahaha....
So... please provide a small, full working example with ASCII chars in the
source code!

Greetings
Jochen

Aug 1 '07 #9

Cezary Noweta

Hello,

> FILE* pf;
> _tfopen_s(&pf, LOG_FILE_NAME, _T("a"));
> if (pf) _ftprintf_s(pf, szBuf);
> if (pf) fclose(pf);
> va_end(args);
>
> Do you also need a transcript of actions in Windows Explorer to open the log
> file in Notepad and show the contents which my previous messages
> transcribed?

It would be nice but not necessary ;)

> Do you think that maybe the CRT's _vsnwprintf_s could handle the language of
> its own version of Windows but the CRT's _ftprintf_s failed because it had
> harder work to do? I don't quite think so.

Yes - I think so. Wide printf foos stop output when they cannot convert from wide
char to mbcs (current locale CP or console CP). This occurs when writing to the
console, text file and so on. Open the log file in UTF16 mode (i.e. _T("ab") instead
of _T("a")), or use the following code:

======
FILE* pf;
_tfopen_s(&pf, LOG_FILE_NAME, _T("a"));
if ( pf ) {
    int outchars;
    outchars = _ftprintf_s(pf, szBuf);
    _ftprintf_s(pf, _T("STRLEN: %u; OUTCHARS: %i\n"),
                _tcslen(szBuf),
                outchars);
    fclose(pf);
}
va_end(args);
======

or try setting the locale to CP 932 with _tsetlocale(LC_CTYPE, _T(".932")) before you
call _ftprintf_s, and compare the results.

> If it called the invalid parameter handler then I think the rest of the code
> (the caller of DebugLog) would not proceed to get everything else working
> properly with other Windows APIs, I think the rest of the code would abort.

It called wctomb(), which converts to the current locale (at the beginning it is "C",
which means that all chars >= U+0100 are not converted). After it failed, fwprintf_s
failed too, and the foo returned the number of chars output so far. The rest of the code
runs fine.
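
The conversion step can be checked in isolation. A minimal sketch, assuming only the standard wctomb and setlocale; ConvertibleInCurrentLocale is a hypothetical name. It asks whether one wide character has a multibyte form in whatever LC_CTYPE locale is active, which is the question the wide stream functions face in text mode.

```cpp
#include <cassert>
#include <climits>
#include <clocale>
#include <cstdlib>

// Returns true if 'wc' can be converted to a multibyte sequence in the
// locale currently selected for LC_CTYPE.
bool ConvertibleInCurrentLocale(wchar_t wc)
{
    char mb[MB_LEN_MAX];
    std::wctomb(nullptr, 0);             // reset any shift state first
    return std::wctomb(mb, wc) != -1;    // -1 means "no representation"
}
```

In the "C" locale the digit 2 converts and U+4E2D does not. Where exactly the cutoff lies below that (U+0080 or U+0100) depends on the C library, so the sketch makes no claim about it.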

> The string is L"$BCfJ8(B2" (without the quotes). If you really need the
> numbers, you can look them up as easily as I can. (The third character is
> number U+0032.)

Oooo... "92 86 95 B6 32" - 14 chars of text. At first I thought that the first two
character codes were confidential and you could not disclose them explicitly ;)
Could you really not list the codes, even if that is the price of solving your problem?

-- best regards

Cezary Noweta

Aug 1 '07 #10

Cezary Noweta

Hello,

Hahahaha.

Hahahahaha....

Hey men, what are you smoking? For me, it would be nice to have this stuff now ;-P

-- best regards

Cezary Noweta

Aug 1 '07 #11

Norman Diamond

"Jochen Kalmbach [MVP]" <no********************@holzma.dewrote in message
news:uJ**************@TK2MSFTNGP03.phx.gbl...

>>Can you please provide a *full* working example?

You mean that I should show the assignment of the value of userName? I
don't know if I can or not, because you proceed to say this:

>>And please do not use non-ASCII chars in the source-code,

Maybe you can write:
TCHAR szUserName[] = {0x1234, 0x2345, 0x789A, 0x0000};
?????

中 = U+4E2D
文 = U+6587
2 = U+0032

>>And please do not use non-ASCII chars in the source-code, so that it can
be compiled on other systems with the same result.

Hahahaha.

Maybe you can write:
TCHAR szUserName[] = {0x1234, 0x2345, 0x789A, 0x0000};
?????

Hahahahaha....

Well, the user name isn't intended to be constant. The user name is
intended to be the actual user name of some actual user, and the DLL
receives it by decrypting information that was previously encrypted by some
other DLL that was running under control of the actual user.

So... please provide a small, full working example with ASCII chars in the
source code!

TCHAR szUserName[48] = {0x4E2D, 0x6587, 0x0032, 0x0000};

Not tested. I might have time to test it later today.

Aug 3 '07 #12

Norman Diamond

"Cezary Noweta" <ch***@noemail.noemailwrote in message
news:46***************@noemail.noemail...
[Norman Diamond:]
[Quotation of additional parts of program not originally quoted:]

>> FILE* pf;
>> _tfopen_s(&pf, LOG_FILE_NAME, _T("a"));
>> if (pf) _ftprintf_s(pf, szBuf);
>> if (pf) fclose(pf);
>> va_end(args);
>>
>> Do you think that maybe the CRT's _vsnwprintf_s could handle the language
>> of its own version of Windows but the CRT's _ftprintf_s failed because it
>> had harder work to do? I don't quite think so.
>
> Yes - I think so. Wide printf foos stop output when they cannot convert
> from wide char to mbcs (current locale CP or console CP). This occurs when
> writing to the console, text file and so on.

That would be enormously odd. This problem was reproduced in Chinese Vista
with no internationalization whatsoever. At the moment I don't recall what
the code page number is, but it is only one code page number, used in
China - Hong Kong, with no customization of the system locale or user
locale. Language packs can't even be installed on that one because it's
Vista Business not Ultimate. I did add the Japanese keyboard layout though
because the laptop has a Japanese keyboard built in, not a Chinese keyboard.

Nonetheless, if wide printf foos stop output because they are too stupid to
understand their own native default built-in code page after not being
customized at all, then I understand your suggestion that maybe the breakage
occurs in _ftprintf_s instead of _vsnwprintf_s. I might have time to
investigate this later today, maybe.

> Open the log file in UTF16 mode (i.e. _T("ab") instead of _T("a")), or use
> the following code:
>
> ======
> FILE* pf;
> _tfopen_s(&pf, LOG_FILE_NAME, _T("a"));
> if ( pf ) {
>     int outchars;
>     outchars = _ftprintf_s(pf, szBuf);
>     _ftprintf_s(pf, _T("STRLEN: %u; OUTCHARS: %i\n"),
>                 _tcslen(szBuf),
>                 outchars);
>     fclose(pf);
> }
> va_end(args);
> ======

That meta-debugging code looks like a good suggestion, and I hope to have
time to try it later today.

> or try to set locale (,,_tsetlocale(LC_CTYPE, _T(".932"))'') to CP 932
> before you are calling ftprintf and compare the results.

That would be expected to cause problems. In both environments where the
problem has been observed, the actual code page was a Chinese code page not
Japanese:

(1) Japanese Vista Ultimate with system locale and user locale and MUI
language all set to Chinese (Hong Kong) and rebooted several times;

(2) Chinese (Hong Kong) Vista Business with default system locale and user
locale, and no MUI.

>> If it called the invalid parameter handler then I think the rest of the
>> code (the caller of DebugLog) would not proceed to get everything else
>> working properly with other Windows APIs, I think the rest of the code
>> would abort.
>
> It called wctomb() which convert to the current locale (at the beginning
> it is "C" which means that all chars >= U+0100 are not converted).

Wait a minute. I understand the possibility that the CRT might have
initialized the locale to the "C" locale, and I should try to figure out if
that happened. But if it did, then the point where it breaks and stops
converting characters shouldn't be at U+0100, it should be at U+0080. And
it should happen no matter what the system locale and user locale are.
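
Finding out which locale the CRT is actually in takes one call. A minimal sketch, assuming the standard setlocale query form (a null second argument reads the setting without changing it); CurrentCtypeLocale is a hypothetical name.

```cpp
#include <cassert>
#include <clocale>
#include <string>

// Returns the name of the locale currently in effect for LC_CTYPE.
// Passing a null pointer makes setlocale a pure query.
std::string CurrentCtypeLocale()
{
    const char* name = std::setlocale(LC_CTYPE, nullptr);
    return name ? name : "";
}
```

The C standard says a program starts in the "C" locale until it calls setlocale itself, regardless of the system or user locale. Calling setlocale(LC_ALL, "") is the usual way to adopt the environment's settings.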

> After it failed fwprintf_s has failed too and the foo returned number
> chars output so far. The rest of the code runs fine.

>The string is L"$BCfJ8(B2" (without the quotes

WTF, Outlook Express and every other Microsoft tool involved in these
newsgroup postings, WTF.

I put the cursor after "quotes)." and before " If". I hit the Enter key to
put in a line break so I can type this next stuff. Outlook Express puts the
line break after "quotes" and before "). If". More incredible editing
capabilities from Microsoft.

OK, end of second digression, back to first digression.

In my previous posting, I didn't type a raw JIS string with escape sequences
for shift-in and shift-out, I typed the actual characters. The encoding
format going over the wire was in raw JIS, ISO-2022-JP. Reading my own
previous message in Outlook Express, the message survived the round trip,
with the characters 中 and 文 and 2. But when reading your message which
quotes my previous message, Outlook Express is showing raw JIS with escape
sequences and 7-bit byte values. Oh I see, it's because your message format
is Central European. I think Central European encoding can't handle these
Chinese characters. Japanese encoding can handle them because these are
among the characters that were copied from China to Japan during recent
millennia.

Hmm, I guess I should set this current message to use UTF-8 encoding...
Done.

OK, where were we.

>> ). If you really need the numbers, you can look them up as easily as I
>> can. (The third character is number U+0032.)
>
> Oooo... ,,92 86 95 B6 32'' - 14 chars of text.

No, you're getting garbage because you're missing fonts and you couldn't
even display the original characters correctly. I looked them up this
morning so here they are:

中 = U+4E2D
文 = U+6587
2 = U+0032

> At the beginning I thought that the first two char codes are confidential
> and you can not disclose it
> explicitly ;)
> Really could not you enumerate codes even at the price of a solution of
> your problem?

Well, a high-priority task came in two days ago and yes it was higher
priority than meta-debugging of debugging routines that look like they're
depending on broken library routines. (The actual working code of this DLL
had already been successfully debugged.) But this morning I had time to
look up the codes.

Aug 3 '07 #13

Marc

Here is my test program:

#include <tchar.h>

#include <cstdio>
#include <cstdarg>

void DebugLog(TCHAR* szForm, ...)
{
    va_list args;
    va_start(args, szForm);
    static TCHAR szBuf[2048];
    _vsntprintf_s(szBuf, sizeof(szBuf) / sizeof (TCHAR), szForm, args);
    _vstprintf_s(szBuf, szForm, args);
    vwprintf(szForm, args);
    va_end(args);
}

int __cdecl _tmain(int argc, _TCHAR* argv[])
{
    _TCHAR userName[48] = _T("\u6211\u662f\u4e2d\u570b\u4eba");
    DebugLog(_T("Username=%s\n"), userName);
    return 0;
}

Tested on Windows XP (SysLocale 0x411), VS 2005 Express (SP0), and it
works like a charm (minus the question marks on the console, but this was
expected). Cannot test on WiVi.
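
A side note on the test program: it hands the same va_list to three formatting calls in a row. In standard C the list is indeterminate after a v* function has consumed it, so a portable version gives each pass its own copy. A minimal sketch, assuming a compiler that provides va_copy (C99/C++11; VS 2005 may not have it). FormatTwice is a hypothetical name.

```cpp
#include <cassert>
#include <cstdarg>
#include <cstddef>
#include <cwchar>

// Formats the same arguments into two buffers. Each formatting pass
// consumes its own va_list, so neither sees a used-up argument list.
void FormatTwice(wchar_t* a, wchar_t* b, std::size_t count,
                 const wchar_t* fmt, ...)
{
    assert(a != nullptr && b != nullptr && fmt != nullptr);
    va_list args;
    va_start(args, fmt);
    va_list again;
    va_copy(again, args);               // independent copy for pass two
    std::vswprintf(a, count, fmt, args);
    std::vswprintf(b, count, fmt, again);
    va_end(again);
    va_end(args);
}
```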

Aug 3 '07 #14

Norman Diamond

Thank you for suggesting a test program, but it doesn't look like you ran a
useful test.

To repeat for the nth time, the environments where this failed have a
Chinese system locale and user locale, not Japanese. Only the development
environment was Japanese. Your test used the Japanese system locale and
unstated user locale.

You said you didn't try Vista, so I think we agree that you didn't observe
if you have a repro on Vista. But later today I will try your program on
Vista. (I'll have to see what your characters are though, since we might
perhaps expect failure if they're non-Chinese characters such as kana or
Greek or Cyrillic or accented Italian or whatever.)

Aug 3 '07 #15

Norman Diamond

OK, I ran approximately this test. The log file contains a lot of lines.
After every line of ordinary debugging information, there is a line with
STRLEN and OUTCHARS exactly as defined by Cezary Noweta.

After every line that doesn't contain a username, the values of STRLEN and
OUTCHARS are equal.

After every line that does contain a username, the value of STRLEN is what
it should be if the value of szBuf includes the entire formatted string,
i.e. some constant text before the username, the username itself, and some
constant text after the username. However, the value of OUTCHARS is -1.
The value of OUTCHARS isn't even the number of characters that _ftprintf_s
wrote before aborting, the value is -1.

So _vsnwprintf_s isn't broken, but at the moment _ftprintf_s seems to be
broken. _ftprintf_s might not be broken though, if the thing is executing
in the "C" locale as someone guessed. I'll have to figure that out next.
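
The two steps can be separated in a few lines. This is only a sketch under assumptions: it uses the portable swprintf and wcstombs rather than the _t* names, and FormatThenConvert and StepResult are hypothetical. The formatting step works purely on wide characters; only the conversion step depends on the locale.

```cpp
#include <cassert>
#include <clocale>
#include <cstddef>
#include <cstdlib>
#include <cwchar>

struct StepResult {
    int formatted;          // wide chars produced by formatting, or < 0
    std::size_t converted;  // bytes produced by conversion, or (size_t)-1
};

// Step 1 formats into a wide buffer; step 2 converts that buffer to the
// multibyte encoding of the current LC_CTYPE locale.
StepResult FormatThenConvert(const wchar_t* name)
{
    assert(name != nullptr);
    wchar_t wide[128];
    char narrow[512];
    StepResult r;
    r.formatted = std::swprintf(wide, 128, L"Username=\"%ls\"\n", name);
    r.converted = (r.formatted < 0)
                      ? static_cast<std::size_t>(-1)
                      : std::wcstombs(narrow, wide, sizeof narrow);
    return r;
}
```

With the locale set to "C" and the name U+4E2D U+6587 U+0032, formatting reports the full length while conversion reports failure, matching the STRLEN versus OUTCHARS pattern in the log.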

I had to use the unsafe code
outchars = _ftprintf_s(pf, szBuf);
as suggested by Cezary Noweta instead of the safer code
_fputts(szBuf, pf);
as recommended by Kalle Olavi Niemitalo because when _fputts succeeds it
returns a nonzero value which doesn't have to match the number of
characters.
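
There is a middle way that keeps the character count without using the buffer as a format string: pass a fixed format and give the buffer as its argument. A minimal sketch with the portable swprintf, where a wide-string argument is spelled %ls; WriteLiteral is a hypothetical name, and the same fixed-format idea applies to the stream functions.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>

// Copies 'msg' through a fixed format, so '%' characters inside the
// message are treated as data and never as conversion specifications.
// Returns the number of wide characters written, or < 0 on failure.
int WriteLiteral(wchar_t* out, std::size_t count, const wchar_t* msg)
{
    assert(out != nullptr && msg != nullptr);
    return std::swprintf(out, count, L"%ls", msg);
}
```

A username or log line that happens to contain "%s" then comes through unchanged instead of making the formatter read a nonexistent argument.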

After the above experiment, I tried another one. Using Notepad, I created
the log file in Unicode with no text. But _tfopen_s with _T("a") did not
inspect the existing file to decide whether to keep Unicode as Unicode, it
barged ahead and converted Unicode to ANSI and wrote the ANSI. Then opening
the result in Notepad, since the BOM was still there, Notepad faithfully
tried to display garbage ^_^

Now I have to add some calls to find out what locale the thing is executing
in at the time, is it the Chinese Hong Kong locale (matching the system
locale and user locale) or is it the "C" locale.

Aug 3 '07 #16

Norman Diamond

It does get worse.

I deleted the file and then ran the program with this code:
_tfopen_s(&pf, LOG_FILE_NAME, _T("a, ccs=UNICODE"));

http://msdn2.microsoft.com/en-us/lib...e9(VS.80).aspx
* The flag is only used when no BOM is present or if the file is a new
* file.

That is a lie. _tfopen_s created a new file and it created the thing with
ANSI encoding not Unicode.

I deleted the file again, created a file in Notepad containing only an empty
line (CR-LF pair), saved it in Unicode, and then again ran the program with
this code:
_tfopen_s(&pf, LOG_FILE_NAME, _T("a, ccs=UNICODE"));

http://msdn2.microsoft.com/en-us/lib...e9(VS.80).aspx
* If mode is "a, ccs=<encoding>", fopen_s will first try to open the file
* with both read and write access. If it succeeds, it will read the BOM to
* determine the encoding for this file;

This time _tfopen_s seems to have performed correctly. Now let's continue.

http://msdn2.microsoft.com/en-us/lib...e9(VS.80).aspx
* When a Unicode stream-I/O function operates in text mode (the default),
* the source or destination stream is assumed to be a sequence of multibyte
* characters. Therefore, the Unicode stream-input functions convert
* multibyte characters to wide characters (as if by a call to the mbtowc
* function). For the same reason, the Unicode stream-output functions
* convert wide characters to multibyte characters (as if by a call to the
* wctomb function).

In other words, it doesn't matter if _tfopen_s performed correctly because
_ftprintf_s is still going to screw it up. Let's look for confirmation of
this screw-up.

http://msdn2.microsoft.com/en-us/lib...8e(VS.80).aspx
* For the same reason, the Unicode stream-output functions convert wide
* characters to multibyte characters (as if by a call to the wctomb
* function).

Yup, no provision at all for keeping Unicode as Unicode.

However, both of those are half-lies. Half the time, _ftprintf_s violated
MSDN and it kept Unicode as Unicode in the spirit (but not the letter) of
http://msdn2.microsoft.com/en-us/lib...e9(VS.80).aspx.
The other half of the time, _ftprintf_s screwed up worse.

Notepad opened the file in Unicode. The display alternates, a bunch of
readable lines, a few lines of garbage, a bunch of readable lines, a few
lines of garbage, etc.

It seems that ccs=UNICODE is unusable. It changes the result from being
mostly readable (with a little bit of lossage) to being half readable (with
half garbage).
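
One alternative, along the lines of the earlier suggestion to open the log in binary mode, is to bypass the CRT's text-mode conversion and write the UTF-16 bytes directly. This is only a sketch under assumptions, not a statement about how ccs=UNICODE behaves: AppendUtf16LE is a hypothetical name, and it uses char16_t (C++11) so the byte layout does not depend on the size of wchar_t.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Appends UTF-16 code units to 'path' as little-endian bytes in binary
// mode. A byte order mark (FF FE) is written only if the file is empty.
bool AppendUtf16LE(const char* path, const std::u16string& text)
{
    assert(path != nullptr);
    std::FILE* pf = std::fopen(path, "ab");
    if (!pf)
        return false;
    std::fseek(pf, 0, SEEK_END);             // make the position definite
    const bool empty = (std::ftell(pf) == 0);
    std::string bytes;
    if (empty)
        bytes += "\xFF\xFE";                 // UTF-16LE BOM
    for (char16_t u : text) {
        bytes += static_cast<char>(u & 0xFF);         // low byte first
        bytes += static_cast<char>((u >> 8) & 0xFF);  // then high byte
    }
    const bool ok =
        std::fwrite(bytes.data(), 1, bytes.size(), pf) == bytes.size();
    return (std::fclose(pf) == 0) && ok;
}
```

Because no locale conversion is involved, every line lands in the file in the same encoding. Line ends must then be written explicitly as CR-LF code units if Notepad is the viewer.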
"Norman Diamond" <nd******@community.nospamwrote in message
news:us*************@TK2MSFTNGP06.phx.gbl...

OK, I ran approximately this test. The log file contains a lot of lines.
After every line of ordinary debugging information, there is a line with
STRLEN and OUTCHARS exactly as defined by Cezary Noweta.

After every line that doesn't contain a username, the values of STRLEN and
OUTCHARS are equal.

After every line that does contain a username, the value of STRLEN is what
it should be if the value of szBuf includes the entire formatted string,
i.e. some constant text before the username, the username itself, and some
constant text after the username. However, the value of OUTCHARS is -1.
The value of OUTCHARS isn't even the number of characters that _ftprintf_s
wrote before aborting, the value is -1.

So _vsnwprintf_s isn't broken, but at the moment _ftprintf_s seems to be
broken. _ftprintf_s might not be broken though, if the thing is executing
in the "C" locale as someone guessed. I'll have to figure that out next.

I had to use the unsafe code
outchars = _ftprintf_s(pf, szBuf);
as suggested by Cezary Noweta instead of the safer code
_fputts(szBuf, pf);
as recommended by Kalle Olavi Niemitalo because when _fputts succeeds it
returns a nonzero value which doesn't have to match the number of
characters.

After the above experiment, I tried another one. Using Notepad, I created
the log file in Unicode with no text. But _tfopen_s with _T("a") did not
inspect the existing file to decide whether to keep Unicode as Unicode, it
barged ahead and converted Unicode to ANSI and wrote the ANSI. Then
opening
the result in Notepad, since the BOM was still there, Notepad faithfully
tried to display garbage ^_^

Now I have to add some calls to find out what locale the thing is
executing
in at the time, is it the Chinese Hong Kong locale (matching the system
locale and user locale) or is it the "C" locale.
"Cezary Noweta" <ch***@noemail.noemailwrote in message
news:46***************@noemail.noemail...
>Hello,

>> FILE* pf;
_tfopen_s(&pf, LOG_FILE_NAME, _T("a"));
if (pf) _ftprintf_s(pf, szBuf);
if (pf) fclose(pf);
va_end(args);

Do you also need a transcript of actions in Windows Explorer to open the
log
file in Notepad and show the contents which my previous messages
transcribed?

It would be nice but not necessary ;)

>>Do you think that maybe the CRT's _vsnwprintf_s could handle the
language
of
its own version of Windows but the CRT's _ftprintf_s failed because it
had
harder work to do? I don't quite think so.

Yes - I think so. Wide printf foos stop output when they cannot convert
from wide
char to mbcs (current locale CP or console CP). This occurs when writing
to the
console, text file and so on. Open the log file in UTF16 mode (i.e.
_T("ab") instead
of _T("a")), or use the following code:

======
FILE* pf;
_tfopen_s(&pf, LOG_FILE_NAME, _T("a"));
if ( pf ) {
int outchars;
outchars = _ftprintf_s(pf, szBuf);
_ftprintf_s(pf, _T("STRLEN: %u; OUTCHARS: %i\n"),
_tcslen(szBuf),
outchars);
fclose(pf);
}
va_end(args);
======

or try to set locale (,,_tsetlocale(LC_CTYPE, _T(".932"))'') to CP 932
before you are
calling ftprintf and compare the results.

>>If it called the invalid parameter handler then I think the rest of the
code
(the caller of DebugLog) would not proceed to get everything else
working
properly with other Windows APIs, I think the rest of the code would
abort.

It called wctomb() which convert to the current locale (at the beginning
it is "C"
which means that all chars >= U+0100 are not converted). After it failed
fwprintf_s
has failed too and the foo returned number chars output so far. The rest
of the code
runs fine.

>>The string is L"$BCfJ8(B2" (without the quotes). If you really need
the
numbers, you can look them up as easily as I can. (The third character
is
number U+0032.)

Oooo... ,,92 86 95 B6 32'' - 14 chars of text. At the beginning I thought
that the
first two char codes are confidential and you can not disclose it
explicitly ;)
Really could not you enumerate codes even at the price of a solution of
your problem?

-- best regards

Cezary Noweta

Aug 3 '07 #17

Norman Diamond

It gets even more worse.

I added this call:
_ftprintf_s(pf, _T("%s\n"), _tsetlocale(LC_CTYPE, _T("")));
The output was:
Chinese_Hong Kong S.A.R..950

So there is absolutely no excuse for _ftprintf_s to screw up on Chinese
characters. The DLL is not running in the C locale, it's running in the
Chinese Hong Kong locale, code page 950, exactly as it should be.

Here's more MSDN stuff too.
http://msdn2.microsoft.com/en-us/lib...1d(VS.80).aspx
* LC_CTYPE
* The character-handling functions (except isdigit, isxdigit, mbstowcs, and
* mbtowc, which are unaffected).

So mbtowc is one of the exceptions, it wouldn't have been affected even if
the C locale were in use, and presumably it would always use code page 950
and screw up because it's miscoded -- however, wctomb isn't one of the
exceptions, so it would have been affected if the C locale were in use, and
it would screw up differently from the way it actually screws up.

Anyway, thank you whoever it was who said that _vsnwprintf_s isn't broken
and _ftprintf_s is. Sorry I found it hard to believe you. You're absolutely
right. _ftprintf_s is broken.

Aug 3 '07 #18

Norman Diamond

OMFG.

When I added this call:
_ftprintf_s(pf, _T("%s\n"), _tsetlocale(LC_CTYPE, _T("")));
it didn't query the DLL's current locale the way MSDN says it will. It SET
the current locale, and returned it:
Chinese_Hong Kong S.A.R..950

And, the result of this setting activity did affect the way wctomb operates.
And the result of that setting activity did affect the way _ftprintf_s
operates.

The result was that _ftprintf_s wrote the user name correctly.

In ANSI.

中文2

The good news is that there's a workaround for the breakage in _ftprintf_s.
The bad news is that I haven't finished learning how bad Windows can be.

Aug 3 '07 #19

Norman Diamond

Oh, I screwed up this last test. MSDN says my call to _tsetlocale() does
set the locale instead of querying. A null pointer does a query but a null
string sets it to the default from the OS.
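
For anyone who hits the same confusion, here is a minimal sketch of the two calls in portable C rather than the TCHAR forms used in this thread. The helper names are made up for illustration; only setlocale itself comes from the standard library.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Query only: a null pointer leaves the locale unchanged and
   returns the name of the current LC_CTYPE locale. */
const char *query_ctype(void)
{
    return setlocale(LC_CTYPE, NULL);
}

/* Set: an empty string switches LC_CTYPE to the default taken from
   the environment. Returns the new locale name, or NULL if the
   request could not be honoured. */
const char *adopt_system_ctype(void)
{
    return setlocale(LC_CTYPE, "");
}

/* Hypothetical helper: 1 if the program is still in the "C" locale. */
int is_c_locale(void)
{
    const char *name = query_ctype();
    assert(name != NULL);
    return strcmp(name, "C") == 0;
}
```

Before any setlocale call, is_c_locale() is expected to return 1. What adopt_system_ctype() returns depends entirely on the machine, so logging its result (as I did by accident) changes the thing being measured.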

OK, good news anyway, when setting to the default from the OS, it worked.

So I still don't know if the DLL actually started up in the C locale
instead, but I'm not in the mood to test it again.

Aug 3 '07 #20

Cezary Noweta

Hello,

>> Oooo... ,,92 86 95 B6 32'' - 14 chars of text.
>
> No, you're getting garbage because you're missing fonts and you couldn't
> even display the original characters correctly. I looked them up this
> morning so here they are:

No - this is good. This code sequence is Japanese ANSI (932). You have written that
your dev environment is Japanese and you coded your posts in ISO-2022-JP, so I have
sent Japanese ANSI codes and not CHS ones.

> This problem was reproduced in Chinese Vista
> with no internationalization whatsoever.
>
> Nonetheless, if wide printf foos stop output because they are too stupid to
> understand their own native default built-in code page after not being
> customized at all, then I understand your suggestion that maybe the breakage
> occurs in _ftprintf_s instead of _vsnwprintf_s.

It does not matter what is your system ANSI CP. Also wide printf is not so stupid.
ISO states that:

"At program startup, the equivalent of
setlocale(LC_ALL, "C");
is executed."

That's all. When you want to play with mixed international language streams, and
especially with Unicode or double-byte character sets, then you must use setlocale().

> I deleted the file and then ran the program with this code:
> _tfopen_s(&pf, LOG_FILE_NAME, _T("a, ccs=UNICODE"));
>
> http://msdn2.microsoft.com/en-us/lib...e9(VS.80).aspx
> * The flag is only used when no BOM is present or if the file is a new
> * file.
>
> That is a lie. _tfopen_s created a new file and it created the thing with
> ANSI encoding not Unicode.
>
> I deleted the file again, created a file in Notepad containing only an empty
> line (CR-LF pair), saved it in Unicode, and then again ran the program with
> this code:
> _tfopen_s(&pf, LOG_FILE_NAME, _T("a, ccs=UNICODE"));

Everything is OK - look at the table below. When you are creating a new file you
should use "ccs=UTF-16LE" and not "ccs=UNICODE" as "UNICODE" creates ANSI files.

> I added this call:
> _ftprintf_s(pf, _T("%s\n"), _tsetlocale(LC_CTYPE, _T("")));
> The output was:
> Chinese_Hong Kong S.A.R..950

So now you know that you have set locale to your system codepage. setlocale(LC_CTYPE,
NULL) returns your current locale.

> _ftprintf_s is broken.
>
> The good news is that there's a workaround for the breakage in _ftprintf_s.

fwprintf is not broken. This is not a workaround. This is the normal way to achieve the
effect you wanted, even according to the ISO standard and not to MSDN or MS at all.

ISO states:

"The wide character output functions convert wide characters to multibyte characters
and write them to the stream as if they were written by successive calls to the
fputwc function. Each conversion occurs as if by a call to the wcrtomb function, with
the conversion state described by the stream’s own mbstate_t object. The byte output
functions write characters to the stream as if by successive calls to the fputc
function."

and

"An encoding error occurs if the character sequence presented to the underlying
mbrtowc function does not form a valid (generalized) multibyte character, or if the
code value passed to the underlying wcrtomb does not correspond to a valid
(generalized) multibyte character. The wide character input/output functions and the
byte input/output functions store the value of the macro EILSEQ in errno if and only
if an encoding error occurs."

After your _ftprintf() has been executed within "C" locale, errno contains EILSEQ.
After you have called setlocale(LC_CTYPE, "") your locale is set to your system
codepage and _ftprintf() works OK. Everything is OK.
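
The conversion failure described above can be reproduced without any file I/O. A minimal sketch, assuming the program has not called setlocale() yet; try_convert is a hypothetical helper that only wraps the standard wcrtomb():

```c
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <string.h>
#include <wchar.h>

/* Try to convert one wide character under the current LC_CTYPE locale.
   Returns the number of bytes produced, or -1 with *err set to the
   errno value reported by wcrtomb(). */
int try_convert(wchar_t wc, int *err)
{
    char out[MB_LEN_MAX];
    mbstate_t state;
    size_t n;

    assert(err != NULL);
    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    errno = 0;
    n = wcrtomb(out, wc, &state);
    if (n == (size_t)-1) {
        *err = errno;                  /* EILSEQ for an encoding error */
        return -1;
    }
    *err = 0;
    return (int)n;
}
```

In the "C" locale, try_convert(L'A', &err) gives 1, while try_convert((wchar_t)0x4E2D, &err) gives -1 with err equal to EILSEQ. After setlocale(LC_CTYPE, "") on a system whose code page contains that character, the second call should succeed instead.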

> The bad news is that I haven't finished learning how bad Windows can be.

Yea, but not this time ;)

-- best regards

Cezary Noweta

Aug 3 '07 #21

Kalle Olavi Niemitalo

"Norman Diamond" <nd******@community.nospam> writes:

> I had to use the unsafe code
> outchars = _ftprintf_s(pf, szBuf);

Surely you could fix the bug with _ftprintf_s(pf, _T("%s"), szBuf).

(In standard C, L"%s" in a format string takes a char * argument,
just like "%s". In Microsoft C, it instead takes a wchar_t *
argument. I don't know if there is a way for Microsoft to
correct this violation without breaking countless programs.)
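
A small sketch of the same idea in portable C, with the caveat that it uses "%ls" instead of "%s". As far as I know, wide printf functions in both standard C and the Microsoft CRT read "%ls" as a wchar_t * argument, which sidesteps the difference described above. log_text is a name invented for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Writes text verbatim to pf. The text is passed as an argument, never
   as the format string, so a '%' inside it is not interpreted.
   Returns the number of wide characters written, or a negative value
   on error. */
int log_text(FILE *pf, const wchar_t *text)
{
    assert(pf != NULL && text != NULL);
    return fwprintf(pf, L"%ls", text);
}
```

With this, log_text(pf, L"100% done") writes the nine characters as they are and still returns a count that can be compared with the string length.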

Aug 4 '07 #22

Norman Diamond

"Cezary Noweta" <ch***@noemail.noemail> wrote in message
news:46***************@noemail.noemail...

>>> Oooo... ,,92 86 95 B6 32'' - 14 chars of text.
>>
>> No, you're getting garbage because you're missing fonts and you couldn't
>> even display the original characters correctly. I looked them up this
>> morning so here they are:
>
> No - this is good. This code sequence is Japanese ANSI (932). You have
> written that your dev environment is Japanese and you coded your posts in
> ISO-2022-JP, so I have sent Japanese ANSI codes and not CHS ones.

The default coding for new posts is ISO-2022-JP. The default for followup
posts is to use the encoding of the post that is being quoted. But you set
the encoding of your posts to Central European.

The program was running in a Chinese environment where the system and user
code page was 950.

14 bytes of text is not 14 chars of text.

Internal to the program, the coding was Unicode not ANSI. Ordinarily it
wouldn't convert to ANSI until _ftprintf_s writes to a file. From this
discussion I learned that even _ftprintf_s won't convert it to ANSI unless
the program does a call to use the system's code page instead of C locale.

> It does not matter what is your system ANSI CP. Also wide printf is not so
> stupid. ISO states that:
>
> "At program startup, the equivalent of
> setlocale(LC_ALL, "C");
> is executed."
>
> That's all.

I understand that now. Thank you.

> When you want to play with mixed international language streams, and
> especially with Unicode or double-byte character sets, then you must use
> setlocale().

You still don't understand this part of it though. There was no mixing of
international language streams. The execution environment was 100% Chinese.
And the need to use setlocale() applies to single-byte character sets as
much as it does to double-byte character sets. _ftprintf_s should fail on
several characters in your language when using your code page, and
_ftprintf_s should fail on the English character £ when using Western
Europe's code page, just as quickly as it failed on Chinese characters when
using a Chinese code page.

Everyone has to call setlocale() and tell the CRT to use the system's code
page. Hopefully we both understand this now.

>> I deleted the file and then ran the program with this code:
>> _tfopen_s(&pf, LOG_FILE_NAME, _T("a, ccs=UNICODE"));
>>
>> http://msdn2.microsoft.com/en-us/lib...e9(VS.80).aspx
>> * The flag is only used when no BOM is present or if the file is a new
>> * file.
>>
>> That is a lie. _tfopen_s created a new file and it created the thing
>> with ANSI encoding not Unicode.
>>
>> I deleted the file again, created a file in Notepad containing only an
>> empty line (CR-LF pair), saved it in Unicode, and then again ran the
>> program with this code:
>> _tfopen_s(&pf, LOG_FILE_NAME, _T("a, ccs=UNICODE"));
>
> Everything is OK - look at the table below. When you are creating a new
> file you should use "ccs=UTF-16LE" and not "ccs=UNICODE" as "UNICODE"
> creates ANSI files.

Oh, you are right. The flag UNICODE means ANSI. I wonder why the flag ANSI
doesn't mean UNICODE. Who allowed some programmer to develop a CRT without
a flag named ANSI and meaning UNICODE? At least it's a relief to see that I
wasn't the least competent programmer in this discussion ^_^

Anyway now I understand to put this at the beginning of every program:
_tsetlocale(LC_CTYPE, _T(""));
Now I wonder how to find out if it's safe to call this from DllMain.

Aug 6 '07 #23

Norman Diamond

"Kalle Olavi Niemitalo" <ko*@iki.fi> wrote in message
news:87************@Astalo.kon.iki.fi...

> "Norman Diamond" <nd******@community.nospam> writes:
>
>> I had to use the unsafe code
>> outchars = _ftprintf_s(pf, szBuf);
>
> Surely you could fix the bug with _ftprintf_s(pf, _T("%s"), szBuf).

No, the reason this entire discussion started was because _T("%s") fails.
Finally we understand the reason why _T("%s") fails (it was my fault for not
adding a call to setlocale). Nonetheless changing _T("%s") to _T("%s")
wouldn't fix it.

Aug 6 '07 #24

Norman Diamond

"Norman Diamond" <nd******@community.nospam> wrote in message
news:u8**************@TK2MSFTNGP04.phx.gbl...

> 14 bytes of text is not 14 chars of text.

Ooops, internally I read "char" as "character" instead of "char". I get a
C- today.

In C, 14 bytes of text is 14 chars of text. It isn't 14 characters and you
didn't say that it is. Sorry.
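
To make the distinction concrete: the five bytes Cezary quoted, 92 86 95 B6 32, are five chars but only three characters in code page 932. Here is a rough sketch that counts characters by the CP932 lead-byte rule (lead bytes 0x81-0x9F and 0xE0-0xFC). The function names are invented for this example and it does not validate trail bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Lead-byte ranges of code page 932 (Shift-JIS). */
static int is_cp932_lead(unsigned char b)
{
    return (b >= 0x81 && b <= 0x9F) || (b >= 0xE0 && b <= 0xFC);
}

/* Counts characters in a buffer of CP932 bytes. A lead byte together
   with the byte that follows it counts as one character; a lead byte
   at the very end of the buffer is counted on its own. */
size_t cp932_char_count(const unsigned char *bytes, size_t len)
{
    size_t i = 0, count = 0;

    assert(bytes != NULL || len == 0);
    while (i < len) {
        i += (is_cp932_lead(bytes[i]) && i + 1 < len) ? 2 : 1;
        ++count;
    }
    return count;
}
```

Feeding it the array {0x92, 0x86, 0x95, 0xB6, 0x32} gives 3, while sizeof on the same array gives 5.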

Aug 6 '07 #25

Kalle Olavi Niemitalo

"Norman Diamond" <nd******@community.nospam> writes:

> Anyway now I understand to put this at the beginning of every program:
> _tsetlocale(LC_CTYPE, _T(""));
> Now I wonder how to find out if it's safe to call this from DllMain.

setlocale() and _wsetlocale() are not exported from Kernel32.dll,
so you should assume that they are not safe to call from DllMain.
If you link the C runtime statically into your DLL and audit its
source code, then it may be safe, until the next library upgrade.

If you link to the C runtime DLL, then setlocale() can also
affect other modules of the program, thereby increasing
dependencies between them. With _create_locale() and
_ftprintf_s_l(), you could better isolate DLLs from each other.

Aug 6 '07 #26
