Interesting warnings from latest MS compiler

Noah Roberts

The latest MS compiler "depricates" certain functions, including
stricmp. The function stricmp is not part of the C++ standard; it is
part of POSIX. My question: is it really necessary for MS to
move from that name to _stricmp in order to be compliant with the C++
standard?

Jun 14 '06 #1


Phlip

Noah Roberts wrote:

The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it is
part of POSIX. The question I have, is it really necissary for MS to
move from that name to _stricmp in order to be compliant with the C++
standard?

That's not legal deprecation. It's because MS programmers are not known for
their ability to refrain from permitting unchecked buffer overruns. So MS
now marks all the unchecked buffer functions as deprecated.

As an example of the problem, I have a lite web server on my home computer.
Every once in a while, something hits it with a URL containing a couple
hundred pad characters, 00000000000 or something, and then a little bit of
code.

Malware, somewhere, is probing for a known bug in a known server. (Not _my_
server, thank you!) So the main cause of all these security issues is
unchecked buffer handling in low-level code. So use fgets() instead of
gets(), and use snprintf() instead of sprintf().

And, heck, use std::string for gosh's sakes!!!

--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!

Jun 14 '06 #2
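
A minimal sketch of the bounded-I/O advice in the post above. The helper names `read_line` and `format_greeting` are hypothetical and do not come from the thread; the only calls taken from it are `fgets()` and `snprintf()`, and the sketch assumes a compiler that provides `std::snprintf` (C++11 or later).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Hypothetical helper: read one line with fgets(), which never writes more
// than `size` bytes into `buf` (gets() has no such bound).
// Returns false on end-of-file or read error.
bool read_line(std::FILE* in, char* buf, std::size_t size)
{
    assert(in != nullptr && buf != nullptr && size > 0);
    if (std::fgets(buf, static_cast<int>(size), in) == nullptr)
        return false;
    buf[std::strcspn(buf, "\n")] = '\0';  // drop the trailing newline, if any
    return true;
}

// Hypothetical helper: format into a fixed buffer with snprintf(), which
// truncates instead of overrunning. snprintf() returns the length the full
// text would have needed, so the caller can detect truncation.
bool format_greeting(char* buf, std::size_t size, const std::string& name)
{
    assert(buf != nullptr && size > 0);
    const int needed = std::snprintf(buf, size, "Hello, %s!", name.c_str());
    return needed >= 0 && static_cast<std::size_t>(needed) < size;
}
```

With an 8-byte buffer, `format_greeting` stores the first seven characters plus the terminator and returns false, so the caller learns about the truncation instead of corrupting memory.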

red floyd

Noah Roberts wrote:

The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it is
part of POSIX. The question I have, is it really necissary for MS to
move from that name to _stricmp in order to be compliant with the C++
standard?

aren't all identifiers starting with "str" reserved by C89 (and
therefore C++)?

Jun 14 '06 #3

Markus Schoder

Noah Roberts wrote:

The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it is
part of POSIX. The question I have, is it really necissary for MS to
move from that name to _stricmp in order to be compliant with the C++
standard?

stricmp is not POSIX. POSIX has strcasecmp though.

Jun 14 '06 #4

Phlip

Noah Roberts wrote:

...is it really necissary for MS to
move from that name to _stricmp in order to be compliant with the C++
standard?

That's not why they did it.

I thought the Standard(s) reserved all str.* functions for the
implementation, the same way as _[A-Z] are all reserved. So some str are
Standard Library, and some are whatever the implementation sez they are.

--
Phlip
http://c2.com/cgi/wiki?ZeekLand <-- NOT a blog!!!

Jun 14 '06 #5

Default User

Default User wrote:

Noah Roberts wrote:
The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it
is part of POSIX. The question I have, is it really necissary for
MS to move from that name to _stricmp in order to be compliant with
the C++ standard?

The str* functions are reserved in the C standard for future library
directions. I don't know if that transfers to C++ that way.

Err, checking the standard, what I said is true for values of * == "a
lower case letter".

Brian

Jun 14 '06 #6

Default User

Noah Roberts wrote:

The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it is
part of POSIX. The question I have, is it really necissary for MS to
move from that name to _stricmp in order to be compliant with the C++
standard?

The str* functions are reserved in the C standard for future library
directions. I don't know if that transfers to C++ that way.

Brian

Jun 14 '06 #7

Noah Roberts

Phlip wrote:

And, heck, use std::string for gosh's sakes!!!

Interestingly, several of the operations in the standard library,
including some in basic_string, are "depricated" ;)

Potentially unsafe method    Safer equivalent
-------------------------    --------------------------
basic_string::copy           basic_string::_Copy_s
basic_istream::read          basic_istream::_Read_s
basic_istream::readsome      basic_istream::_Readsome_s
basic_streambuf::sgetn       basic_streambuf::_Sgetn_s
basic_streambuf::xsgetn      basic_streambuf::_Xsgetn_s
char_traits::copy            char_traits::_Copy_s
char_traits::move            char_traits::_Move_s
ctype::narrow                ctype::_Narrow_s
ctype::do_narrow             ctype::_Do_narrow_s
ctype::widen                 ctype::_Widen_s
ctype::do_widen              ctype::_Do_widen_s

Jun 15 '06 #8

Noah Roberts

Phlip wrote:

Noah Roberts wrote:
The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it is
part of POSIX. The question I have, is it really necissary for MS to
move from that name to _stricmp in order to be compliant with the C++
standard?
And, heck, use std::string for gosh's sakes!!!

It should be mentioned that there is no stricmp equiv in std::string.
You can create a string that does do that with basic_string and an
overridden char_traits, but this introduces other difficulties.

Jun 15 '06 #9
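
A sketch of the basic_string-plus-traits approach mentioned in the post above, for single-byte characters only. The names `ci_char_traits` and `ci_string` are illustrative choices, not something the thread defines, and the folding relies on whatever C locale is current.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>

// Traits class that compares characters after lower-casing them.
struct ci_char_traits : std::char_traits<char>
{
    static char fold(char c)
    {
        assert(c != '\0' || std::tolower(0) == 0);  // NUL must stay NUL
        // Cast first: passing a negative char to tolower() is undefined.
        return static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    static bool eq(char a, char b) { return fold(a) == fold(b); }
    static bool lt(char a, char b)
    {
        return static_cast<unsigned char>(fold(a)) <
               static_cast<unsigned char>(fold(b));
    }
    static int compare(const char* a, const char* b, std::size_t n)
    {
        for (std::size_t i = 0; i < n; ++i) {
            if (lt(a[i], b[i])) return -1;
            if (lt(b[i], a[i])) return 1;
        }
        return 0;
    }
    static const char* find(const char* s, std::size_t n, char c)
    {
        for (std::size_t i = 0; i < n; ++i)
            if (eq(s[i], c)) return s + i;
        return nullptr;
    }
};

using ci_string = std::basic_string<char, ci_char_traits>;
```

The "other difficulties" show up quickly: `ci_string` is a different type from `std::string`, so the two do not convert to each other implicitly, and `std::cout << some_ci_string` does not compile because the stream's traits type differs.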

Victor Bazarov

Noah Roberts wrote:

Phlip wrote:
Noah Roberts wrote:
The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it
is part of POSIX. The question I have, is it really necissary for
MS to move from that name to _stricmp in order to be compliant with
the C++ standard?

And, heck, use std::string for gosh's sakes!!!

It should be mentioned that there is no stricmp equiv in std::string.

It should probably also be mentioned that there is no stricmp in the C
Standard library either. Had there been one, std::string *would* have
its equivalent, most likely.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jun 15 '06 #10

Markus Schoder

Victor Bazarov wrote:

Noah Roberts wrote:
Phlip wrote:
Noah Roberts wrote:

The latest MS compiler "depricates" certain functions including
stricmp. The function stricmp is not part of the C++ standard, it
is part of POSIX. The question I have, is it really necissary for
MS to move from that name to _stricmp in order to be compliant with
the C++ standard?

And, heck, use std::string for gosh's sakes!!!

It should be mentioned that there is no stricmp equiv in std::string.

It should probably also be mentioned that there is no stricmp in the C
Standard library either. Had there been one, std::string *would* have
its equivalent, most likely.

There is a islower, isupper, tolower and toupper function in the C and
C++ standard libraries which means they are case aware. There is no
good reason why case insensitive comparison is not available or why C++
could not have fixed it in basic_string. It is just silly IMHO.

Jun 15 '06 #11

Victor Bazarov

Markus Schoder wrote:

Victor Bazarov wrote:
Noah Roberts wrote:
Phlip wrote:
Noah Roberts wrote:

> The latest MS compiler "depricates" certain functions including
> stricmp. The function stricmp is not part of the C++ standard, it
> is part of POSIX. The question I have, is it really necissary for
> MS to move from that name to _stricmp in order to be compliant
> with the C++ standard?

And, heck, use std::string for gosh's sakes!!!

It should be mentioned that there is no stricmp equiv in
std::string.

It should probably also be mentioned that there is no stricmp in the
C Standard library either. Had there been one, std::string *would*
have its equivalent, most likely.

There is a islower, isupper, tolower and toupper function in the C and
C++ standard libraries which means they are case aware. There is no
good reason why case insensitive comparison is not available or why
C++ could not have fixed it in basic_string. It is just silly IMHO.

I do not think so. There has to be some balance between what's provided
in the standard library and what's left to the programmer to implement.
Theoretically, you don't pay for what you don't use. Practically, though,
you always pay for the size of the library and the compiler and/or the
run-time environment. Why, FCOL, didn't C implement stricmp if there are
already all pieces there? Catch my drift?...

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jun 15 '06 #12

Noah Roberts

Victor Bazarov wrote:

Theoretically, you don't pay for what you don't use. Practically, though,
you always pay for the size of the library and the compiler and/or the
run-time environment. Why, FCOL, didn't C implement stricmp if there are
already all pieces there? Catch my drift?...

Well, in this case (basic_string) the function is actually not
implemented if not used. As I understand it, member functions of
templates are not compiled unless called.

Jun 15 '06 #13
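
A small sketch of the claim in the post above that member functions of a class template are only instantiated when used. `Holder`, `NoCompare` and `lazy_instantiation_demo` are made-up names for illustration; the sketch assumes implicit instantiation (an explicit `template struct Holder<NoCompare>;` would instantiate every member and fail).

```cpp
#include <cassert>

struct NoCompare {};  // deliberately has no operator<

template <typename T>
struct Holder
{
    T value;

    // Only well-formed when T supports operator<.
    bool less_than(const Holder& other) const { return value < other.value; }

    // Usable with any T.
    const T& get() const { return value; }
};

// Holder<NoCompare> compiles as long as less_than() is never called on it.
inline bool lazy_instantiation_demo()
{
    Holder<NoCompare> h{};
    (void)h.get();          // instantiates get() only
    // h.less_than(h);      // uncommenting this line would fail to compile
    Holder<int> a{1}, b{2};
    assert(a.get() == 1);
    return a.less_than(b);  // less_than() is instantiated here, for T = int
}
```

This covers only the generated code for unused members; it says nothing about the cost of shipping, documenting and maintaining a larger library.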

Victor Bazarov

Noah Roberts wrote:

Victor Bazarov wrote:
Theoretically, you don't pay for what you don't use. Practically,
though, you always pay for the size of the library and the compiler
and/or the run-time environment. Why, FCOL, didn't C implement
stricmp if there are already all pieces there? Catch my drift?...

Well, in this case (basic_string) the function is actually not
implemented if not used. As I understand it, member functions of
templates are not compiled unless called.

I guessed so much... You don't catch my drift. Oh well..

Jun 15 '06 #14

Markus Schoder

Victor Bazarov wrote:

Markus Schoder wrote:
Victor Bazarov wrote:
Noah Roberts wrote:
Phlip wrote:
> Noah Roberts wrote:
>
>> The latest MS compiler "depricates" certain functions including
>> stricmp. The function stricmp is not part of the C++ standard, it
>> is part of POSIX. The question I have, is it really necissary for
>> MS to move from that name to _stricmp in order to be compliant
>> with the C++ standard?

> And, heck, use std::string for gosh's sakes!!!

It should be mentioned that there is no stricmp equiv in
std::string.

It should probably also be mentioned that there is no stricmp in the
C Standard library either. Had there been one, std::string *would*
have its equivalent, most likely.

There is a islower, isupper, tolower and toupper function in the C and
C++ standard libraries which means they are case aware. There is no
good reason why case insensitive comparison is not available or why
C++ could not have fixed it in basic_string. It is just silly IMHO.

I do not think so. There has to be some balance between what's provided
in the standard library and what's left to the programmer to implement.
Theoretically, you don't pay for what you don't use. Practically, though,
you always pay for the size of the library and the compiler and/or the
run-time environment. Why, FCOL, didn't C implement stricmp if there are
already all pieces there? Catch my drift?...

Sure. But the reason is actually a different one as I just found out.
The case sensitivity stuff is considered to be locale dependent i.e.
the C function to use would be strcoll with the LC_COLLATE locale
category set to the appropriate value (this is also honored by tolower
et al).

C++ supports this as well through the localization library. There is
e.g. an operator() in the std::locale class that compares basic_string
objects.

Jun 15 '06 #15

Markus Schoder

Markus Schoder wrote:

Victor Bazarov wrote:
Markus Schoder wrote:
Victor Bazarov wrote:
> Noah Roberts wrote:
>> Phlip wrote:
>>> Noah Roberts wrote:
>>>
>>>> The latest MS compiler "depricates" certain functions including
>>>> stricmp. The function stricmp is not part of the C++ standard, it
>>>> is part of POSIX. The question I have, is it really necissary for
>>>> MS to move from that name to _stricmp in order to be compliant
>>>> with the C++ standard?
>>
>>> And, heck, use std::string for gosh's sakes!!!
>>
>> It should be mentioned that there is no stricmp equiv in
>> std::string.
>
> It should probably also be mentioned that there is no stricmp in the
> C Standard library either. Had there been one, std::string *would*
> have its equivalent, most likely.

There is a islower, isupper, tolower and toupper function in the C and
C++ standard libraries which means they are case aware. There is no
good reason why case insensitive comparison is not available or why
C++ could not have fixed it in basic_string. It is just silly IMHO.

I do not think so. There has to be some balance between what's provided
in the standard library and what's left to the programmer to implement.
Theoretically, you don't pay for what you don't use. Practically, though,
you always pay for the size of the library and the compiler and/or the
run-time environment. Why, FCOL, didn't C implement stricmp if there are
already all pieces there? Catch my drift?...

Sure. But the reason is actually a different one as I just found out.
The case sensitivity stuff is considered to be locale dependent i.e.
the C function to use would be strcoll with the LC_COLLATE locale
category set to the appropriate value (this is also honored by tolower
et al).

C++ supports this as well through the localization library. There is
e.g. an operator() in the std::locale class that compares basic_string
objects.

I did a few tests, and it turns out that strcoll, at least for the locales
that I have available, does not compare e.g. "a" and "A" as equal; they
are just beside each other in the collation order.

So for true case insensitive comparisons you still have to roll your
own.

Jun 15 '06 #16
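
A sketch of the two collation interfaces discussed above: `strcoll()` from the C library and `std::locale::operator()`. The wrapper names `collate_equal` and `collate_less` are hypothetical. Results depend on the locale in effect; in the default "C" locale, which is all this sketch assumes, collation is plain byte order.

```cpp
#include <cassert>
#include <cstring>
#include <locale>
#include <string>

// strcoll() orders strings by the current LC_COLLATE rules. It is a
// collation function, not a case-folding one.
inline bool collate_equal(const char* a, const char* b)
{
    assert(a != nullptr && b != nullptr);
    return std::strcoll(a, b) == 0;
}

// std::locale::operator() is a "less than" predicate built on the locale's
// collate facet, so a locale object can be passed to std::sort as comparator.
inline bool collate_less(const std::locale& loc,
                         const std::string& a, const std::string& b)
{
    return loc(a, b);
}
```

In the "C" locale `collate_equal("a", "A")` is false, which matches the observation in the post: collation decides order, and letters that differ only in case are still distinct strings.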

Noah Roberts

Victor Bazarov wrote:

Noah Roberts wrote:
Victor Bazarov wrote:
Theoretically, you don't pay for what you don't use. Practically,
though, you always pay for the size of the library and the compiler
and/or the run-time environment. Why, FCOL, didn't C implement
stricmp if there are already all pieces there? Catch my drift?...

Well, in this case (basic_string) the function is actually not
implemented if not used. As I understand it, member functions of
templates are not compiled unless called.

I guessed so much... You don't catch my drift. Oh well..

Oh, are you drifting again?

Jun 15 '06 #17

Richard Herring

In message <11**********************@h76g2000cwa.googlegroups .com>,
Markus Schoder <a3*************@yahoo.de> writes

Victor Bazarov wrote:
Noah Roberts wrote:
> Phlip wrote:
>> Noah Roberts wrote:
>>
>>> The latest MS compiler "depricates" certain functions including
>>> stricmp. The function stricmp is not part of the C++ standard, it
>>> is part of POSIX. The question I have, is it really necissary for
>>> MS to move from that name to _stricmp in order to be compliant with
>>> the C++ standard?
>
>> And, heck, use std::string for gosh's sakes!!!
>
> It should be mentioned that there is no stricmp equiv in std::string.

It should probably also be mentioned that there is no stricmp in the C
Standard library either. Had there been one, std::string *would* have
its equivalent, most likely.

There is a islower, isupper, tolower and toupper function in the C and
C++ standard libraries which means they are case aware. There is no
good reason why case insensitive comparison is not available or why C++
could not have fixed it in basic_string. It is just silly IMHO.

char what = toupper('ß');

--
Richard Herring

Jun 15 '06 #18

Kirit Sælensminde

Markus Schoder wrote:

There is a islower, isupper, tolower and toupper function in the C and
C++ standard libraries which means they are case aware. There is no
good reason why case insensitive comparison is not available or why C++
could not have fixed it in basic_string. It is just silly IMHO.

But the C functions only work on ASCII characters don't they? Or is the
character spec undefined? (probably undefined given that C will work
with EBCDIC or whatever it's called) You can't (generally) do case
conversion unless you know the language of the string in question
because the case rules vary from one language to another.

So, you can write some software that does the case conversion for ASCII
characters, but you don't really need a library to work out a-z == A-Z.

The most famous example of this is probably the "Turkish i". Turkish
Windows systems won't boot if you use the proper case conversion rules
for Turkish so they hobbled it (much to the annoyance of Turkish
speaking people, but much to the favour of anybody from outside Turkey
trying to sell them software).

I guess Unix systems get around this by insisting that everything has
the right case (a file called 'A.txt' isn't the same as 'a.txt') and
this is what I've now done on my company's framework too.

Jun 15 '06 #19

Phlip

Markus Schoder wrote:

There is a islower, isupper, tolower and toupper function in the C and C++
standard libraries which means they are case aware. There is no good
reason why case insensitive comparison is not available or why C++ could
not have fixed it in basic_string. It is just silly IMHO.

The Standards are (sometimes) careful to leave things out that must then
get changed. islower etc are _not_ case-aware. They only do specific
case-like things to raw ASCII letters, so the Standards must leave them in
as the rock-bottom must-have functions.

So when C achieves a useful locale system, it may then support a
high-level strcompare() routine that rates encoded strings for equivalence.

Like Joel Spolsky sez somewhere, "There's no such thing as raw text!"

--
Phlip

Jun 15 '06 #20

Phlip

Noah Roberts wrote:

Interestingly, several of the operations in the standard library,
including some in basic_string, are "depricated" ;)

Potentially unsafe method    Safer equivalent
basic_string::copy           basic_string::_Copy_s

Are the equivalents safer because they are harder to overflow?

(And could you practice writing "deprecated"? That spelling doesn't
inspire my newsreader to underline it with a wavy red line...)

--
Phlip

Jun 15 '06 #21

Noah Roberts

Phlip wrote:

(And could you practice writing "deprecated"? That spelling doesn't
inspire my newsreader to underline it with a wavy red line...)

Get a less annoying newsreader. Might help you to refrain from being a
pedantic, lecturing, butthead.

Jun 15 '06 #22

Victor Bazarov

Noah Roberts wrote:

Victor Bazarov wrote:
Noah Roberts wrote:
Victor Bazarov wrote:

Theoretically, you don't pay for what you don't use. Practically,
though, you always pay for the size of the library and the compiler
and/or the run-time environment. Why, FCOL, didn't C implement
stricmp if there are already all pieces there? Catch my drift?...

Well, in this case (basic_string) the function is actually not
implemented if not used. As I understand it, member functions of
templates are not compiled unless called.

I guessed so much... You don't catch my drift. Oh well..

Oh, are you drifting again?

Catch me! Catch me! Oh, I am drifting away!...

Jun 15 '06 #23

Markus Schoder

Phlip wrote:

Markus Schoder wrote:
There is a islower, isupper, tolower and toupper function in the C and C++
standard libraries which means they are case aware. There is no good
reason why case insensitive comparison is not available or why C++ could
not have fixed it in basic_string. It is just silly IMHO.

The Standards are (sometimes) careful to leave things out that must then
get changed. islower etc are _not_ case-aware. They only do specific
case-like things to raw ASCII letters, so the Standards must leave them in
as the rock-bottom must-have functions.

So when C achieves a useful locale system, it may then support a
high-level strcompare() routine that rates encoded strings for equivalence.

int strcompare(const char *s1, const char *s2)
{
    while(tolower(*s1) == tolower(*s2) && *s1)
        ++s1, ++s2;
    return *s1 - *s2;
}

A locale aware case insensitive string compare function. Why should
there anything be missing?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

Jun 15 '06 #24
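
A hedged variant of the function posted above, for single-byte strings. Two things differ from the posted code: characters go through `unsigned char` before `tolower()` (a negative `char` value is undefined behaviour there), and the return value is computed from the lower-cased characters, since `*s1 - *s2` on the raw characters would order "a" after "B". The name `ci_compare` is an illustrative choice that stays clear of the reserved `str[a-z]*` names.

```cpp
#include <cassert>
#include <cctype>

// Case-insensitive comparison of two NUL-terminated single-byte strings.
// Returns <0, 0 or >0 like strcmp(), based on the lower-cased characters.
inline int ci_compare(const char* s1, const char* s2)
{
    assert(s1 != nullptr && s2 != nullptr);
    for (;; ++s1, ++s2) {
        const int c1 = std::tolower(static_cast<unsigned char>(*s1));
        const int c2 = std::tolower(static_cast<unsigned char>(*s2));
        // Stop at the first difference or at the shared terminator.
        if (c1 != c2 || c1 == 0)
            return c1 - c2;
    }
}
```

As in the original, checking only one terminator is enough: if the other string ends first, the characters already differ and the loop stops.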

Victor Bazarov

Markus Schoder wrote:

Phlip wrote:
Markus Schoder wrote:
There is a islower, isupper, tolower and toupper function in the C
and C++ standard libraries which means they are case aware. There
is no good reason why case insensitive comparison is not available
or why C++ could not have fixed it in basic_string. It is just
silly IMHO.

The Standards are (sometimes) careful to leave things out that must
then get changed. islower etc are _not_ case-aware. They only do
specific case-like things to raw ASCII letters, so the Standards
must leave them in as the rock-bottom must-have functions.

So when C achieves a useful locale system, it may then support a
high-level strcompare() routine that rates encoded strings for
equivalence.

int strcompare(const char *s1, const char *s2)
{
    while(tolower(*s1) == tolower(*s2) && *s1)

... && *s1 && *s2)

        ++s1, ++s2;
    return *s1 - *s2;
}

A locale aware case insensitive string compare function. Why should
there anything be missing?

Missing? Wide char processing, maybe? What's it called, Unicode?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

Well, with so many Unicode versions, stuffing all the things into the
library doesn't make much sense to me.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jun 15 '06 #25

Phlip

Markus Schoder wrote:

int strcompare(const char *s1, const char *s2) {
    while(tolower(*s1) == tolower(*s2) && *s1)
        ++s1, ++s2;
    return *s1 - *s2;
}

A locale aware case insensitive string compare function. Why should there
anything be missing?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

You aren't allowed to call it str[a-z].*.

If you didn't, then the Committee did its job. You found that function
very easy to write, because the Committee provided tolower(). And the
Committee prevented your code from breaking when a future version of a C
language comes along with a real locale system, which can detect upper
case, lower case, and title case correctly in all the scripts that have
cases. Your code would continue to work correctly for ASCII, per your
present requirements, and would not conflict with any str function they
added.

--
Phlip

Jun 15 '06 #26

Markus Schoder

Phlip wrote:

Markus Schoder wrote:
int strcompare(const char *s1, const char *s2) {
    while(tolower(*s1) == tolower(*s2) && *s1)
        ++s1, ++s2;
    return *s1 - *s2;
}

A locale aware case insensitive string compare function. Why should there
anything be missing?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

You aren't allowed to call it str[a-z].*.

That's understood; I was putting myself in the role of a library
implementor.

If you didn't, then the Committee did its job. You found that function
very easy to write, because the Committee provided tolower(). And the
Committee prevented your code from breaking when a future version of a C
language comes along with a real locale system, which can detect upper
case, lower case, and title case correctly in all the scripts that have
cases. Your code would continue to work correctly for ASCII, per your
present requirements, and would not conflict with any str function they
added.

The function is fully locale aware. You make it sound like we are
waiting for some kind of addition or change to the standard until such
a function can be part of the standard library. I just have no idea
what that would be.

Jun 15 '06 #27

kwikius

Noah Roberts wrote:

Phlip wrote:
(And could you practice writing "deprecated"? That spelling doesn't
inspire my newsreader to underline it with a wavy red line...)

Get a less annoying newsreader. Might help you to refrain from being a
pedantic, lecturing, butthead.

Yeah... but Phlip's a lovely, pedantic, lecturing, butthead though aint
he ?

:-)

regards
Andy Little

Jun 15 '06 #28

Markus Schoder

Richard Herring wrote:

In message <11**********************@h76g2000cwa.googlegroups .com>,
Markus Schoder <a3*************@yahoo.de> writes
Victor Bazarov wrote:
Noah Roberts wrote:
> Phlip wrote:
>> Noah Roberts wrote:
>>
>>> The latest MS compiler "depricates" certain functions including
>>> stricmp. The function stricmp is not part of the C++ standard, it
>>> is part of POSIX. The question I have, is it really necissary for
>>> MS to move from that name to _stricmp in order to be compliant with
>>> the C++ standard?
>
>> And, heck, use std::string for gosh's sakes!!!
>
> It should be mentioned that there is no stricmp equiv in std::string.

It should probably also be mentioned that there is no stricmp in the C
Standard library either. Had there been one, std::string *would* have
its equivalent, most likely.

There is a islower, isupper, tolower and toupper function in the C and
C++ standard libraries which means they are case aware. There is no
good reason why case insensitive comparison is not available or why C++
could not have fixed it in basic_string. It is just silly IMHO.

char what = toupper('ß');

toupper('ß') == 'ß'
tolower('ß') == 'ß'

Jun 15 '06 #29

Phlip

Markus Schoder wrote:

The function is fully locale aware.

?

Okay, maybe I don't understand tolower(). Will it handle LATIN SMALL
LIGATURE OE (œ) correctly?

--
Phlip

Jun 15 '06 #30

Phlip

kwikius wrote:

Yeah... but Phlip's a lovely, pedantic, lecturing, butthead though aint he
?

ain't

--
Phlip

Jun 15 '06 #31

Victor Bazarov

Markus Schoder wrote:

[..]
toupper('ß') == 'ß'
tolower('ß') == 'ß'

But isn't it wrong? How about toupper('?') or tolower('?')?
At least on my computer I naively expect it to be '?' and '?',
respectively. (Yes, I said *naively*; I know it is most likely
not going to work.)

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jun 15 '06 #32

Markus Schoder

Victor Bazarov wrote:

Markus Schoder wrote:
Phlip wrote:
Markus Schoder wrote:

There is a islower, isupper, tolower and toupper function in the C
and C++ standard libraries which means they are case aware. There
is no good reason why case insensitive comparison is not available
or why C++ could not have fixed it in basic_string. It is just
silly IMHO.

The Standards are (sometimes) careful to leave things out that must
then get changed. islower etc are _not_ case-aware. They only do
specific case-like things to raw ASCII letters, so the Standards
must leave them in as the rock-bottom must-have functions.

So when C achieves a useful locale system, it may then support a
high-level strcompare() routine that rates encoded strings for
equivalence.

int strcompare(const char *s1, const char *s2)
{
while(tolower(*s1) == tolower(*s2) && *s1)

... && *s1 && *s2)

No this is unnecessary. Good example though why not everybody should be
required to think this through again.

++s1, ++s2;
return *s1 - *s2;
}

A locale aware case insensitive string compare function. Why should
there anything be missing?

Missing? Wide char processing, maybe? What's it called, Unicode?

The question is just if it is common enough to put it in the standard
library or not. I think it is.

Well, with so many Unicode versions, stuffing all the things into the
library doesn't make much sense to me.

There is just one additional wide character function required
(wcscompare). The different Unicode versions are handled by the locale
specific low-level functions which are already part of the standard
(e.g. towlower(wint_t)).

Jun 15 '06 #33
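
A sketch of what the wide-character counterpart described above might look like, built on `towlower()`. The name `ci_compare_wide` is hypothetical (the thread calls it wcscompare, but `wcs[a-z]*` names are reserved just like `str[a-z]*`). It assumes one `wchar_t` per character and one-to-one case mappings, so it does not handle surrogate pairs or expansions.

```cpp
#include <cassert>
#include <cwctype>

// Case-insensitive comparison of two NUL-terminated wide strings.
// Returns -1, 0 or 1 based on the lower-cased wide characters.
inline int ci_compare_wide(const wchar_t* s1, const wchar_t* s2)
{
    assert(s1 != nullptr && s2 != nullptr);
    for (;; ++s1, ++s2) {
        const std::wint_t c1 = std::towlower(static_cast<std::wint_t>(*s1));
        const std::wint_t c2 = std::towlower(static_cast<std::wint_t>(*s2));
        if (c1 != c2)
            return c1 < c2 ? -1 : 1;
        if (c1 == 0)
            return 0;  // both strings ended together
    }
}
```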

kwikius

Phlip wrote:

kwikius wrote:
Yeah... but Phlip's a lovely, pedantic, lecturing, butthead though aint he
?

ain't

Geez! Butthead!

........... ;-)

regards
Andy Little

Jun 15 '06 #34

Markus Schoder

Phlip wrote:

Markus Schoder wrote:
The function is fully locale aware.

?

Okay, maybe I don't understand tolower(). Will it handle LATIN SMALL
LIGATURE OE (œ) correctly?

If it is a valid letter in the currently set locale it will.

Some letters may be only representable in a wide character set for
those you would need the wide character version of the compare function
which would use the towlower function instead (also standard). But that
is a different issue since you obviously need a complete set of new
functions to cover wide character sets.

Jun 15 '06 #35

Phlip

Markus Schoder wrote:

Okay, maybe I don't understand tolower(). Will it handle LATIN SMALL
LIGATURE OE (œ) correctly?

If it is a valid letter in the currently set locale it will.

Please examine the source to your tolower(). One of mine calls this:

ctype<char>::do_tolower(char __c) const
{ return (char) _S_lower[(unsigned char) __c]; }

And _S_lower is a big static table of character mappings. The top half
of the table trivially maps each character to itself. I'm aware that more
advanced versions of tolower() are possible, but this one appears
locale-proof. It's STLPort, and I don't know how compliant it is.

So let's simplify the question by picking ISO Latin 1 (ISO/IEC 8859-1)
letters. Most desktops default to that.

So here's Æ, LATIN CAPITAL LIGATURE AE, at '\xC6'. Its lowercase is at
'\xE6'. You think you can make this assertion pass:

assert('\xE6' == tolower('\xC6'));

Is there some way to set the locale to ISO Latin 1 first, to get that to
pass?

--
Phlip

Jun 15 '06 #36
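
The `ctype<char>::do_tolower` shown above is reached through the ctype facet of a C++ locale object. A minimal sketch of calling it with an explicit locale, using only standard `<locale>` facilities; `lower_in` is a made-up wrapper name. With the classic "C" locale only A to Z are mapped, so the byte 0xC6 comes back unchanged; whether a named locale maps it depends on the implementation and the locales installed.

```cpp
#include <cassert>
#include <locale>

// Lower-case one char using the ctype facet of an explicit locale object,
// rather than the global C locale consulted by tolower() from <cctype>.
inline char lower_in(const std::locale& loc, char c)
{
    assert(std::has_facet<std::ctype<char>>(loc));  // mandatory facet
    return std::use_facet<std::ctype<char>>(loc).tolower(c);
}
```

For example, `lower_in(std::locale::classic(), 'A')` yields `'a'`.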

Victor Bazarov

Phlip wrote:

[..]
So here's Æ, LATIN CAPITAL LIGATURE AE, at '\xC6'. Its lowercase is at
'\xE6'. You think you can make this assertion pass:

assert('\xE6' == tolower('\xC6'));

Since both chars are not present in the basic character set, your question
cannot be answered in implementation-independent manner, I believe. But
once you enter implementation-specific behaviour, anything is possible, no?

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Jun 15 '06 #37

Markus Schoder

Phlip wrote:

Markus Schoder wrote:
Okay, maybe I don't understand tolower(). Will it handle LATIN SMALL
LIGATURE OE (œ) correctly?

If it is a valid letter in the currently set locale it will.

Please examine the source to your tolower(). One of mine calls this:

ctype<char>::do_tolower(char __c) const
{ return (char) _S_lower[(unsigned char) __c]; }

And _S_lower is a big static table of character mappings. The top half
of the table trivially maps each character to itself. I'm aware that more
advanced versions of tolower() are possible, but this one appears
locale-proof. It's STLPort, and I don't know how compliant it is.

So let's simplify the question by picking ISO Latin 1 (ISO/IEC 8859-1)
letters. Most desktops default to that.

So here's Æ, LATIN CAPITAL LIGATURE AE, at '\xC6'. Its lowercase is at
'\xE6'. You think you can make this assertion pass:

assert('\xE6' == tolower('\xC6'));

Is there some way to set the locale to ISO Latin 1 first, to get that to
pass?

You can try

setlocale(LC_ALL, "");

which should set the locale to some sane value (may depend on
environment variables).

The only locale that must exist is "C" which is also the default until
you call setlocale(). This of course is just plain ASCII.

Anyway the following program

#include <cctype>
#include <iostream>
#include <clocale>

using namespace std;

int main()
{
    cout << hex << tolower('\xC6') << endl;
    setlocale(LC_ALL, "");
    cout << hex << tolower('\xC6') << endl;
}

produces:

c6
e6

So yes works like a charm for me.

Jun 15 '06 #38
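
A slightly more defensive sketch of the same experiment. Where plain `char` is signed, `'\xC6'` is a negative value, and passing that to `tolower()` is undefined, so the sketch converts through `unsigned char` first. It also checks the result of `setlocale()`, which returns a null pointer if the requested locale cannot be installed. `lower_byte` and `use_environment_locale` are hypothetical helper names.

```cpp
#include <cassert>
#include <cctype>
#include <climits>
#include <clocale>

// Lower-case a single byte under the current C locale.
inline int lower_byte(char c)
{
    const int result = std::tolower(static_cast<unsigned char>(c));
    assert(result >= 0 && result <= UCHAR_MAX);  // a byte value, never EOF
    return result;
}

// Switch to the locale named by the environment. On failure the current
// locale is left unchanged and false is returned.
inline bool use_environment_locale()
{
    return std::setlocale(LC_ALL, "") != nullptr;
}
```

Before any `setlocale()` call the program runs in the "C" locale, where `lower_byte('\xC6')` is 0xC6. What it returns after `use_environment_locale()` depends on the environment's character encoding.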

Phlip

Markus Schoder wrote:

int main()
{
    cout << hex << tolower('\xC6') << endl;
    setlocale(LC_ALL, "");
    cout << hex << tolower('\xC6') << endl;
}

produces:

c6
e6

So yes works like a charm for me.

Yay! I learned something new about tolower()! (And STLport!)

Your strcompare() still won't work, because it won't handle multiple byte
character sets, such as UTF-8. ;-)

--
Phlip

Jun 15 '06 #39

Phlip

kwikius wrote:

> Yeah... but Phlip's a lovely, pedantic, lecturing, butthead though
> aint he ?

ain't

Geez! Butthead!

How necessary, the apostrophe.

So small, so cute, so quaint.

It fits between the letters

To point out where they ain't.

--
Phlip

Jun 15 '06 #40

Markus Schoder

Phlip wrote:

Your strcompare() still won't work, because it won't handle multiple byte
character sets, such as UTF-8. ;-)

Yep, kind of proves my point. There is a whole slew of multibyte
character/string handling functions in the standard. So everybody is
supposed to use these and a host of wide character functions to
implement a case insensitive compare function on his own. This way lies
madness.

A locale aware case insensitive compare function should be in the
standard. Period.

Jun 15 '06 #41
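
To give a feel for the work the post above describes, here is a rough sketch that decodes two multibyte strings with `mbsrtowcs()` and then compares them with `towlower()`. The names `to_wide` and `ci_equal_multibyte` are hypothetical. It assumes the strings are in the encoding of the current LC_CTYPE locale and that case mappings are one-to-one, which is not true for every script.

```cpp
#include <cassert>
#include <cstddef>
#include <cwchar>
#include <cwctype>
#include <string>

// Decode a multibyte string (current LC_CTYPE encoding) to wide characters.
// Returns false if the input is not valid in that encoding.
inline bool to_wide(const char* s, std::wstring& out)
{
    assert(s != nullptr);
    std::mbstate_t state = std::mbstate_t();
    const char* src = s;
    // With a null destination, mbsrtowcs() only counts the wide characters.
    const std::size_t len = std::mbsrtowcs(nullptr, &src, 0, &state);
    if (len == static_cast<std::size_t>(-1))
        return false;
    out.assign(len, L'\0');
    if (len != 0) {
        state = std::mbstate_t();
        src = s;
        std::mbsrtowcs(&out[0], &src, len, &state);
    }
    return true;
}

// Case-insensitive equality after decoding both strings.
inline bool ci_equal_multibyte(const char* a, const char* b)
{
    std::wstring wa, wb;
    if (!to_wide(a, wa) || !to_wide(b, wb) || wa.size() != wb.size())
        return false;
    for (std::size_t i = 0; i < wa.size(); ++i)
        if (std::towlower(wa[i]) != std::towlower(wb[i]))
            return false;
    return true;
}
```

Even this much leaves out ordering, normalization and locale selection, which is roughly the point being made.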

kwikius

Phlip wrote:

kwikius wrote:
> Yeah... but Phlip's a lovely, pedantic, lecturing, butthead though
> aint he ?

ain't

Geez! Butthead!

How necessary, the apostrophe.

So small, so cute, so quaint.

It fits between the letters

To point out where they ain't.

BUTTHEAD!

regards
Andy Little

Jun 15 '06 #42

Richard Herring

In message <11*********************@y41g2000cwy.googlegroups. com>,
Markus Schoder <a3*************@yahoo.de> writes

Richard Herring wrote:
In message <11**********************@h76g2000cwa.googlegroups .com>,
Markus Schoder <a3*************@yahoo.de> writes
>Victor Bazarov wrote:
>> Noah Roberts wrote:
>> > Phlip wrote:
>> >> Noah Roberts wrote:
>> >>
>> >>> The latest MS compiler "depricates" certain functions including
>> >>> stricmp. The function stricmp is not part of the C++ standard, it
>> >>> is part of POSIX. The question I have, is it really necissary for
>> >>> MS to move from that name to _stricmp in order to be compliant with
>> >>> the C++ standard?
>> >
>> >> And, heck, use std::string for gosh's sakes!!!
>> >
>> > It should be mentioned that there is no stricmp equiv in std::string.
>>
>> It should probably also be mentioned that there is no stricmp in the C
>> Standard library either. Had there been one, std::string *would* have
>> its equivalent, most likely.
>
>There is a islower, isupper, tolower and toupper function in the C and
>C++ standard libraries which means they are case aware. There is no
>good reason why case insensitive comparison is not available or why C++
>could not have fixed it in basic_string. It is just silly IMHO.

char what = toupper('ß');

toupper('ß') == 'ß'

Indeed. But shouldn't it be "SS"?

tolower('ß') == 'ß'

--
Richard Herring

Jun 16 '06 #43
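
The "SS" objection above comes down to the function's shape: `toupper()` takes one character code and returns one character code, so it cannot expand a letter into two. A small sketch, assuming single-byte text; `upper_copy` is an illustrative name.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Upper-case a single-byte string character by character.
inline std::string upper_copy(const std::string& s)
{
    std::string out(s);
    for (char& ch : out)
        ch = static_cast<char>(std::toupper(static_cast<unsigned char>(ch)));
    // A per-character mapping can never change the length of the text.
    assert(out.size() == s.size());
    return out;
}
```

In the "C" locale the Latin-1 byte 0xDF (ß) is passed through unchanged. No locale can make this function turn it into the two letters "SS".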
