On Mar 27, 9:52 pm, Paul Brettschneider <paul.brettschnei...@yahoo.fr>
wrote:
> James Kanze wrote:
> [...]
>> On a 2's complement machine with 8-bit bytes, where plain char
>> is signed. Exceptions to the first two conditions are fairly
>> rare today. But unsigned plain char is an option with a lot of
>> compilers, and the default on some as well (and would make life
>> a lot easier if you could count on it).
> Some hardware might be optimised for signed chars and other hardware
> for unsigned chars (I think this is the reason that Linux/PPC defaults
> to unsigned whereas Linux/x86 traditionally uses signed).
That was the original motivation. The first two machines to
support C were the PDP-11 and the Interdata 8/32. On the first,
making char unsigned would have had a significant performance
penalty, and on the second, making it signed would have slowed
things down. And at the time, only US ASCII mattered, so even
signed, all of the encodings actually being used were positive,
and it didn't seem to make a difference.
Today, of course, most machines support either equally well, and
seven bit ASCII has been almost universally superseded by eight
bit codes: ISO 8859-n or UTF-8 (or some proprietary code pages
in console windows under Windows---probably for historical
reasons). But of course, most implementations continue to make
plain char signed, because that's the way it was on a PDP-11,
code was written which counted on it (although K&R warned from
the very beginning not to), and implementors don't want to risk
breaking it.
The result is, of course, that something like:

    std::transform( s.begin(), s.end(),
                    s.begin(),
                    ::tolower );

results in undefined behavior.
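The undefined behavior comes from tolower's precondition: its int
argument must be representable as unsigned char or be EOF, and where
plain char is signed, any character outside seven bit ASCII (say 'é'
in ISO 8859-1) arrives as a negative value. The usual repair is to
cast through unsigned char before calling it; a minimal sketch in
C++11 terms (the helper name to_lower_copy is mine, not a standard
facility):

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lowercased copy of the string.
// Taking the lambda's parameter as unsigned char keeps the value
// passed to std::tolower in its valid domain, even on platforms
// where plain char is signed.
std::string to_lower_copy(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) {
                       return static_cast<char>(std::tolower(c));
                   });
    return s;
}
```

Note that this only makes the call well defined; what std::tolower
actually does to non-ASCII characters still depends on the current
locale.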
--
James Kanze (GABI Software)            email: ja*******@gmail.com
Conseils en informatique orientée objet /
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34