
conversion of string to all lower case

P: n/a
DJ
Can someone tell me the library call that converts strings to lower case or
returns a new string that is the lower-case version of the original? Thanks.

I'm using <string>.

David
Jul 22 '05 #1
22 Replies


P: n/a
DJ
Or perhaps even better a compare that ignores case.

thanks

"DJ" <ch*****@earthlink.net> wrote in message
news:vV****************@newsread3.news.pas.earthlink.net...
Can someone tell me the library call that converts strings to lower case or returns a new string that is lower case of the original, thanks

im using <string>

David

Jul 22 '05 #2

P: n/a
DJ wrote:

Or perhaps even better a compare that ignores case.

thanks

"DJ" <ch*****@earthlink.net> wrote in message
news:vV****************@newsread3.news.pas.earthlink.net...
Can someone tell me the library call that converts strings to lower case or
returns a new string that is lower case of the original, thanks

im using <string>

David


The discussion regarding the (international) caveats of lower/upper case and
case-insensitive *word* comparisons comes up monthly. Check the Google Groups
archives for more blather than you want to read, as well as a couple of
(somewhat) portable/internationalized solutions.
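
For the plain single-byte case only, a case-insensitive equality test can be sketched as below. This is a minimal illustration, not one of the archived solutions mentioned above: the name equals_ignore_case is hypothetical, the lambda assumes a C++11 compiler, and the comparison is character by character, so it inherits every caveat discussed in this thread.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper (not a standard library function): compares two
// std::strings for equality while ignoring case, one char at a time.
bool equals_ignore_case(const std::string& a, const std::string& b)
{
    // Strings of different length can never match character by character.
    if (a.size() != b.size())
        return false;

    return std::equal(a.begin(), a.end(), b.begin(),
        [](unsigned char x, unsigned char y) {
            // Taking the arguments as unsigned char avoids undefined
            // behaviour when plain char is signed and the value is negative.
            return std::tolower(x) == std::tolower(y);
        });
}
```

A call such as equals_ignore_case("Hello", "hELLO") is expected to return true in the default "C" locale; what happens to non-ASCII characters depends on the active locale.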
Jul 22 '05 #3

P: n/a
Julie wrote:
The discussion regarding the (international) caveats of lower/upper case and
case-insensitive *word* comparisons comes up monthly. Check the Google Groups
archives for more blather than you want to read, as well as a couple of
(somewhat) portable/internationalized solutions.

I am confused by your terminology "international" here. What do you mean?

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #4

P: n/a
DJ wrote:
Can someone tell me the library call that converts strings to lower case or
returns a new string that is lower case of the original, thanks

im using <string>

David

Check std::toupper() and std::tolower() functions of <cctype>.
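
There is no single std::string member that does this, so the usual approach is to apply std::tolower() to each character. A minimal sketch, assuming a C++11 compiler; the function name to_lower_copy is made up for illustration:

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: returns a lower-case copy of s, leaving the
// caller's string untouched (s is taken by value on purpose).
std::string to_lower_copy(std::string s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](unsigned char c) {
            // std::tolower takes and returns int; the argument must be
            // representable as unsigned char, hence the parameter type.
            return static_cast<char>(std::tolower(c));
        });
    return s;
}
```

Passing the string by value gives the "returns a new string" behaviour the OP asked for; running the same std::transform directly on an existing string converts it in place.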

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #5

P: n/a
Ioannis Vranos wrote:
Julie wrote:
The discussion regarding the (international) caveats of lower/upper case
and case-insensitive *word* comparisons comes up monthly.
Check the Google Groups archives for more blather than you want to read,
as well as a couple of (somewhat) portable/internationalized solutions.


I am confused by your terminology "international" here. What do you mean?


One example is the German character ß that doesn't have a single uppercase
equivalent. 'Fuß' would need to compare equal to 'FUSS'.

Jul 22 '05 #6

P: n/a
Rolf Magnus wrote:
One example is the German character ß that doesn't have a single uppercase
equivalent. 'Fuß' would need to compare equal to 'FUSS'.

This is not the case here, since we are talking about std::string.

About multilingual characters, one should use wchar_t, std::wstring and
the std::towlower(), std::towupper() of <cwctype>, all guaranteed to work.
C++98:

"Type wchar_t is a distinct type whose values can represent distinct
codes for all members of the largest extended character set specified
among the supported locales (22.1.1). Type wchar_t shall have the same
size, signedness, and alignment requirements (3.9) as one of the other
integral types, called its underlying type."
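
The wide-character variant looks much the same as the char one. The sketch below assumes a C++11 compiler and uses a made-up name, to_lower_copy_w. Note that the result for non-ASCII characters depends on the locale selected with std::setlocale(); in the default "C" locale only the basic Latin letters are certain to be converted.

```cpp
#include <algorithm>
#include <cwctype>
#include <string>

// Hypothetical helper: returns a lower-case copy of a wide string.
std::wstring to_lower_copy_w(std::wstring s)
{
    std::transform(s.begin(), s.end(), s.begin(),
        [](wchar_t c) {
            // std::towlower takes and returns std::wint_t.
            return static_cast<wchar_t>(std::towlower(c));
        });
    return s;
}
```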


--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #7

P: n/a
Ioannis Vranos wrote:
Rolf Magnus wrote:
One example is the German character ß that doesn't have a single
uppercase equivalent. 'Fuß' would need to compare equal to 'FUSS'.

This is not the case here, since we are talking about std::string.

About multilingual characters, one should use wchar_t, std::wstring and
the std::towlower(), std::towupper() of <cwctype>, all guaranteed to work.


How do those handle such a conversion? The main point here is that the
number of characters in the uppercase version and in the lowercase version
are not equal. Character-based toupper and tolower can't handle that.
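
To make the point concrete: a function that maps one character to one character cannot turn 'ß' into 'SS', so any fix has to work on the whole string. The sketch below is only an illustration under stated assumptions: the input is assumed to be ISO 8859-1 encoded, where ß is code point 223 (0xDF), and to_upper_latin1_sketch is a hypothetical name. It handles this one special case and nothing else.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: upper-cases a string that is ASSUMED to be
// ISO 8859-1 encoded, expanding the sharp s (0xDF) to "SS".
std::string to_upper_latin1_sketch(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (unsigned char c : in) {
        if (c == 0xDF)
            out += "SS";  // one input character becomes two output characters
        else
            out += static_cast<char>(std::toupper(c));
    }
    return out;
}
```

The output can be longer than the input, which is exactly what std::toupper() and std::towupper() cannot express. In the default "C" locale the std::toupper() call above only changes a-z, so other accented Latin-1 letters pass through unchanged.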

Jul 22 '05 #8

P: n/a
Rolf Magnus wrote:
How do those handle such a conversion? The main point here is that the
number of characters in the uppercase version and in the lowercase version
are not equal. Character-based toupper and tolower can't handle that.



However, they work for Greek and English, and I assume for all languages with
a one-to-one correspondence between lower-case and upper-case characters. So I
guess they are meant for such languages, and it is up to the programmer to
make this decision.

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #9

P: n/a
Ioannis Vranos wrote:

Julie wrote:
The discussion regarding the (international) caveats of lower/upper case and
case-insensitive *word* comparisons comes up monthly. Check the Google Groups
archives for more blather than you want to read, as well as a couple of
(somewhat) portable/internationalized solutions.


I am confused by your terminology "international" here. What do you mean?


I mean that there are languages that apparently do not have a 1-1
correspondence between upper and lower case words (and characters).

For English, u/l case comparisons are trivial. For German, there are issues.

This is what I mean about 'international' -- if the OP is writing a
locale-independent application (assumed to be the case unless indicated
otherwise), they will have to contend w/ such 'international' issues.
Jul 22 '05 #10

P: n/a
Julie wrote:
I mean that there are languages that apparently do not have a 1-1
correspondence between upper and lower case words (and characters).

For English, u/l case comparisons are trivial. For German, there are issues.

This is what I mean about 'international' -- if the OP is writing a
locale-independent application (assumed to be the case unless indicated
otherwise), they will have to contend w/ such 'international' issues.

However the OP was talking about std::string and not std::wstring.

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #11

P: n/a
Ioannis Vranos wrote:

Julie wrote:
I mean that there are languages that apparently do not have a 1-1
correspondence between upper and lower case words (and characters).

For English, u/l case comparisons are trivial. For German, there are issues.

This is what I mean about 'international' -- if the OP is writing a
locale-independent application (assumed to be the case unless indicated
otherwise), they will have to contend w/ such 'international' issues.


However the OP was talking about std::string and not std::wstring.


OP:

"im using <string>"

No further information was provided about specific type or locale dependence,
therefore not assumed in my responses.
Jul 22 '05 #12

P: n/a
Julie wrote:
However the OP was talking about std::string and not std::wstring.

OP:

"im using <string>"

No further information was provided about specific type or locale dependence,
therefore not assumed in my responses.

From the subject "conversion of string to all lower case" and the question

"Can someone tell me the library call that converts strings to lower
case or retrns a new string that is lower case of the original, thanks

im using <string>"
it looks like he is asking about the usual stuff.

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #13

P: n/a
Ioannis Vranos wrote:

Julie wrote:
However the OP was talking about std::string and not std::wstring.

OP:

"im using <string>"

No further information was provided about specific type or locale dependence,
therefore not assumed in my responses.


From the subject "conversion of string to all lower case" and the question

"Can someone tell me the library call that converts strings to lower
case or retrns a new string that is lower case of the original, thanks

im using <string>"

it looks like he is asking about the usual stuff.


Right -- and I gave the usual answer.

nfc
Jul 22 '05 #14

P: n/a
Julie wrote:
it looks like he is asking about the usual stuff.

Right -- and I gave the usual answer.

I don't think so. In simple words, he is talking about chars and you
about wchar_ts.

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #15

P: n/a

"Ioannis Vranos" <iv*@guesswh.at.grad.com> wrote in message
news:1098928197.123403@athnrd02...
Julie wrote:
it looks like he is asking about the usual stuff.

Right -- and I gave the usual answer.

I don't think so. In simple words, he is talking about chars and you about
wchar_ts.

--
Ioannis Vranos

http://www23.brinkster.com/noicys


Is the header <string> or the class <string>?

Catalin
Jul 22 '05 #16

P: n/a
In message <1098928197.123403@athnrd02>, Ioannis Vranos
<iv*@guesswh.at.grad.com> writes
Julie wrote:
it looks like he is asking about the usual stuff.

Right -- and I gave the usual answer.

I don't think so. In simple words, he is talking about chars and you
about wchar_ts.


German ß is part of ISO8859-1, which is commonly stored in char, not
wchar_t.

--
Richard Herring
Jul 22 '05 #17

P: n/a
Richard Herring wrote:
I don't think so. In simple words, he is talking about chars and you
about wchar_ts.

German ß is part of ISO8859-1, which is commonly stored in char, not
wchar_t.

Nope.
TC++PL says it well:

"A char variable is of the natural size to hold a character on a given
machine (typically a byte)".

"A type wchar_t is provided to hold characters of a larger character set
such as Unicode. It is a distinct type. The size of wchar_t is
implementation-defined and large enough to hold the largest character
set supported by the implementation's locale (see 21.7, C.3.3)."
To give an example, in Windows GUI applications, char is guaranteed to
work only for English characters, for any other language you should use
wchar_t.

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #18

P: n/a
Ioannis Vranos wrote:
Richard Herring wrote:
I don't think so. In simple words, he is talking about chars and you
about wchar_ts.

German ß is part of ISO8859-1, which is commonly stored in char, not
wchar_t.

Nope.


Yup.
TC++PL says it well:

"A char variable is of the natural size to hold a character on a given
machine (typically a byte)".
Right. Nothing here forbids non-ASCII characters.
"A type wchar_t is provided to hold characters of a larger character set
such as Unicode. It is a distinct type. The size of wchar_t is
implementation-defined and large enough to hold the largest character
set supported by the implementation’s locale (see 21.7, C.3.3)."
And what does that have to do with ISO-8859-1? It's neither a Unicode
character set nor a multibyte character set. It's an 8-bit character set,
so each character of it will always fit into a byte. So char is perfect for
holding it.
To give an example, in Windows GUI applications, char is guaranteed to
work only for English characters, for any other language you should use
wchar_t.


Is that so?

Jul 22 '05 #19

P: n/a
Rolf Magnus wrote:
TC++PL says it well:

"A char variable is of the natural size to hold a character on a given
machine (typically a byte)".

Right. Nothing here forbids non-ASCII characters.

This discussion can't reach a reasonable conclusion. Just give more
thought on the subject.


--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #20

P: n/a
In message <1098969343.609953@athnrd02>, Ioannis Vranos
<iv*@guesswh.at.grad.com> writes
Richard Herring wrote:
I don't think so. In simple words, he is talking about chars and you
about wchar_ts.

German ß is part of ISO8859-1, which is commonly stored in char,
not wchar_t.


Nope.


Yes, actually. German sz ligature is code point 223 in ISO8859-1,
better known as the Latin-1 character set, and was for example the
standard character set and encoding for HTML up to 3.2.
TC++PL says it well:

"A char variable is of the natural size to hold a character on a given
machine (typically a byte)".
And how many C++ implementations do you know of where char is less than
8 bits? ISO8859-1 has only 256 code points and can happily be
accommodated in 8-bit chars.

ISO/IEC 14882 says it better:
"Objects declared as characters (char) shall be large enough to store
any member of the implementation's basic character set."

.... and it's quite possible that that basic character set *is*
ISO8859-1.
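
The size argument can be checked at compile time. A small sketch, assuming a C++11 compiler for static_assert; fits_in_char is a hypothetical name used only for illustration:

```cpp
#include <climits>

// The C and C++ standards require CHAR_BIT to be at least 8, so this
// assertion is expected to hold on every conforming implementation.
static_assert(CHAR_BIT >= 8, "char has at least 8 bits");

// Hypothetical helper: true if a code point survives a round trip
// through unsigned char, i.e. it can be stored in a single char object.
bool fits_in_char(int code_point)
{
    const unsigned char c = static_cast<unsigned char>(code_point);
    return static_cast<int>(c) == code_point;
}
```

With at least 8 bits per char, fits_in_char(223) holds, so the sz ligature and every other ISO8859-1 code point up to 255 can be stored in a char-sized object.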
"A type wchar_t is provided to hold characters of a larger character
set such as Unicode.
Yes. So what? ISO8859-1 is not Unicode, and wchar_t is not necessary to
hold it.
It is a distinct type. The size of wchar_t is implementation-defined
and large enough to hold the largest character set supported by the
implementation’s locale (see 21.7, C.3.3)."
To give an example, in Windows GUI applications, char is guaranteed to
work only for English characters, for any other language you should use
wchar_t.


Even if that were true (it isn't - there's an API for messing with
"code pages"), what does the Windows GUI have to do with standard C++?

--
Richard Herring
Jul 22 '05 #21

P: n/a
In message <1098972077.25470@athnrd02>, Ioannis Vranos
<iv*@guesswh.at.grad.com> writes
Rolf Magnus wrote:
TC++PL says it well:

"A char variable is of the natural size to hold a character on a given
machine (typically a byte)".

Right. Nothing here forbids non-ASCII characters.

This discussion can't reach a reasonable conclusion. Just give more
thought on the subject.


There is a perfectly reasonable conclusion, which had already been
stated before you tried to contradict it. That is, that there is an
insoluble problem with applying character-by-character toupper() to some
alphabets, which is present whether you use char or wchar_t.

--
Richard Herring
Jul 22 '05 #22

P: n/a
Richard Herring wrote:
There is a perfectly reasonable conclusion, which had already been
stated before you tried to contradict it. That is, that there is an
insoluble problem with applying character-by-character toupper() to some
alphabets, which is present whether you use char or wchar_t.


OK, I can accept this. In any case toupper(), tolower() of <cctype>, and
towupper(), towlower() of <cwctype>, are all guaranteed to work for
languages with a one-to-one correspondence between lower-case and upper-case
characters that fit in the supported character set, and it is up to the
programmer to make this decision.
Even if Greek is provided in the extended ASCII character set, and char
is unsigned on my platform, when I use Greek or some
language other than English, I use wchar_t, which is Unicode on my
system and fits it 100% (wchar_t is large enough to hold the largest
character set supported by the implementation's locales).
So for languages with a one-to-one correspondence between lower-case and
upper-case characters, these facilities are guaranteed to work.
In any case, I am 100% certain the OP was talking about English anyway.

--
Ioannis Vranos

http://www23.brinkster.com/noicys
Jul 22 '05 #23
