char * signedness

Hi group,
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?

For instance, I have to compare two unsigned char * strings.
Can I safely use strcmp? Do I need to cast the two strings to char *?

Thank you

--
Pietro Cerutti

PGP Public Key:
http://gahr.ch/pgp
Jul 5 '07 #1
Pietro Cerutti wrote:
Hi group,
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?
For the standard library functions, yes, because while they take char *
arguments, they convert it to unsigned char * internally anyway.
For instance, I have to compare two unsigned char * strings.
Can I safely use strcmp?
Yes.
Do I need to cast the two strings to char *?
You need to convert them to char *. You do not necessarily need a cast for
that; you could use an implicit conversion from unsigned char * to void *,
and then another implicit conversion from void * to char *. In this case, a
cast would be a good idea though.
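
A minimal sketch of the explicit-cast approach, for illustration only. The helper name ustrcmp is hypothetical (it is not a standard function and does not appear in this thread); it simply wraps strcmp so that the pointer conversion is written out once.

```c
#include <string.h>

/* Hypothetical helper (not part of the standard library): compare two
 * unsigned char strings with strcmp. The casts spell out the pointer
 * conversion, because unsigned char * and char * are not compatible
 * pointer types. strcmp itself orders the bytes as unsigned char. */
int ustrcmp(const unsigned char *a, const unsigned char *b)
{
    return strcmp((const char *)a, (const char *)b);
}
```

With this wrapper, a call such as ustrcmp(c, d) on two unsigned char arrays needs no cast at the call site.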
Jul 5 '07 #2
Harald van Dijk wrote:
Pietro Cerutti wrote:
Hi group,
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?
For the standard library functions, yes, because while they take char *
arguments, they convert it to unsigned char * internally anyway.
This is not true for the implementation of strncmp on my system, which is:

/*** BEGIN STRNCMP ON FREEBSD ***/
int
strncmp(s1, s2, n)
const char *s1, *s2;
size_t n;
{

if (n == 0)
return (0);
do {
if (*s1 != *s2++)
return (*(const unsigned char *)s1 -
*(const unsigned char *)(s2 - 1));
if (*s1++ == 0)
break;
} while (--n != 0);
return (0);
}
/*** END STRNCMP ON FREEBSD ***/

I think I'm missing something about chars and/or implicit conversions.

Could you please explain the output of the following program to me?
The two chars c[0] and d[0] have different values (220 and -36), are not
equal (the comparison operator returns 0) but the two strings c and d
are equal to strncmp (which returns 0) and represent the same string to
printf ("ü").

/*** BEGIN DUMMY TEST PROGRAM ***/
#include <stdio.h>
#include <string.h>

int main(void)
{
unsigned char c[2];
char d[2];

c[0] = 220; c[1] = '\0';
d[0] = c[0]; d[1] = '\0';

printf("c is %s\n", c);
printf("d is %s\n", d);
printf("c[0] is %02x\n", c[0]);
printf("d[0] is %02x\n", d[0]);
printf("c[0] == d[0] is %d\n", (c[0] == d[0]));
printf("strncmp(c, d, 1) is %d\n", strncmp(c, d, 1));

return(0);
}
/*** END DUMMY TEST PROGRAM ***/
Thank you!

--
Pietro Cerutti

PGP Public Key:
http://gahr.ch/pgp
Jul 6 '07 #3
Pietro Cerutti <g...@gahr.ch> wrote:
Harald van Dijk wrote:
Pietro Cerutti wrote:
is it always safe to pass unsigned char * variables as
parameters to functions accepting char * arguments?
For the standard library functions, yes, because while
they take char * arguments, they convert it to unsigned
char * internally anyway.

This is not true for the implementation of strncmp on
my system, which is:
Yes it is, under the 'as if' rule.
/*** BEGIN STRNCMP ON FREEBSD ***/
int
strncmp(s1, s2, n)
const char *s1, *s2;
size_t n;
{

if (n == 0)
return (0);
do {
if (*s1 != *s2++)
On systems where plain char is signed but unpadded,
this will find differences irrespective of whether
the bytes are treated as signed or unsigned char.
return (*(const unsigned char *)s1 -
*(const unsigned char *)(s2 - 1));
Here the unsigned char rule is applied explicitly as
required by the language specification. Note that
on your system, unsigned char promotes to int which
allows for negative results.
if (*s1++ == 0)
break;
} while (--n != 0);
return (0);
}

/*** END STRNCMP ON FREEBSD ***/

I think I'm missing something about chars and/or implicit
conversions.
The problem is that plain char can be signed or unsigned.
Character codings are all non-negative, but char is only
required to be able to store positive values for characters
in the basic execution character set. So characters in the
extended character set may be negative.
Could you please explain the output of the following
program to me? The two chars c[0] and d[0] have different
values (220 and -36), are not equal (the comparison
operator returns 0) but the two strings c and d are equal
to strncmp (which returns 0) and represent the same string
to printf ("ü").

/*** BEGIN DUMMY TEST PROGRAM ***/
#include <stdio.h>
#include <string.h>

int main(void)
{
unsigned char c[2];
char d[2];

c[0] = 220; c[1] = '\0';
d[0] = c[0]; d[1] = '\0';
If plain char is signed (and 8-bits) on your system, this
will put an implementation defined value into d[0]. Most
likely is 220 - 256 == -36. The representation of -36 in
two's complement is the same as the representation of 220
in pure binary notation of an unsigned char.
printf("c is %s\n", c);
printf("d is %s\n", d);
For the reason above, this should print the same thing.
[Note that assuming character codings will make your code
non-portable.]
printf("c[0] is %02x\n", c[0]);
printf("d[0] is %02x\n", d[0]);
printf("c[0] == d[0] is %d\n", (c[0] == d[0]));
Both char and unsigned char values will promote to int
which is capable of supporting the full range of both
character types. Hence, -36 is not the same value as 220.
printf("strncmp(c, d, 1) is %d\n", strncmp(c, d, 1));
Here you are using a function which _must_ compare the
unsigned char values of the character representation.
Not surprisingly, 220 is the same as 220.
return(0);
}

/*** END DUMMY TEST PROGRAM ***/
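
The three observations above can be separated into small functions. This is only a sketch: the function names are made up for illustration, and the behaviour of the char conversion when plain char is signed is implementation-defined, as noted above.

```c
#include <limits.h>
#include <string.h>

/* 1 if plain char is a signed type on this implementation. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Compare by value. Both operands are promoted to int first, so with a
 * signed 8-bit char, 220 and -36 are different values. */
int same_value(unsigned char u, char c)
{
    return u == c;
}

/* Compare the stored bytes, ignoring the type used to read them. */
int same_representation(unsigned char u, char c)
{
    return memcmp(&u, &c, 1) == 0;
}

/* Compare the way the string functions are required to: read the char
 * as an unsigned char value before comparing. */
int same_as_unsigned(unsigned char u, char c)
{
    return u == (unsigned char)c;
}
```

On a typical two's complement system with signed 8-bit char, storing 220 into a char and comparing it with the original gives 0 from same_value but 1 from the other two comparisons, which matches the output of the test program.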
--
Peter

Jul 6 '07 #4
Peter Nilsson wrote:
Pietro Cerutti <g...@gahr.ch> wrote:
Harald van Dijk wrote:
Pietro Cerutti wrote:
is it always safe to pass unsigned char * variables as
parameters to functions accepting char * arguments?
For the standard library functions, yes, because while
they take char * arguments, they convert it to unsigned
char * internally anyway.
This is not true for the implementation of strncmp on
my system, which is:
Yes it is, under the 'as if' rule.
/*** BEGIN STRNCMP ON FREEBSD ***/
int
strncmp(s1, s2, n)
const char *s1, *s2;
size_t n;
{

if (n == 0)
return (0);
do {
if (*s1 != *s2++)
On systems where plain char is signed but unpadded,
this will find differences irrespective of whether
the bytes are treated as signed or unsigned char.
return (*(const unsigned char *)s1 -
*(const unsigned char *)(s2 - 1));
Here the unsigned char rule is applied explicitly as
required by the language specification. Note that
on your system, unsigned char promotes to int which
allows for negative results.
if (*s1++ == 0)
break;
} while (--n != 0);
return (0);
}

/*** END STRNCMP ON FREEBSD ***/

I think I'm missing something about chars and/or implicit
conversions.
The problem is that plain char can be signed or unsigned.
Character codings are all non-negative, but char is only
required to be able to store positive values for characters
in the basic execution character set. So characters in the
extended character set may be negative.
Could you please explain the output of the following
program to me? The two chars c[0] and d[0] have different
values (220 and -36), are not equal (the comparison
operator returns 0) but the two strings c and d are equal
to strncmp (which returns 0) and represent the same string
to printf ("ü").

/*** BEGIN DUMMY TEST PROGRAM ***/
#include <stdio.h>
#include <string.h>

int main(void)
{
unsigned char c[2];
char d[2];

c[0] = 220; c[1] = '\0';
d[0] = c[0]; d[1] = '\0';
If plain char is signed (and 8-bits) on your system, this
will put an implementation defined value into d[0]. Most
likely is 220 - 256 == -36. The representation of -36 in
two's complement is the same as the representation of 220
in pure binary notation of an unsigned char.
printf("c is %s\n", c);
printf("d is %s\n", d);
For the reason above, this should print the same thing.
[Note that assuming character codings will make your code
non-portable.]
printf("c[0] is %02x\n", c[0]);
printf("d[0] is %02x\n", d[0]);
printf("c[0] == d[0] is %d\n", (c[0] == d[0]));
Both char and unsigned char values will promote to int
which is capable of supporting the full range of both
character types. Hence, -36 is not the same value as 220.
printf("strncmp(c, d, 1) is %d\n", strncmp(c, d, 1));
Here you are using a function which _must_ compare the
unsigned char values of the character representation.
Not surprisingly, 220 is the same as 220.
return(0);
}

/*** END DUMMY TEST PROGRAM ***/
Thank you for the exhaustive explanation!

Regards,
--
Peter

--
Pietro Cerutti

PGP Public Key:
http://gahr.ch/pgp
Jul 6 '07 #5

Harald van Dijk wrote:
Pietro Cerutti wrote:
Hi group,
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?

For the standard library functions, yes, because while they take char *
arguments, they convert it to unsigned char * internally anyway.
I don't agree with Harald van Dijk. A long time back I had a similar
sort of query. Please refer to the link below and follow the thread, as
it will help you gain insight into the behaviour of unsigned and signed
values.

http://groups.google.co.in/group/alt.comp.lang.learn.c-c++/browse_thread/thread/6b06d071ddda12bc/b6aba0a74dff26a0?lnk=st&q=&rnum=9&hl=en#b6aba0a74dff26a0

Look for the explanation given by BARAT and KARL

HTH
~Ranjeet Gupta

For instance, I have to compare two unsigned char * strings.
Can I safely use strcmp?

Yes.
Do I need to cast the two strings to char *?

You need to convert them to char *. You do not necessarily need a cast for
that; you could use an implicit conversion from unsigned char * to void *,
and then another implicit conversion from void * to char *. In this case, a
cast would be a good idea though.
Jul 6 '07 #6
Pietro Cerutti wrote:
Hi group,
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?

For instance, I have to compare two unsigned char * strings.
Can I safely use strcmp? Do I need to cast the two strings to char *?

Thank you
Given C89 and prototypes:

int strcmp(const char *_s1, const char *_s2);

Your unsigned char * arguments will be coerced automatically to the type
required.

--
Joe Wright
"Everything should be made as simple as possible, but not simpler."
--- Albert Einstein ---
Jul 6 '07 #7
ra***********@gmail.com writes:
Harald van Dijk wrote:
Pietro Cerutti wrote:
Hi group,
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?

For the standard library functions, yes, because while they take char *
arguments, they convert it to unsigned char * internally anyway.
I don't agree with Harald van Dijk. A long time back I had a similar
sort of query. Please refer to the link below and follow the thread, as
it will help you gain insight into the behaviour of unsigned and signed
values.

http://groups.google.co.in/group/alt.comp.lang.learn.c-c++/browse_thread/thread/6b06d071ddda12bc/b6aba0a74dff26a0?lnk=st&q=&rnum=9&hl=en#b6aba0a74dff26a0
I see nothing there that has a bearing on this thread. You were
asking about signed representations and got the usual mix of correct
and incorrect replies.
Look for the explanation given by BARAT and KARL
I could not find anything by BARAT, but Karl misled you (at least as
far as C is concerned) by suggesting that a left shift of a signed
integer with negative value was well-defined.

--
Ben.
Jul 6 '07 #8
Harald van Dijk <tr*****@gmail.com> wrote:
Pietro Cerutti wrote:
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?
For the standard library functions, yes, because while they take char *
arguments, they convert it to unsigned char * internally anyway.
If by "the standard library functions", you mean strcmp() and
strncmp(), then yes, by 7.21.4. If you intended that statement to
include the rest of the str*() functions, then I would like to see
C&V, as I was not able to locate any text that suggests that any of
the other str*() functions interpret their arguments as unsigned char
*.

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
Jul 6 '07 #9
Christopher Benson-Manica wrote:
Harald van Dijk <tr*****@gmail.com> wrote:
Pietro Cerutti wrote:
is it always safe to pass unsigned char * variables as parameters to
functions accepting char * arguments?
For the standard library functions, yes, because while they take char *
arguments, they convert it to unsigned char * internally anyway.

If by "the standard library functions", you mean strcmp() and
strncmp(), then yes, by 7.21.4. If you intended that statement to
include the rest of the str*() functions, then I would like to see
C&V, as I was not able to locate any text that suggests that any of
the other str*() functions interpret their arguments as unsigned char
*.
7.21.1p3 (from n1124; it might have been added even after C99):
"For all functions in this subclause, each character shall be interpreted as
if it had the type unsigned char (and therefore every possible object
representation is valid and has a different value)."
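
To make the "as if it had the type unsigned char" wording concrete, here is a rough sketch of a strcmp-like function written that way. The name cmp_as_unsigned is hypothetical, and this is not how any particular library implements strcmp; it only shows the interpretation the paragraph requires.

```c
/* Hypothetical strcmp-like function: every character is read through a
 * pointer to unsigned char, so each byte has a non-negative value and
 * the ordering does not depend on whether plain char is signed. */
int cmp_as_unsigned(const char *s1, const char *s2)
{
    const unsigned char *p = (const unsigned char *)s1;
    const unsigned char *q = (const unsigned char *)s2;

    /* Advance while the strings agree and have not ended. */
    while (*p != '\0' && *p == *q) {
        p++;
        q++;
    }

    /* Yields -1, 0 or 1 without relying on subtraction fitting in int. */
    return (*p > *q) - (*p < *q);
}
```

Under this interpretation a byte such as 0xDC always sorts after 'a', even on systems where the same byte read as plain char is negative.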
Jul 6 '07 #10
Harald van Dijk <tr*****@gmail.com> wrote:
7.21.1p3 (from n1124; it might have been added even after C99):
"For all functions in this subclause, each character shall be interpreted as
if it had the type unsigned char (and therefore every possible object
representation is valid and has a different value)."
Thanks. That text is indeed not present in n869, and it's nice to see
that the issue was (eventually) addressed. As long as OP isn't
running on a C89 DS9K implementation, all would seem likely to be well.

--
C. Benson Manica | I *should* know what I'm talking about - if I
cbmanica(at)gmail.com | don't, I need to know. Flames welcome.
Jul 6 '07 #11
On Jul 6, 9:33 am, CBFalconer <cbfalco...@yahoo.com> wrote:
Pietro Cerutti wrote:

... snip ...
/*** BEGIN STRNCMP ON FREEBSD ***/
int
strncmp(s1, s2, n)
const char *s1, *s2;
size_t n;
{

... snip ...
Could you please explain the output of the following program to
me? The two chars c[0] and d[0] have different values (220 and
-36), are not equal (the comparison operator returns 0) but the
two strings c and d are equal to strncmp (which returns 0) and
represent the same string to printf ("ü").

Of course not. Your test program classifies one as unsigned char,
and the other as signed char. The same bit pattern represents both
(at least in 2's complement). The freebsd implementation does not
have a proper prototype (uses old fashioned K&R I header),
so all arguments are passed in as received, and then treated as "const
char *". This makes them equal.
I thought that passing unsigned char * when char * is expected is UB
when we have an old K&R-style function declaration.
There is a possibility of trap representations for char when it is
signed, and a pointer to a byte holding such a representation might be
passed as an unsigned char * pointer. This produces UB when it is
dereferenced through a char * pointer.

IMHO, as per the standard, the function call passing unsigned char *
to this implementation of strncmp() is UB.

If it isn't UB, please cite the relevant words of the standard
that make the behavior well-defined.

Jul 6 '07 #12
Harald van Dijk <tr*****@gmail.com> writes:
[...]
7.21.1p3 (from n1124; it might have been added even after C99):
"For all functions in this subclause, each character shall be interpreted as
if it had the type unsigned char (and therefore every possible object
representation is valid and has a different value)."
Yes, that paragraph is new in n1124. It was added by TC 2, in response
to DR 274, <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_274.htm>.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
"We must do something. This is something. Therefore, we must do this."
-- Antony Jay and Jonathan Lynn, "Yes Minister"
Jul 6 '07 #13
Keith Thompson wrote:
Harald van Dijk <tr*****@gmail.com> writes:
[...]
7.21.1p3 (from n1124; it might have been added even after C99):
"For all functions in this subclause,
each character shall be interpreted as
if it had the type unsigned char
(and therefore every possible object
representation is valid and has a different value)."

Yes, that paragraph is new in n1124.
It was added by TC 2, in response
to DR 274, <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_274.htm>.
Then I would suggest changing the description of strchr
so that the value of the c parameter is converted to
(unsigned char) instead of (char).
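
A small sketch of where that difference shows up in practice. Both helper names are hypothetical. strchr is described as converting its int argument to char, so for a value such as 220 on a signed-char system the conversion is implementation-defined (common compilers wrap it to the matching bit pattern); memchr is described as converting to unsigned char.

```c
#include <string.h>

/* Hypothetical helper: search for a byte given as an unsigned char
 * value using strchr. For values above CHAR_MAX this relies on the
 * implementation-defined int-to-char conversion inside strchr. */
const char *find_with_strchr(const char *s, unsigned char byte)
{
    return strchr(s, byte);
}

/* Hypothetical helper: the same search via memchr, whose argument is
 * converted to unsigned char. Unlike strchr, this version does not
 * look at the terminating null character. */
const char *find_with_memchr(const char *s, unsigned char byte)
{
    return memchr(s, byte, strlen(s));
}
```

On common implementations both helpers return the same pointer for a byte such as 220; only the memchr version avoids the signed conversion in the function's description.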

Another problem concerning a situation where the standard
can't possibly mean what it says,
is that the rules concerning rank,
prevent char from being signed.

N1124.pdf

6.3.1 Arithmetic operands
6.3.1.1 Boolean, characters, and integers
1 Every integer type has an integer conversion rank
defined as follows:
— No two signed integer types shall have the same rank,
even if they have the same representation.

— The rank of char shall equal the rank of signed char
and unsigned char.
--
pete
Aug 1 '07 #14
pete wrote:
Keith Thompson wrote:
Harald van Dijk <tr*****@gmail.com> writes:
[...]
7.21.1p3 (from n1124; it might have been added even after C99):
"For all functions in this subclause,
each character shall be interpreted as
if it had the type unsigned char
(and therefore every possible object
representation is valid and has a different value)."

Yes, that paragraph is new in n1124.
It was added by TC 2, in response
to DR 274, <http://www.open-std.org/jtc1/sc22/wg14/www/docs/dr_274.htm>.

Then I would suggest changing the description of strchr
so that the value of the c parameter, is converted to
(unsigned char) instead of (char).
That's an interesting find. You may be right that there's a problem here.
Another problem concerning a situation where the standard
can't possibly mean what it says,
is that the rules concerning rank,
prevent char from being signed.

N1124.pdf

6.3.1 Arithmetic operands
6.3.1.1 Boolean, characters, and integers
1 Every integer type has an integer conversion rank
defined as follows:
— No two signed integer types shall have the same rank,
even if they have the same representation.

— The rank of char shall equal the rank of signed char
and unsigned char.
Plain char may be signed, and an integer type, but it is never a signed
integer type, because signed integer type has a specific definition which
doesn't include plain char, regardless of its signedness. See 6.2.5p4.
Aug 1 '07 #15
Harald van Dijk wrote:
Plain char may be signed, and an integer type, but it is never a signed
integer type, because signed integer type has a specific definition which
doesn't include plain char, regardless of its signedness. See 6.2.5p4.
Sorry, it appears that it isn't an integer type, for the same reason that it
isn't a signed integer type: integer type also has a specific definition
that doesn't include plain char.
Aug 1 '07 #16
Harald van Dijk wrote:
Harald van Dijk wrote:
Plain char may be signed, and an integer type,
but it is never a signed integer type,
because signed integer type has a specific definition which
doesn't include plain char, regardless of its signedness.
See 6.2.5p4.

Sorry, it appears that it isn't an integer type,
for the same reason that it
isn't a signed integer type:
integer type also has a specific definition
that doesn't include plain char.
Thank you.
I see now that char is one of the "basic types"
and distinct from the signed and unsigned integer types.
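
That char, signed char and unsigned char are three distinct types can be checked with a compiler that supports C11 _Generic, which postdates this thread. The enum and macro names below are invented for the illustration; a generic selection may not list two compatible types, so the macro would not compile if char were the same type as either of the other two.

```c
/* Assumes a C11 (or later) compiler for _Generic. */
enum char_kind {
    KIND_CHAR,
    KIND_SIGNED_CHAR,
    KIND_UNSIGNED_CHAR,
    KIND_OTHER
};

/* Hypothetical macro: classify the type of an expression. Listing all
 * three character types as separate associations is only accepted
 * because they are distinct, mutually incompatible types. */
#define CHAR_KIND(x) _Generic((x),            \
    char:          KIND_CHAR,                 \
    signed char:   KIND_SIGNED_CHAR,          \
    unsigned char: KIND_UNSIGNED_CHAR,        \
    default:       KIND_OTHER)
```

A plain char variable selects KIND_CHAR whether or not char is signed on the system, while a character constant such as 'x' has type int in C and falls through to KIND_OTHER.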

--
pete
Aug 1 '07 #17
