C variable retyping

Groups User

C allows type casting in which a variable is converted from one type
to another.

Does C (whatever standard) allow the type of a variable to change,
within a statement, avoiding the conversion? And if so, provide an
example of how it's done.

Example:
int a=23;
int b=34;
char c='a';
int * p_int=(int *) &c;

// type casting
// c is converted into type integer, then b is added.
(int) c + b;

// retyping
(type int) c + b;
// similar action
*p_int + b

Ignoring alignment issues, storage size issues, etc., c is retyped to
an integer, no conversion is done, and c is used, within the
statement, as an integer, not a char.

Thanks

Nov 14 '05 #1


Eric Sosman

Groups User wrote:

C allows type casting in which a variable is converted from one type
to another.
Not quite: A cast converts a *value* from one type to
another. The distinction may seem petty, but it's central
to your misunderstanding of casts (a misunderstanding many
others share; don't feel bad).
Does C (whatever standard) allow the type of a variable to change,
within a statement, avoiding the conversion? And if so, provide an
example of how its done.

Example:
int a=23;
int b=34;
char c='a';
int * p_int=(int *) &c;
You are aware, I hope, that `p_int' has now been given
a value that may not be usable for any purpose at all.
Even if you convert that value back to a `char*', there's
no guarantee that the re-converted value will point to `c'
any more -- on many machines it will, but on some I've
heard of it could fail 75% of the time.
// type casting
// c is converted into type integer, then b is added.
(int) c + b;
`c' is not changed in any way. The value stored in
`c' is retrieved, that value is converted to `int', and
then `b' is added (and then the result is thrown away).
`c' itself participated only by providing the value that
started the whole chain.
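A small sketch of that point, assuming a hypothetical helper named add_char_to_int (it is not from the original post): the cast works on the value fetched from `c', and `c' itself is the same char afterwards.

```c
#include <assert.h>

/* Hypothetical helper, not from the thread: add b to the value held in c. */
int add_char_to_int(char c, int b)
{
    char before = c;

    /* The cast converts the value read from c; c itself stays a char. */
    int sum = (int)c + b;

    assert(sizeof c == sizeof(char));   /* the type of c has not changed */
    assert(c == before);                /* nor has its stored value */
    return sum;
}
```

Calling add_char_to_int('a', 34) yields the same result as writing 'a' + 34 directly.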
//retypeing
(type int) c + b;
// similar action
*p_int + b

Ignoring alignment issues, storage size issues, etc., c is retyped to
an integer, no conversion is done, and c is used, within the
statement, as an integer, not a char.

No; there's no way to do this in C. Even if there
were some way to change the type of an object (I've argued
in other threads that objects have no types and Obviously
I Am Right, but some people disagree), there would be no
way to change the types of the expressions that manipulate
it, that fetch and store its values. The `+' in your
example is an "int,int plus" as opposed to a "long,long
plus" or a "float,float plus", and nothing will change its
nature. It will continue to grab two `int' operands and
deliver an `int' sum, and nothing in C can persuade it to
behave differently.
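One way to see that the `+' is chosen by the operand types is to ask for the size of its result. The sketch below uses a made-up function name, size_of_char_sum; even with two `char' operands, the usual promotions apply before the addition, so the result has the size of an `int'.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: report the size of the result of char + char. */
size_t size_of_char_sum(void)
{
    char c = 'a';
    char d = 'b';

    assert(sizeof c == 1);      /* each operand is a single char */

    /* sizeof does not evaluate c + d; it only looks at the result type,
       which is int (or unsigned int) after the integer promotions. */
    return sizeof(c + d);
}
```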

--
Er*********@sun.com

Nov 14 '05 #2

Mark A. Odell

go**************@yahoo.com (Groups User) wrote in
news:14**************************@posting.google.com:

C allows type casting in which a variable is converted from one type
to another.

Does C (whatever standard) allow the type of a variable to change,
within a statement, avoiding the conversion? And if so, provide an
example of how its done.
I don't understand this question and I don't think there is such a thing
as 'retyping'.
Example:
int a=23;
int b=34;
char c='a';
int * p_int=(int *) &c;
I'm not sure this is portable or safe. Chars don't usually have any
alignment constraints whereas int surely may.
// type casting
// c is converted into type integer, then b is added.
(int) c + b;
Note 'b' is promoted to int then added to 'c'.
//retypeing
(type int) c + b;

I've never seen this (type int) thing, are you sure you're using ISO C?

--
- Mark ->
--

Nov 14 '05 #3

Martin Ambuhl

Groups User wrote:

C allows type casting in which a variable is converted from one type
to another.
No, casting changes the value, not the variable.
Does C (whatever standard) allow the type of a variable to change,
within a statement, avoiding the conversion?
Variables never change their types.
And if so, provide an
example of how its done.

Example:
int a=23;
int b=34;
char c='a';
int * p_int=(int *) &c;
This is nuts. Suppose sizeof(int) > 1. &c points to one byte, and no
one knows what lives in &c+1. Casting &c to (int *) creates a pointer
value pointing to an area including all those unknown bytes after c.
// type casting
// c is converted into type integer, then b is added.
(int) c + b;
No matter what you think this means, it is at best a noop.
//retypeing
(type int) c + b;
If this weren't a syntax error, it would be a noop.
// similar action
*p_int + b
Similar noop, but not terminated by a semicolon, so who knows?
Ignoring alignment issues, storage size issues, etc., c is retyped to
an integer, no conversion is done, and c is used, within the
statement, as an integer, not a char.

If you want to make sure that there are not multiple conversions of the
value of a variable to another type, just use a temporary variable:
char c='a';
{
int ic = c;
/* use ic */
}

Nov 14 '05 #4

Chris Torek

In article <14**************************@posting.google.com>
Groups User <go**************@yahoo.com> writes:

C allows type casting in which a variable is converted from one type
to another.
This statement is not correct: C's casts convert a *value* from
one type to another, as if by assignment to a temporary variable
whose type is given by the cast.

For instance:

(double)3

"means" the same thing as:

double tmp;

tmp = 3;
tmp

except that the result of the cast is not an object (the variable
"tmp" above is clearly an object -- if you declare tmp, you can
also have "double *dp = &tmp;" and so on).
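As a rough, compilable version of the "tmp" picture, here is a sketch with a hypothetical function int_to_double. It only illustrates the equivalence described above; it is not how a compiler is required to implement a cast.

```c
#include <assert.h>

/* Hypothetical helper: convert n the "long way", through a named object. */
double int_to_double(int n)
{
    double tmp;     /* tmp is an object, so &tmp would be valid */

    tmp = n;        /* the same value conversion that (double)n performs */
    assert(tmp == (double)n);

    /* By contrast, &(double)n does not compile: the result of a cast
       is a value, not an object with an address. */
    return tmp;
}
```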
Does C (whatever standard) allow the type of a variable to change,
within a statement, avoiding the conversion?
No. (Although first we might have to pin down the meaning of
"variable", since the C Standards do not define it, and it turns
out different people have different ideas about this. :-) )

C does, however, allow a subterfuge that, I think, expresses
what you mean to do here:
Example:
int a=23;
int b=34;
char c='a';
int * p_int=(int *) &c;

// type casting
// c is converted into type integer, then b is added.
(int) c + b;

//retypeing
(type int) c + b;
// similar action
*p_int + b

Ignoring alignment issues, storage size issues, etc., c is retyped to
an integer, no conversion is done, and c is used, within the
statement, as an integer, not a char.

I assume you know that, on typical implementations -- at least,
those where *p_int works at all despite those things like alignment
issues -- *p_int accesses some byte(s) that are not at all part of
the object named "c", and hence produces a bizarre value largely
unrelated to the machine's character-code for the letter 'a' (0x61
or 97 if ASCII). On a 16-bit-int machine, for instance, the value
at *p_int might be 0x4061 or 0x6140 if the adjacent byte happens
to be 0x40. Thus, this is not all that useful to begin with.
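The 0x4061 versus 0x6140 outcome can be reproduced without touching memory that does not belong to us. This sketch assumes a hypothetical helper, combine_adjacent_bytes, and an implementation that provides uint16_t; it copies two known bytes into a 16-bit integer with memcpy instead of reading through a misdirected pointer.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: view two adjacent bytes as one 16-bit value. */
uint16_t combine_adjacent_bytes(unsigned char first, unsigned char second)
{
    unsigned char bytes[2] = { first, second };
    uint16_t value;

    assert(sizeof bytes == sizeof value);   /* assumes 8-bit bytes */

    /* memcpy is a well-defined way to reinterpret the byte pattern. */
    memcpy(&value, bytes, sizeof value);
    return value;
}
```

With first = 0x61 and second = 0x40, the result depends on the byte order of the machine, which is exactly the uncertainty described above.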

Nonetheless, we can press on, and use a cast to do the same thing
that *p_int does without using the object named "p_int" here:

int b = 34;
char c = 'a';
...
use(*(int *)&c + b);

By taking the address of "c" (as before) -- a value of type "char
*" pointing to the object named "c" -- and converting that pointer
value to "int *" (as before), we get some implementation-defined
and probably useless pointer value of type "int *". The unary "*"
(indirection) operator can be applied immediately to this value,
following this probably-useless pointer and attempting to retrieve
an entire "int", in just the same way that "*p_int" does.

In other words, you may take a pointer value, then use a cast to
"reshape" the value -- perhaps altering it in some major way just
as int-to-double and double-to-int conversions do on today's machines
-- and then, if the new value is actually correct and useful,
indirect through it immediately. If that new value is correct and
useful -- in this case, almost certainly not -- the result is an
object whose type is determined by the cast you used a moment
earlier (minus the initial "pointer to" of course).

Where this is useful, at least in portable, standard-conforming C,
is where the C Standards guarantee that some "intermediate" pointer
form holds all the necessary information so that the result of the
cast is valid. This holds in C99 for all "struct" pointers, and
in both C89 and C99 for "void *" and "char *" pointers, provided
the "intermediate" pointer is first obtained by converting a pointer
that points to the "final" type. For instance:

double x;
char *cp;

cp = (char *)&x;
*(double *)cp = 3.1415926535897932384626433832795;

is strictly conforming C code, and is guaranteed to set "x" to this
approximation to pi (well, modulo any fuzziness in the implementation's
"double" anyway -- not many will do 30+ decimal digits :-) ). This
is a bizarre thing to do, but it *is* strictly conforming. In more
complicated code, something like this can actually be useful.
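A familiar case of the "more complicated code" is a qsort() comparison function: the library hands over "const void *" pointers that were obtained from pointers to the real element type, and the callback casts them back. The names compare_doubles and sort_doubles below are made up for this sketch.

```c
#include <assert.h>
#include <stdlib.h>

/* Comparison callback: a and b started life as pointers to double,
   so converting them back to that type is valid. */
static int compare_doubles(const void *a, const void *b)
{
    double x = *(const double *)a;
    double y = *(const double *)b;

    return (x > y) - (x < y);
}

/* Hypothetical wrapper: sort an array of doubles in ascending order. */
void sort_doubles(double *values, size_t count)
{
    assert(values != NULL || count == 0);
    qsort(values, count, sizeof values[0], compare_doubles);
}
```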

In C99, a "struct T *" can always retain all the important bits for
some other "struct U *":

struct U some_var;
struct T *tp;
...
tp = (struct T *)&some_var;
... /* code that does not change "tp" */ ...
(*(struct U *)tp).some_u_field = some_val; /* valid */
((struct U *)tp)->other_u_field = other_val; /* also valid */

Nobody seems to know of any C89 systems on which the above fails
either, which is probably why the C99 folks decided to allow it
in C99.
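For anyone who wants to compile the fragment above, here is one possible filling-in. The member lists of struct T and struct U and the function name fill_via_struct_t_pointer are assumptions made for the sketch; both structs contain only int members, so the converted pointer is suitably aligned for either type.

```c
#include <assert.h>

struct T { int unused; };                   /* hypothetical "intermediate" type */
struct U { int some_u_field; int other_u_field; };

/* Hypothetical helper: write to a struct U through a struct T pointer
   that is converted back to struct U * before each access. */
struct U fill_via_struct_t_pointer(int some_val, int other_val)
{
    struct U some_var = { 0, 0 };
    struct T *tp = (struct T *)&some_var;   /* tp is never dereferenced as a T */

    (*(struct U *)tp).some_u_field = some_val;
    ((struct U *)tp)->other_u_field = other_val;

    assert(some_var.some_u_field == some_val);
    return some_var;
}
```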

Note that the "cast back" may put some "important bits" back into
the pointer we use. In particular, in the "cp = (char *)&x;" case,
word-oriented machines may do shift-and-mask operations on the
underlying pointer values. The (double *)cp operation will "un-shift"
and "re-mask" the pointer, so that *(double *)cp gets at "x" instead
of some unrelated object.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #5

Groups User

> > C allows type casting in which a variable is converted from one type
> > to another.

As everyone pointed out, "converted" should be "promoted", and "during
evaluation" should have been tacked on the end. That would have helped
with the semantics of the question, but it is still vague.

I don't think there is such a thing
as 'retyping'.
That seems to be the consensus. I had certainly never heard of
it, but thought it might be an obscure, rarely used feature.
int * p_int=(int *) &c;

I'm not sure this is portable or safe. Chars don't usually have any
alignment constraints whereas int surely may.

Wasn't attempting to start a huge thread on pointers, just using it to
illustrate a point.
// type casting
// c is converted into type integer, then b is added.
(int) c + b;
Note 'b' is promoted to int then added to 'c'.

Exactly. The promotion happens according to the standard, but only the
value of c is used, and because c is of type char, that value comes
from 1 byte.

//retypeing
(type int) c + b;

I've never seen this (type int) thing, are you sure you're using ISO C?

I have not either. But the point is that IF c could be "retyped" to an
int, then sizeof(int) bytes would be used during evaluation. As other
posters pointed out, there would be alignment issues, unknown bytes at
&c+1 to &c+sizeof(int)-1, etc...

I'll take this as a resounding NO. No such thing as "retyping".
Thanks

Nov 14 '05 #6

Groups User

> I assume you know that, on typical implementations -- at least,
> those where *p_int works at all despite those things like alignment
> issues -- *p_int accesses some byte(s) that are not at all part of
> the object named "c",
The range of bytes would be (assuming sizeof(int)>1) &c+1 through
&c+(sizeof(int)-1).
and hence produces a bizarre value largely
unrelated to the machine's character-code for the letter 'a' (0x61
or 97 if ASCII). On a 16-bit-int machine, for instance, the value
at *p_int might be 0x4061 or 0x6140 if the adjacent byte happens
to be 0x40. Thus, this is not all that useful to begin with.
Those bytes, if they are even allocated and lie within a valid address
range, would depend on the machine's endianness and architecture.

Nonetheless, we can press on, and use a cast to do the same thing
that *p_int does without using the object named "p_int" here:

int b = 34;
char c = 'a';
...
use(*(int *)&c + b);

By taking the address of "c" (as before) -- a value of type "char
*" pointing to the object named "c" -- and converting that pointer
value to "int *" (as before), we get some implementation-defined
and probably useless pointer value of type "int *". The unary "*"
(indirection) operator can be applied immediately to this value,
following this probably-useless pointer and attempting to retrieve
an entire "int", in just the same way that "*p_int" does.

In other words, you may take a pointer value, then use a cast to
"reshape" the value -- perhaps altering it in some major way just
as int-to-double and double-to-int conversions do on today's machines
-- and then, if the new value is actually correct and useful,
indirect through it immediately. If that new value is correct and
useful -- in this case, almost certainly not -- the result is an
object whose type is determined by the cast you used a moment
earlier (minus the initial "pointer to" of course).

Where this is useful, at least in portable, standard-conforming C,
is where the C Standards guarantee that some "intermediate" pointer
form holds all the necessary information so that the result of the
cast is valid. This holds in C99 for all "struct" pointers, and
in both C89 and C99 for "void *" and "char *" pointers, provided
the "intermediate" pointer is first obtained by converting a pointer
that points to the "final" type. For instance:

double x;
char *cp;

cp = (char *)&x;
*(double *)cp = 3.1415926535897932384626433832795;

is strictly conforming C code, and is guaranteed to set "x" to this
approximation to pi (well, modulo any fuzziness in the implementation's
"double" anyway -- not many will do 30+ decimal digits :-) ). This
is a bizarre thing to do, but it *is* strictly conforming. In more
complicated code, something like this can actually be useful.

In C99, a "struct T *" can always retain all the important bits for
some other "struct U *":

struct U some_var;
struct T *tp;
...
tp = (struct T *)&some_var;
... /* code that does not change "tp" */ ...
(*(struct U *)tp).some_u_field = some_val; /* valid */
((struct U *)tp)->other_u_field = other_val; /* also valid */

Nobody seems to know of any C89 systems on which the above fails
either, which is probably why the C99 folks decided to allow it
in C99.

Note that the "cast back" may put some "important bits" back into
the pointer we use. In particular, in the "cp = (char *)&x;" case,
word-oriented machines may do shift-and-mask operations on the
underlying pointer values. The (double *)cp operation will "un-shift"
and "re-mask" the pointer, so that *(double *)cp gets at "x" instead
of some unrelated object.

Similar to casting, "retyping" would allow subterfuge and introduce
opportunity for non-portable code (as in this example). Probably only
people integrating non-portable assembly and C code would find it
useful (especially with MMX instructions). Still, thought I'd ask. And
thanks for the pointer lecture.

Nov 14 '05 #7

Chris Torek

Regarding *(int *)&c, where c is a "char", I wrote, in part:

I assume you know that, on typical implementations -- at least,
those where *p_int works at all despite those things like alignment
issues -- *p_int accesses some byte(s) that are not at all part of
the object named "c",

In article <14**************************@posting.google.com>,
Groups User <go**************@yahoo.com> wrote:
The range of bytes would be (assuming sizeof(int)>1) &c+1 thru
&c+(sizeof(int)-1).
Not necessarily! The ARM, for instance, ignores low-order address
bits based on the size of the access. Here, sizeof(int) is most
likely 4 (I believe there are sizeof(int)==2 ARM compilers, or
at least Thumb compilers -- the Thumb is a stripped-down ARM, as
it were, for "even more embedded" systems than those the ARM is
usually found in). Suppose sizeof(int) is 4 but "c" is at an
address that is congruent to 2 mod 4, so that in a hex-dump of
memory we might have:

(address) (data)
xxxxxx70: 00 11 22 33 44 55 66 77 ...
^^
||
variable named "c", set to 0x22

Here, the bytes making up *(int *)&c are those from xxxxxx70 through
xxxxxx73, not those from xxxxxx72 through xxxxxx75 -- in this case,
&c - 2 through &c + 1. Thus *(int *)&c is either 0x00112233 or
0x33221100, depending on endian-ness.
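The address arithmetic in that example can be written down as a tiny model. The function aligned_access_base is hypothetical and only mimics the "ignore the low-order bits" behaviour described above on plain integers; it does not perform any memory access and makes no claim about a particular ARM core.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: where would an access of access_size bytes start
   if the hardware dropped the low-order address bits?
   access_size must be a power of two. */
uintptr_t aligned_access_base(uintptr_t address, size_t access_size)
{
    assert(access_size != 0 && (access_size & (access_size - 1)) == 0);

    /* Clear the low bits: for size 4 this masks off the bottom two bits. */
    return address & ~((uintptr_t)access_size - 1);
}
```

For an address ending in 0x72 and a 4-byte access, the model gives a base ending in 0x70, so the bytes read are 70 through 73, matching the hex dump.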

[much snippage]
Similar to casting, "retyping" would allow subterfuge and introduce
opportunity for non-portable code (as in this example). Probably only
people integrating non-portable assembly and C code would find it
useful (especially with MMX instructions). Still, thought I'd ask. And
thanks for the pointer lecture.

In general, even for this sort of non-portable trickery, you may be
better off with "union"s, or at least with using "unsigned char *"
to get at individual bytes of some object. Unions will guarantee
alignment of the most-strictly-aligned object within the union:

union {
unsigned char c;
int i;
} c;

Now c.c will be at an address congruent to 0 mod 4 on the ARM, so
that you do not access bytes "before" those making up c.i.
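A quick way to check the layout claim on a given compiler is sketched below. The tag char_or_int and the function members_share_start are names invented for the sketch; the union above is anonymous.

```c
#include <assert.h>
#include <stddef.h>

union char_or_int {
    unsigned char c;
    int i;
};

/* Hypothetical check: returns 1 if both members begin at the start of
   the union, so c overlaps the first byte of i rather than preceding it. */
int members_share_start(void)
{
    /* The union must be big enough to hold its largest member. */
    assert(sizeof(union char_or_int) >= sizeof(int));

    return offsetof(union char_or_int, c) == 0
        && offsetof(union char_or_int, i) == 0;
}
```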

One thing to be particularly wary of is that C leaves undefined
the effect of "bad pointer aliasing" using anything *other* than
character data types. With an optimizing compiler (gcc), I have
seen "real world" code break due to making nonportable-but-true-
on-hardware-X assumptions. In particular, we had some code to do
printf's %f and strtod() conversions that did this:

double d;
int32_t *ip = (int32_t *)&d;

assert(sizeof(d) == 2 * sizeof(*ip));

... code that works with ip[0] and ip[1] sometimes ...

The problem that came up was that "d" was stuck into a floating
point register, so that ip[0] and ip[1] were manipulating memory
that *was not even used* (for d).

The rules for the C language allow this kind of optimization.
The compiler may assume that, because ip[i] is not a "char"
(signed or unsigned), it must necessarily not be part of the
variable named "d". Operations accessing ip[i] therefore cannot
read or write d, so it must be OK to stick d in an FPU register
and leave ip[i] in memory! :-)
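One way to get at the two 32-bit halves without the aliasing problem is to copy the bytes with memcpy(). This is only a sketch of that alternative, not the fix that was applied to the code described above; split_double and join_double are hypothetical names, and the sketch assumes uint32_t exists and a double is exactly twice its size.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy the bytes of d into two 32-bit words. */
void split_double(double d, uint32_t halves[2])
{
    assert(sizeof d == 2 * sizeof halves[0]);
    memcpy(halves, &d, sizeof d);   /* byte copy: no aliasing violation */
}

/* Hypothetical helper: rebuild the double from the two words. */
double join_double(const uint32_t halves[2])
{
    double d;

    assert(sizeof d == 2 * sizeof halves[0]);
    memcpy(&d, halves, sizeof d);
    return d;
}
```

Which of halves[0] and halves[1] holds the sign and exponent still depends on the machine, so code using them stays non-portable; it just no longer depends on where the compiler keeps d.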
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #8

Groups User

> Not necessarily! The ARM, for instance, ignores low-order address
> bits based on the size of the access.
The "26-bit" ARMs had a 24-bit address bus, but a 26-bit PC. All
memory access was word aligned.
Here, sizeof(int) is most
likely 4 (I believe there are sizeof(int)==2 ARM compilers, or
at least Thumb compilers -- the Thumb is a stripped-down ARM, as
it were, for "even more embedded" systems than those the ARM is
usually found in). Suppose sizeof(int) is 4 but "c" is at an
address that is congruent to 2 mod 4, so that in a hex-dump of
memory we might have:

(address) (data)
xxxxxx70: 00 11 22 33 44 55 66 77 ...
^^
||
variable named "c", set to 0x22

Here, the bytes making up *(int *)&c are those from xxxxxx70 through
xxxxxx73, not those from xxxxxx72 through xxxxxx75 -- in this case,
&c - 2 through &c + 1. Thus *(int *)&c is either 0x00112233 (big
endian) or 0x33221100 (little endian), depending on endian-ness.
In general, even for this sort of non-portable trickery, you may be
better off with "union"s, or at least with using "unsigned char *"
to get at individual bytes of some object. Unions will guarantee
alignment of the most-strictly-aligned object within the union:

union {
unsigned char c;
int i;
} c;

Now c.c will be at an address congruent to 0 mod 4 on the ARM, so
that you do not access bytes "before" those making up c.i.
However, in a "normal" allocation of the union variable
(identifier) c, the storage allocated would almost always be greater
than sizeof(char), or 1. The code is portable, yes, but it does not do
exactly what is desired, which may only be achievable with
non-portable code. The same goes for casting or "retyping".

One thing to be particularly wary of is that C leaves undefined
the effect of "bad pointer aliasing" using anything *other* than
character data types. With an optimizing compiler (gcc),
Or even an enforced register storage class "variable."
I have
seen "real world" code break due to making nonportable-but-true-
on-hardware-X assumptions. In particular, we had some code to do
printf's %f and strtod() conversions that did this:

double d;
int32_t *ip = (int32_t *)&d;

assert(sizeof(d) == 2 * sizeof(*ip));

... code that works with ip[0] and ip[1] sometimes ...

The problem that came up was that "d" was stuck into a floating
point register,
Sort of an automatically enforced register storage class.
so that ip[0] and ip[1] were manipulating memory
that *was not even used* (for d).

The rules for the C language allow this kind of optimization.
The compiler may assume that, because ip[i] is not a "char"
(signed or unsigned), it must necessarily not be part of the
variable named "d". Operations accessing ip[i] therefore cannot
read or write d, so it must be OK to stick d in an FPU register
and leave ip[i] in memory! :-)

The language continues to evolve via the standards, but there will
always be these types of situations.

Nov 14 '05 #9
