
Aliasing rules - int and long

Consider the following program:

#include <stdio.h>

int main(void)
{
    /* using malloc to eliminate alignment worries */
    unsigned long *p = malloc( sizeof *p );

    if ( p && sizeof(long) == sizeof(int) )
    {
        *p = 30;
        printf( "%u\n", *(unsigned int *)p );
    }

    free(p);
    return 0;
}

Is there any undefined behaviour here? The aliasing rules section
in the standard (C99 6.5) does not seem to permit this, but I can't
see how it would fail, since unsigned int and unsigned long are
required to have pure binary representations.

To clarify, I am expecting the above program will either produce
no output, or output 30, no other options.

Mar 14 '07 #1
10 Replies


"Old Wolf" <oldw...@inspire.net.nz> wrote:
> Consider the following program:
>
> #include <stdio.h>
>
> int main(void)
> {
>     /* using malloc to eliminate alignment worries */
>     unsigned long *p = malloc( sizeof *p );
>
>     if ( p && sizeof(long) == sizeof(int) )

Same size does not mean same range or representation.

>     {
>         *p = 30;
>         printf( "%u\n", *(unsigned int *)p );
>     }
>
>     free(p);
>     return 0;
> }
>
> Is there any undefined behaviour here? The aliasing rules section
> in the standard (C99 6.5) does not seem to permit this, but I can't
> see how it would fail,

There doesn't need to be an existing architecture on which
something fails in order for undefined behaviour to be undefined
behaviour.

> since unsigned int and unsigned long are
> required to have pure binary representations.

But there's no guarantee that the value bits in unsigned int
correspond precisely to the value bits in an unsigned long.
For example, one might be big endian, the other might be
little endian.
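
The "same size does not mean same range" point can be checked at run time. The following is only a sketch: value_bits and same_size_and_range_no_padding are hypothetical names, not from the thread, and a positive result still says nothing about the order of the value bits, nor does it make the aliased read defined.

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of value bits in an unsigned type,
   derived from its maximum value (which is 2^N - 1 for N value bits). */
static unsigned value_bits(unsigned long max)
{
    unsigned n = 0;

    while (max != 0) {
        max >>= 1;
        n++;
    }
    return n;
}

/* Nonzero if unsigned int and unsigned long have the same size and
   the same range, and unsigned int has no padding bits. */
static int same_size_and_range_no_padding(void)
{
    return sizeof(unsigned int) == sizeof(unsigned long)
        && UINT_MAX == ULONG_MAX
        && value_bits(UINT_MAX) == sizeof(unsigned int) * CHAR_BIT;
}
```

On an implementation where long is wider than int, the second function simply returns 0.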

--
Peter

Mar 14 '07 #2

On Mar 14, 10:19 pm, "Old Wolf" <oldw...@inspire.net.nz> wrote:
> Consider the following program:
>
> #include <stdio.h>
>
> int main(void)
> {
>     /* using malloc to eliminate alignment worries */
>     unsigned long *p = malloc( sizeof *p );
>
>     if ( p && sizeof(long) == sizeof(int) )
>     {
>         *p = 30;
>         printf( "%u\n", *(unsigned int *)p );
>     }
>
>     free(p);
>     return 0;
> }
>
> Is there any undefined behaviour here? The aliasing rules section
> in the standard (C99 6.5) does not seem to permit this, but I can't
> see how it would fail, since unsigned int and unsigned long are
> required to have pure binary representations.
>
> To clarify, I am expecting the above program will either produce
> no output, or output 30, no other options.

Undefined behavior.

A compiler can assume that whenever an unsigned long is written and an
int is read, both locations are different because otherwise there
would be undefined behavior, and therefore the order of a write and a
read can be reversed under the "as if" rule. Consider the following
example:

int f (int* p, unsigned long* q)
{
    *p = 1; *q = 2; return *p;
}

Here the compiler is free to generate code that will always return 1.
If you call it as

unsigned long l;
int i = f ((int *) &l, &l);

then there is undefined behavior.
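
For contrast, here is a sketch of a case where the compiler is not allowed to make that assumption. C99 6.5p7 permits access through the signed or unsigned type corresponding to the object's effective type, so an int may be written through an unsigned int lvalue. The names g, call_g_aliased and call_g_distinct are hypothetical and only serve the illustration.

```c
/* Hypothetical counterpart of f(): unsigned int is the unsigned type
   corresponding to int, so *q may legitimately refer to the same
   object as *p, and *p has to be reloaded after the store to *q. */
static int g(int *p, unsigned int *q)
{
    *p = 1;
    *q = 2;
    return *p;
}

/* Both pointers refer to the same int object: defined, yields 2. */
static int call_g_aliased(void)
{
    int i = 0;

    return g(&i, (unsigned int *)&i);
}

/* Two distinct objects: yields 1. */
static int call_g_distinct(void)
{
    int i = 0;
    unsigned int u = 0;

    return g(&i, &u);
}
```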

(And I don't think there is any guarantee that int and unsigned long
have their value bits in the same bit positions. On the Deathstation
8000, int is big-endian and long is little-endian. On the new and
improved Deathstation 9000, all the bits in an int and an unsigned
long are in reversed order, so your example would produce 0x78000000.
They are working on a new version where the value bits in unsigned long
are in a different random permutation every time you start a program.)

Mar 14 '07 #3

Peter Nilsson wrote:
> "Old Wolf" <oldw...@inspire.net.nz> wrote:
>> Consider the following program:
>>
>> #include <stdio.h>
>>
>> int main(void)
>> {
>>     /* using malloc to eliminate alignment worries */
>>     unsigned long *p = malloc( sizeof *p );
>>
>>     if ( p && sizeof(long) == sizeof(int) )
>
> Same size does not mean same range or representation.
>
>>     {
>>         *p = 30;
>>         printf( "%u\n", *(unsigned int *)p );
>>     }
>>
>>     free(p);
>>     return 0;
>> }
>>
>> Is there any undefined behaviour here? The aliasing rules section
>> in the standard (C99 6.5) does not seem to permit this, but I can't
>> see how it would fail,
>
> There doesn't need to be an existing architecture on which
> something fails in order for undefined behaviour to be undefined
> behaviour.
>
>> since unsigned int and unsigned long are
>> required to have pure binary representations.
>
> But there's no guarantee that the value bits in unsigned int
> correspond precisely to the value bits in an unsigned long.
> For example, one might be big endian, the other might be
> little endian.

It would only affect the number printed. You can randomly set bits in
an unsigned long, assuming there are no padding bits, and print it.
I.e. the problems with representation here are the same as if you did

unsigned long a = 30;
unsigned int b;
memcpy(&b, &a, sizeof b);
printf("%u", b);

Only aliasing rules seem to make the original code undefined
(yes we assume no padding bits here, we can test it in runtime,
the example can be modified to avoid any problems with it).
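
As a self-contained sketch of the memcpy variant: the helper name ulong_bytes_as_uint is hypothetical, and the fallback to plain conversion when the sizes differ is an addition for illustration, not something proposed in the thread. On the equal-size path the result still depends on how the two types lay out their value bits.

```c
#include <string.h>

/* Hypothetical helper: copy the object representation when the two
   types have the same size, otherwise fall back to conversion by value. */
static unsigned int ulong_bytes_as_uint(unsigned long a)
{
    unsigned int b;

    if (sizeof a == sizeof b)
        memcpy(&b, &a, sizeof b);   /* byte copy, no aliased lvalue access */
    else
        b = (unsigned int)a;        /* sizes differ: plain conversion */

    return b;
}
```

On common implementations ulong_bytes_as_uint(30UL) gives 30, but only the conversion path is guaranteed to.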

Yevgen
Mar 15 '07 #4

you didn't use "include <malloc.h>"

is it right?

Mar 15 '07 #5

softwindow wrote:
> you didn't use "include <malloc.h>"
>
> is it right?

Context?

<malloc.h> isn't a standard header. malloc and friends are declared in
<stdlib.h>

--
Ian Collins.
Mar 15 '07 #6

Yevgen Muntyan <muntyan.removet...@tamu.edu> wrote:
> Peter Nilsson wrote:
>> "Old Wolf" <oldw...@inspire.net.nz> wrote:
>>> /* using malloc to eliminate alignment worries */
>>> unsigned long *p = malloc( sizeof *p );
>>> if ( p && sizeof(long) == sizeof(int) )
>>
>> Same size does not mean same range or representation.
>>
>>> {
>>> *p = 30;
>>> printf( "%u\n", *(unsigned int *)p );
>>> }
<snip>
>>> The aliasing rules section in the standard (C99 6.5) does
>>> not seem to permit this, but I can't see how it would fail,
>>> since unsigned int and unsigned long are
>>> required to have pure binary representations.
>>
>> But there's no guarantee that the value bits in unsigned int
>> correspond precisely to the value bits in an unsigned long.
>> For example, one might be big endian, the other might be
>> little endian.
>
> It would only affect the number printed.

No, both unsigned int and unsigned long can have trap
representations, and Christian Bau has pointed out that
the purpose of aliasing undefined behaviour relates to
optimisation. So a compiler needn't see the intent of
the source.

> You can randomly set bits in an unsigned long, assuming there
> are no padding bits, and print it.

If you're going to assume a vanilla machine, then there's little
point discussing standard C semantics.

Can I ask you (and the OP): Why is it so important to be able
to alias ints through longs and vice versa? What exactly
is wrong with the normal conversion by value? It has
considerably fewer problems than aliasing.
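
A minimal sketch of conversion by value, with to_uint as a hypothetical name. For an unsigned target the result is the source value reduced modulo UINT_MAX + 1, independent of how either type is represented in memory.

```c
#include <limits.h>

/* Hypothetical helper: convert by value instead of reinterpreting bytes.
   Out-of-range values wrap modulo UINT_MAX + 1. */
static unsigned int to_uint(unsigned long v)
{
    return (unsigned int)v;
}
```

So to_uint(30UL) is 30 on every conforming implementation, and to_uint(ULONG_MAX) is UINT_MAX.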

--
Peter

Mar 15 '07 #7

On Mar 15, 1:32 pm, "softwindow" <softwin...@gmail.com> wrote:
> you didn't use "include <malloc.h>"

Who didn't? Please quote the relevant context.

> is it right?

Nope. The malloc function is declared in <stdlib.h>
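
A small sketch of allocation with the standard header in place. The helper name store_and_read is hypothetical, and the value is read back through the same type it was stored with, so the aliasing question does not arise here.

```c
#include <stdlib.h>   /* declares malloc and free */

/* Hypothetical helper: allocate an unsigned long, store a value,
   read it back through the same type, and release the memory.
   Returns 0 if the allocation fails. */
static unsigned long store_and_read(unsigned long value)
{
    unsigned long result = 0;
    unsigned long *p = malloc(sizeof *p);

    if (p != NULL) {
        *p = value;
        result = *p;
        free(p);
    }
    return result;
}
```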

--
Peter
Mar 15 '07 #8

Peter Nilsson wrote:
> Yevgen Muntyan <muntyan.removet...@tamu.edu> wrote:
>> Peter Nilsson wrote:
>>> "Old Wolf" <oldw...@inspire.net.nz> wrote:
>>>> /* using malloc to eliminate alignment worries */
>>>> unsigned long *p = malloc( sizeof *p );
>>>> if ( p && sizeof(long) == sizeof(int) )
>>>
>>> Same size does not mean same range or representation.
>>>
>>>> {
>>>> *p = 30;
>>>> printf( "%u\n", *(unsigned int *)p );
>>>> }
> <snip>
>>>> The aliasing rules section in the standard (C99 6.5) does
>>>> not seem to permit this, but I can't see how it would fail,
>>>> since unsigned int and unsigned long are
>>>> required to have pure binary representations.
>>>
>>> But there's no guarantee that the value bits in unsigned int
>>> correspond precisely to the value bits in an unsigned long.
>>> For example, one might be big endian, the other might be
>>> little endian.
>>
>> It would only affect the number printed.
>
> No, both unsigned int and unsigned long can have trap
> representations,

Not if there are no padding bits.

>> You can randomly set bits in an unsigned long, assuming there
>> are no padding bits, and print it.
>
> If you're going to assume a vanilla machine, then there's little
> point discussing standard C semantics.

Um, are you saying there is no point in discussing standard C
semantics because on a vanilla machine this stuff just works?
Compilers which follow the letter of the standard and break existing
code are not exactly news, even on those "PC" computers.
The OP's example might not be strict enough, but it can
easily check for the presence of padding bytes. There is a huge
difference between code which invokes undefined or unspecified
behavior on some implementation and code which doesn't, even if
this code is not strictly-conforming.

> Can I ask you (and the OP): Why is it so important to be able
> to alias ints through longs and vice versa?

You missed an important thing here. That was a malloc'ed area,
and its first bytes were set using assignment; there wasn't
a 'real' object declared to have type long or int. The question is:
why is that different from memcpy(), and what do the aliasing rules
mean?

double *p = malloc (sizeof (double));
*p = 3.14;
printf("%u\n", *((unsigned*)p));

void *p = malloc (sizeof (double));
*((double*)p) = 3.14;
printf("%u\n", *((unsigned*)p));

double d = 3.14;
unsigned u;
void *p = malloc (sizeof (double));
memcpy(p, &d, sizeof d);
memcpy(&u, p, sizeof u);
printf("%u\n", u);

What is permitted and what isn't? Then replace double with
unsigned long from OP example.
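
One access that C99 6.5p7 does permit for any object is reading it through an lvalue of character type. The sketch below shows only that case and does not settle which of the three fragments above is allowed; the helper name double_bytes_round_trip is hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: store a double in allocated memory, then read its
   object representation through unsigned char, which may alias any
   object type. Returns 1 if the bytes round-trip to the same value,
   0 if they do not, and -1 if the allocation fails. */
static int double_bytes_round_trip(double value)
{
    unsigned char bytes[sizeof(double)];
    double copy;
    size_t i;
    double *p = malloc(sizeof *p);

    if (p == NULL)
        return -1;

    *p = value;
    for (i = 0; i < sizeof *p; i++)
        bytes[i] = ((unsigned char *)p)[i];   /* character-type access */
    free(p);

    memcpy(&copy, bytes, sizeof copy);        /* rebuild a double from the bytes */
    return copy == value;
}
```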
> What exactly
> is wrong with the normal conversion by value? It has
> considerably fewer problems than aliasing.

And may have totally different properties from type punning.
Certainly nothing is wrong with normal conversions.

Yevgen
Mar 15 '07 #9

Yevgen Muntyan <muntyan.removet...@tamu.edu> wrote:
> Peter Nilsson wrote:
>> Yevgen Muntyan <muntyan.removet...@tamu.edu> wrote:
>>> Peter Nilsson wrote:
>>>> "Old Wolf" <oldw...@inspire.net.nz> wrote:
>>>>> /* using malloc to eliminate alignment worries */
>>>>> unsigned long *p = malloc( sizeof *p );
>>>>> if ( p && sizeof(long) == sizeof(int) )
>>>>
>>>> Same size does not mean same range or representation.
>>>>
>>>>> {
>>>>> *p = 30;
>>>>> printf( "%u\n", *(unsigned int *)p );
>>>>> }
>> <snip>
>>>>> The aliasing rules section in the standard (C99 6.5) does
>>>>> not seem to permit this, but I can't see how it would fail,
>>>>> since unsigned int and unsigned long are
>>>>> required to have pure binary representations.
>>>>
>>>> But there's no guarantee that the value bits in unsigned int
>>>> correspond precisely to the value bits in an unsigned long.
>>>> For example, one might be big endian, the other might be
>>>> little endian.
>>>
>>> It would only affect the number printed.
>>
>> No, both unsigned int and unsigned long can have trap
>> representations,
>
> Not if there are no padding bits.

Agreed, but if you need to assume no padding bits, then the
construct isn't useful in a maximal portability sense, is it?

>>> You can randomly set bits in an unsigned long, assuming there
>>> are no padding bits, and print it.
>>
>> If you're going to assume a vanilla machine, then there's little
>> point discussing standard C semantics.
>
> Um, are you saying there is no point in discussing standard C
> semantics because on a vanilla machine this stuff just works?

I'm saying if you _have_ to make vanilla assumptions to feel
confident about such code, then you've missed the point of
comp.lang.c.

> Compilers which follow the letter of the standard and break existing
> code are not exactly news,

You seem to imply that the existing code is correct and the
standard is wrong. In some cases that may well be true, but
I don't see it being the case in the example given.

Fact is, most existing code is just 'lazy', and you seem to
be asking why the standard doesn't allow a particular form
of laziness explicitly. The real question is, why should it?

> even on those "PC" computers.
> The OP's example might not be strict enough, but it can
> easily check for the presence of padding bytes.

And what would be the point? Why do you think the 'real problem',
i.e. whatever circumstance has backed the OP into the corner
of wanting to alias an int with a long (or whatever), can't be
better solved using less shaky code that is just as efficient
on vanilla machines?

<snip>
>> Can I ask you (and the OP): Why is it so important to be able
>> to alias ints through longs and vice versa?
>
> You missed an important thing here.
> That was a malloc'ed area,
> and its first bytes were set using assignment;
> there wasn't a 'real' object declared to have type long
> or int.

It became a 'real' object with the assignment [cf effective
type.]

> The question is: why is that different from memcpy(),
> and what do the aliasing rules mean?

memcpy isn't about type punning, it's about copying memory
_without_ regard to type.

[Aside: I think it would have been better if C had a separate
type for byte that was independent of char, but that wasn't to be.]

> double *p = malloc (sizeof (double));
> *p = 3.14;
> printf("%u\n", *((unsigned*)p));
<snip>
> What is permitted and what isn't?

Good question, but I don't see how bad examples will help you
find good uses.

> Then replace double with unsigned long from OP example.

To my eye, this replacement isn't any more useful than the
previous double example.

>> What exactly
>> is wrong with the normal conversion by value? It has
>> considerably fewer problems than aliasing.
>
> And may have totally different properties from type punning.

Yes. For a start it's more likely to be well defined, accurate
and useful.

> Certainly nothing is wrong with normal conversions.

To play advocate here, there are some problems with characters
and character types to do with aliasing (e.g. explore the value
of as a constant and as an input character), but the non
character examples so far are based on the misguided precept that
it is useful to look at an int as a long. I just don't see the
use.

I think you would do better to explore Christian's optimisation
comments. The standard makes explicit exception for _genuine_
union usages; usages that seem to have been ignored in
preference to pointless discussions on reading longs as ints!
I think you should investigate the possibility of reading ints
as _ints_ through distinct structs sharing common initial
sequences.
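
A sketch of what that suggestion might look like, assuming the common-initial-sequence guarantee of C99 6.5.2.3p5. The types circle, rect and shape and the functions shape_kind and demo_kind are hypothetical, made up for the illustration.

```c
/* Hypothetical types: two structs whose leading members have compatible
   types, so they share a common initial sequence. */
struct circle { int kind; double radius; };
struct rect   { int kind; double width, height; };

union shape {
    struct circle c;
    struct rect   r;
};

/* Reads the shared leading int through the rect member even when the
   union currently holds a circle; the complete union type is visible
   at this point, as the guarantee requires. */
static int shape_kind(const union shape *s)
{
    return s->r.kind;
}

static int demo_kind(void)
{
    union shape s;

    s.c.kind = 1;
    s.c.radius = 2.0;
    return shape_kind(&s);   /* an int is read as an int: yields 1 */
}
```

The int is read as an int throughout; only the enclosing struct type differs.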

--
Peter

Mar 15 '07 #10

Peter Nilsson wrote:
> Yevgen Muntyan <muntyan.removet...@tamu.edu> wrote:
>> Peter Nilsson wrote:
>>> Yevgen Muntyan <muntyan.removet...@tamu.edu> wrote:
>>>> Peter Nilsson wrote:
>>>>> "Old Wolf" <oldw...@inspire.net.nz> wrote:
>>>>>> /* using malloc to eliminate alignment worries */
>>>>>> unsigned long *p = malloc( sizeof *p );
>>>>>> if ( p && sizeof(long) == sizeof(int) )
>>>>>
>>>>> Same size does not mean same range or representation.
>>>>>
>>>>>> {
>>>>>> *p = 30;
>>>>>> printf( "%u\n", *(unsigned int *)p );
>>>>>> }
>>> <snip>
>>>>>> The aliasing rules section in the standard (C99 6.5) does
>>>>>> not seem to permit this, but I can't see how it would fail,
>>>>>> since unsigned int and unsigned long are
>>>>>> required to have pure binary representations.
>>>>>
>>>>> But there's no guarantee that the value bits in unsigned int
>>>>> correspond precisely to the value bits in an unsigned long.
>>>>> For example, one might be big endian, the other might be
>>>>> little endian.
>>>>
>>>> It would only affect the number printed.
>>>
>>> No, both unsigned int and unsigned long can have trap
>>> representations,
>>
>> Not if there are no padding bits.
>
> Agreed, but if you need to assume no padding bits, then the
> construct isn't useful in a maximal portability sense, is it?

I see. Then you needed to say this about the original piece of
code which assumes sizeof(long) == sizeof (int). Somehow I thought
you were fine with "if this and this hold, does the standard
guarantee that that is true?".

>>>> You can randomly set bits in an unsigned long, assuming there
>>>> are no padding bits, and print it.
>>>
>>> If you're going to assume a vanilla machine, then there's little
>>> point discussing standard C semantics.
>>
>> Um, are you saying there is no point in discussing standard C
>> semantics because on a vanilla machine this stuff just works?
>
> I'm saying if you _have_ to make vanilla assumptions to feel
> confident about such code, then you've missed the point of
> comp.lang.c.

I doubt it. Making certain assumptions and trying to see what
the standard guarantees under those assumptions is not something
off-topic here. Depends on the assumptions, of course.

>> Compilers which follow the letter of the standard and break existing
>> code are not exactly news,
>
> You seem to imply that the existing code is correct and the
> standard is wrong.

I don't.

> In some cases that may well be true, but
> I don't see it being the case in the example given.
>
> Fact is, most existing code is just 'lazy', and you seem to
> be asking why the standard doesn't allow a particular form
> of laziness explicitly.

Nope. I am asking *whether* it allows certain things.

> The real question is, why should it?

It should not. It either does or does not. Do you have problems
with people who want to know what is allowed and why?

>> even on those "PC" computers.
>> The OP's example might not be strict enough, but it can
>> easily check for the presence of padding bytes.
>
> And what would be the point? Why do you think the 'real problem',
> i.e. whatever circumstance has backed the OP into the corner
> of wanting to alias an int with a long (or whatever),

I don't think he asked the question because of some particular
real situation where he actually does that thing.

> can't be
> better solved using less shaky code that is just as efficient
> on vanilla machines?
>
> <snip>
>>> Can I ask you (and the OP): Why is it so important to be able
>>> to alias ints through longs and vice versa?
>>
>> You missed an important thing here.
>> That was a malloc'ed area,
>> and its first bytes were set using assignment;
>> there wasn't a 'real' object declared to have type long
>> or int.
>
> It became a 'real' object with the assignment [cf effective
> type.]
>
>> The question is: why is that different from memcpy(),
>> and what do the aliasing rules mean?
>
> memcpy isn't about type punning, it's about copying memory
> _without_ regard to type.

In fact, it seems 6.5p6 says memcpy() would have the very same effect
as the assignment: the effective type becomes the type of the object
whose value was copied.
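
If 6.5p6 is read that way, the following sketch stays within the rules: the value is copied into allocated storage with memcpy and then read back through the type of the object it was copied from. The helper name copy_in_and_read is hypothetical, and the comment about the effective type restates the reading given above rather than an independent ruling.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: copy an unsigned long into allocated storage and
   read it back as unsigned long. Returns 0 if the allocation fails. */
static unsigned long copy_in_and_read(unsigned long value)
{
    unsigned long result = 0;
    void *p = malloc(sizeof value);

    if (p != NULL) {
        /* Under the reading of 6.5p6 above, the storage now has the
           effective type of the source object, unsigned long. */
        memcpy(p, &value, sizeof value);
        result = *(unsigned long *)p;   /* read through that same type */
        free(p);
    }
    return result;
}
```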
> [Aside: I think it would have been better if C had a separate
> type for byte that was independent of char, but that wasn't to be.]
>
>> double *p = malloc (sizeof (double));
>> *p = 3.14;
>> printf("%u\n", *((unsigned*)p));
> <snip>
>> What is permitted and what isn't?
>
> Good question, but I don't see how bad examples will help you
> find good uses.

Suppose you're given a piece of code (e.g. you want to use some
library with that piece of code). Are you going to throw it
away and rewrite everything to your tastes? Or is it fine to
use if you know it's actually correct? What do you do if you
can't say if it's correct? Say, the person who wrote it might
have been an experienced guy who didn't have any trouble
understanding certain things, and the code is perfectly valid
and everything. Or that person might not have known some things
and the code is crap. What then?
If you think "it's not important because it's useless", then
you are either wrong or you can predict the future.
You know, "Don't do that" is not a 100%-working recipe.
(I won't try to tell about knowledge for the sake of knowledge
or knowledge which helps to better and deeper understand the
language and all those fancy things.)

>> Then replace double with unsigned long from OP example.
>
> To my eye, this replacement isn't any more useful than the
> previous double example.
>
>>> What exactly
>>> is wrong with the normal conversion by value? It has
>>> considerably fewer problems than aliasing.
>>
>> And may have totally different properties from type punning.
>
> Yes. For a start it's more likely to be well defined,

This is your problem, you're fine with "this is most likely defined"
and "this is more likely undefined". I'd like to be sure if
something is undefined, and then to know why it is undefined.

....

>> Certainly nothing is wrong with normal conversions.
>
> To play advocate here, there are some problems with characters
> and character types to do with aliasing (e.g. explore the value
> of as a constant and as an input character), but the non
> character examples so far are based on the misguided precept that
> it is useful to look at an int as a long. I just don't see the
> use.

I don't see the use either. I still want to know why it's UB.
So that perhaps when I see a thing *similar* to it but disguised
well in code written by a guy who *did* see the use, I could decide
if the code is okay or not.

Yevgen
Mar 16 '07 #11
