Amusing C, amusing compiler

Ark
A function
int foo(struct T *x)
{
return (x+1)-x;
}
should always return a 1, no matter how T is defined. (And int could be
replaced with ptrdiff_t for you pedants.)
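
A minimal sketch of the claim, assuming two hypothetical complete struct types (`struct small` and `struct big`, which are not from the post). Pointer subtraction yields a count of elements rather than bytes, so the result should not depend on the element size:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical types, chosen only to have different sizes. */
struct small { char c; };
struct big   { double d[16]; };

/* Same shape as foo() in the post, one version per type. */
ptrdiff_t diff_small(struct small *x) { return (x + 1) - x; }
ptrdiff_t diff_big(struct big *x)     { return (x + 1) - x; }

/* Both calls pass pointers to real objects, so x + 1 is a valid
   one-past-the-end pointer and the subtraction is defined. */
void check_diff(void)
{
    struct small s;
    struct big b;
    assert(diff_small(&s) == 1);
    assert(diff_big(&b) == 1);
}
```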

For one thing, it was amusing to watch how my compiler (the famed IAR
EWARM for ARM) jumps through hoops to arrive at this answer.
[ For sizeof(struct T)==6,
00000000 061080E2 ADD R1,R0,#+6
00000004 18209FE5 LDR R2,??foo_0 ;; 0xaaaaaaab
00000008 92318CE0 UMULL R3,R12,R2,R1
0000000C 2CC1A0E1 LSR R12,R12,#+2
00000010 0210A0E1 MOV R1,R2
00000014 912083E0 UMULL R2,R3,R1,R0
00000018 2331A0E1 LSR R3,R3,#+2
0000001C 03004CE0 SUB R0,R12,R3
00000020 0EF0A0E1 MOV PC,LR ;; return
??foo_0:
00000024 ABAAAAAA DC32 0xaaaaaaab
]
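
The listing appears to divide each address by 6 using a multiply-and-shift: UMULL by 0xaaaaaaab, keep the high word, then shift right by 2. A sketch of that idiom in C, assuming 32-bit unsigned operands (the names `div6` and `foo_as_compiled` are made up for illustration):

```c
#include <assert.h>
#include <stdint.h>

/* High 32 bits of n * 0xAAAAAAAB, shifted right by 2: the same
   steps as each UMULL / LSR #2 pair in the listing. */
uint32_t div6(uint32_t n)
{
    uint64_t product = (uint64_t)n * UINT32_C(0xAAAAAAAB);
    uint32_t high = (uint32_t)(product >> 32);
    return high >> 2;
}

/* What the listing computes as a whole: (x + 6) / 6 - x / 6. */
uint32_t foo_as_compiled(uint32_t x)
{
    return div6(x + 6) - div6(x);
}

/* Spot checks against ordinary division. */
void check_div6(void)
{
    assert(div6(0) == 0);
    assert(div6(5) == 0);
    assert(div6(6) == 1);
    assert(div6(UINT32_MAX) == UINT32_MAX / 6);
}
```

So the generated code performs two divisions and a subtraction to produce a value that is known in advance.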

How does your compiler fare?
[MSVC gets it right:
mov eax, 1
ret 0
]

Another thing is that, logically, since the actual type doesn't matter,
it could be an incomplete type. However, if I just say
struct T;
before the foo's body, compilation fails. Is it good and/or justified?
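
For reference, a sketch of what an incomplete type does and does not permit. The rejected line is left inside a comment so that the fragment still compiles; `same_object` and `step_count` are illustrative names, and the struct body is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>

struct T;   /* incomplete: size unknown at this point */

/* Allowed: passing, copying, and comparing pointers to struct T. */
int same_object(struct T *a, struct T *b)
{
    return a == b;
}

/* Not allowed while struct T is incomplete, because x + 1 needs
   sizeof(struct T); a conforming compiler must diagnose it:
   ptrdiff_t broken(struct T *x) { return (x + 1) - x; }           */

struct T { int a; short b; };   /* now complete (layout is arbitrary) */

/* Allowed from here on. */
ptrdiff_t step_count(struct T *x)
{
    return (x + 1) - x;
}

void check_incomplete(void)
{
    struct T t;
    assert(same_object(&t, &t));
    assert(step_count(&t) == 1);
}
```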

- Ark
Oct 6 '06
"pete" <pf*****@mindspring.com> wrote in message
news:45**********@mindspring.com...
Ark wrote:
>If
- an expression correctly evaluates to something sensible /regardless/
of the values of some of its terms, and
- these terms are known to be valid (although not known precisely)
then what's wrong with accepting such an expression?

If x is a pointer to an incomplete type,
then (x + 1) doesn't evaluate to anything.
Right. About the only sensible retort from a compiler would
be what Paul Hogan said in "Crocodile Dundee": "One what?"

:-)

-Mike
Oct 7 '06 #21
ri*****@cogsci.ed.ac.uk (Richard Tobin) writes:
In article <ea******************************@comcast.com>,
Ark <ak*****@macroexpressions.com> wrote:
>>A function
int foo(struct T *x)
{
return (x+1)-x;
}
should always return a 1, no matter how T is defined.

I think that's true, provided it does not invoke undefined behaviour.
Assuming that struct T isn't an incomplete type.
The following calls would invoke undefined behaviour when the addition
is performed:

void bar(void)
{
struct T *undefined;
struct T array[1];
struct T *null = 0;

foo(undefined);
foo(&array[1]);
foo(null);
}

(maybe the first one invokes it as soon as the variable is passed to
the function?)
If the expression invokes undefined behavior, the compiler can do
anything it likes, including returning 1. So a compiler can
legitimately optimize "(x+1)-x" to just 1 even if it can't prove that
the behavior is defined. But of course it's not required to do so.

This kind of thing is part of why undefined behavior exists: so a
compiler can make assumptions about the code for purposes of
optimization. If the code happens to violate those assumptions, it's
not the compiler's fault.
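
By contrast, a sketch of calls for which the addition is defined, assuming a complete `struct T` (the member is arbitrary) and a function renamed `foo_defined` to keep the fragment separate from the code in the thread. A pointer to a single object, or to any element of an array, may be advanced by one as far as the one-past-the-end position:

```c
#include <assert.h>
#include <stddef.h>

struct T { int member; };   /* arbitrary complete type */

ptrdiff_t foo_defined(struct T *x)
{
    return (x + 1) - x;
}

void good_calls(void)
{
    struct T single;
    struct T array[2];

    /* A lone object counts as an array of one element. */
    assert(foo_defined(&single) == 1);
    /* x + 1 points at array[1], an existing element. */
    assert(foo_defined(&array[0]) == 1);
    /* x + 1 is one past the end: valid to compute, not to dereference. */
    assert(foo_defined(&array[1]) == 1);
}
```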

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Oct 7 '06 #22
In article <eg**********@chessie.cirr.com>
Christopher Benson-Manica <at***@otaku.freeshell.org> wrote:
>(This post is not really on-topic at all; it deals with assembly code
generated by gcc 3.3.3 for a situation described by OP.)
Perhaps, but it may be instructive anyway. Here is a little more
on the assembly, given the source. This appears to be targeted to
the DEC Alpha (which has some similarities to MIPS, including many
of the assembler directives and the use of register "$30" as a stack
pointer).
>#include <stdio.h>

struct foo {
int bar;
int baz;
};

int qux( struct foo *f ) {
return (f+1)-f;
}

int main(void)
{
struct foo g;
return qux(&g)-1;
}

, generates this with optimization disabled:

.set noat
.set noreorder
These turn off the assembler's use of register $28 (AKA "$at" or
"assembler temporary") to synthesize various constants, and disable
the assembler's re-ordering of instructions to fill delay slots.
(GNU "gas" does not do any re-ordering anyway, apparently, but it
is always safe to tell it not to.)
.text
.align 2
.globl qux
.ent qux
$qux..ng:
qux:
These -- except the labels, which should be obvious -- are all
"pseudo-instructions", i.e., assembler directives that may or may
not generate any actual code. Most of these should be reasonably
familiar to most people who know assembly languages. The oddball,
".ent", inserts a debug information record marking the entry point
for the function.
.frame $15,32,$26,0
This pseudo-instruction inserts another debug-information record
-- which may also be used by longjmp() and/or C++ exception handlers;
I am not sure about this -- indicating that there is an "actual
frame pointer" in register $15, which is often, but not always,
used as a frame pointer. (The frame size, 32 bytes, is the 2nd
argument. Register $26 normally holds the return address. I am
not sure why it appears in this directive. The last 0 is ignored,
at least in the GNU system.)
.mask 0x4008000,-32
This pseudo-instruction inserts another debugger record, this time
indicating which registers are saved, and at what offset from the
initial $sp. The two bits set in the mask above correspond to
registers $26 and $15 ($30, which holds the stack pointer, is
calculated instead of saved).
lda $30,-32($30)
This subtracts 32 from the stack pointer, creating room on the stack
to store stuff.
stq $26,0($30)
stq $15,8($30)
This stores the previous return address ($26) and the previous frame
pointer ($15) in the stack frame just created. "stq" stores a "quad",
an 8-byte quantity; registers are 8 bytes.
bis $31,$30,$15
Oddly enough, this is actually a "move" instruction. "bis" is the
bitwise OR instruction, but register $31 is hardwired to zero, so
"$31 or $30" ORs register $30 with all-zero-bits, which obviously
is just the value in register $30. The result is stored in register
$15 -- i.e., "bis $31,$30,$15" copies the value in $30 to $15.
.prologue 0
This inserts the last of the stack-frame debug data, indicating
that the stack frame is now complete (and, in this case, that
register $27 is not being used by this code). If you run the code
under a debugger and it stops at various points, these debug records
tell it how to find and interpret the stack contents for a stack-trace.
(In other words, the stack frame format is more flexible than on,
e.g., the 80x86 or SPARC.)
stq $16,16($15)
$16 is the first scalar, non-floating-point (i.e., integral or
pointer) argument to the function -- in this case, "f". So this
stores the actual argument from the parameter register into the
stack-frame just built (at offset 16 from register $15).
lda $0,1($31)
Again, $31 is hardwired to zero, so computing offset 1 from it
computes the value 1. This is stored in register $0, which is
the register holding the return value.
bis $31,$15,$30
This copies the value in register $15 back to register $30 -- i.e.,
moves the frame pointer value back into the stack pointer. Since
the two registers are (still) equal (from the last such move), this
is entirely unnecessary; but this *is* unoptimized code. In a
function that made use of internal stack allocation (using C99's
VLAs or the non-standard alloca()), this might be required in
some cases (those where one could not use $15 for everything to
follow).
ldq $26,0($30)
This reloads register $26 (the return address) from where it was
saved, even though it has not been changed.
ldq $15,8($30)
This reloads register $15 (the frame pointer) from where it was saved.
lda $30,32($30)
This adds 32 back to register $30, restoring the stack pointer to the
value it had on entry. The stack frame that was built at the function
entry is now destroyed; the function must now return.
ret $31,($26),1
This returns to the caller -- the address in $26 -- and presumably
puts the address of the "ret" instruction itself (or that of the
next instruction) in $31. Since $31 is hardwired to zero, that
causes it to be discarded. I assume "ret" is just an alias for
"jmp" (but I am not an Alpha expert). (The final 1 is a prediction
as to whether the branch is taken. Since this is an unconditional
branch, obviously it is taken; it seems pointless to bother writing
this part, and apparently it *is* optional in the assembler.)
.end qux
.align 2
.globl main
.ent main
main:
These end the qux() function and start the main() function, in the
same way. (I omit the rest of the code since it should now be
obvious.)
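
The two idioms in the walkthrough above that lean on the hardwired zero register can be modeled in C. This only illustrates the arithmetic and is not a model of the Alpha instruction set; the function names are invented, and the 16-bit signed displacement for "lda" is an assumption about the encoding.

```c
#include <assert.h>
#include <stdint.h>

static const uint64_t zero_reg = 0;   /* stands in for $31 */

/* bis $31,src,dst : OR with zero copies src unchanged. */
uint64_t bis_with_zero(uint64_t src)
{
    return zero_reg | src;
}

/* lda dst,disp($31) : zero base plus displacement yields the constant. */
uint64_t lda_from_zero(int16_t disp)
{
    return zero_reg + (uint64_t)(int64_t)disp;
}

void check_zero_reg(void)
{
    assert(bis_with_zero(0x1234) == 0x1234);   /* "move" */
    assert(lda_from_zero(1) == 1);             /* "load constant 1" */
}
```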
>Not so good, but as I believe someone elsethread (but in ng) mentioned
recently, gcc is notorious for generating brain-dead code unless you
ask it to optimize.
More specifically, it never looks at anything other than one expression
at a time, and even then it often does not look closely. However, in
the code above, the "working" part of the code is just one instruction,
setting register $0 (the return value) to 1.
>When asked to do so (given -O3), gcc generates

.set noat
.set noreorder
.text
.align 2
.align 4
.globl main
.ent main
$main..ng:
main:
These are largely as before, except now we are generating code for
main() first. (Also, it is curious that there are two .align
directives. The first one could be removed with no effect.) The
main() function has had qux() expanded in-line:
.frame $30,16,$26,0
lda $30,-16($30)
.prologue 0
This time, there is no separate frame pointer -- the frame is
given by register $30, the stack pointer register. It has only
16 bytes in it -- and aside from zero bytes, this is the minimal
stack frame size, as stack frames must be multiples of 16 bytes.
No additional registers are saved, so no ".mask" directive is
required.
bis $31,$31,$0
This sets register $0 to 0, i.e., sets the return value for main()
to zero. GCC has seen that qux() returns the constant 1, and that
main() returns qux() - 1, or 0.
lda $30,16($30)
ret $31,($26),1
.end main
This restores the stack pointer and returns. The only "wasted motion"
was saving and restoring the stack pointer -- main() could have been
expressed as:

.frame $30,0,$26,0
bis $31,$31,$0
ret $31,($26)

The qux() function is in fact compiled this way:
.align 2
.align 4
.globl qux
.ent qux
$qux..ng:
qux:
.frame $30,0,$26,0
.prologue 0
lda $0,1($31)
ret $31,($26),1
.end qux
This time, there is no useless adjusting of the stack pointer: the
qux() function simply sets the return-value register to 1 and then
returns. This is just two instructions long, which is the minimum.
.ident "GCC: (GNU) 3.3.3 (NetBSD nb3 20040520)"

I'm far from an assembler guru (I'm quite happy to let those of you
who are continue to make gobs of cash so that I never have to code in
it), but gcc seems to be pretty capable when you ask it to be. It
isn't what ICC will do for you (presumably) ...
Well, ICC does not generate Alpha assembly. :-) Aside from the
two odd extra instructions in main() (needlessly adjusting the stack
pointer in $30), this is pretty much minimal -- the qux() function
still has to exist, even though main() does not call it, in case
the object file is loaded into some other program (written in some
language other than C, presumably).
>... I'd be curious how gcc 4 performs.
Perhaps it elides the pointless stack adjustment in main().
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Oct 8 '06 #23
Ark wrote:
>...
If x is a pointer to an incomplete type, then (x + 1) is undefined.

You can't do pointer arithmetic on pointers to incomplete types.

Thanks. But I didn't ask if it is /legal/; I questioned the wisdom of it
being illegal - at today's level of compiler technology.
If
- an expression correctly evaluates to something sensible /regardless/
of the values of some of its terms, and
- these terms are known to be valid (although not known precisely)
then what's wrong with accepting such an expression?
...
What's legal and what's illegal is determined by the language standard.
In order to make something like this legal, you have to describe it in
the standard in a very specific way, so that different compilers behave
consistently in this respect. How are you suggesting this should be
described in the standard? How are you going to describe the variety of
expressions that should become legal for this reason? You cannot just
say "If your compiler technology is good enough to evaluate it, then
consider it legal" because it will make it implementation-dependent and
wildly different between different implementations, which is completely
unacceptable in C. You'll have to describe it in a very specific way so
that it is strictly pre-defined and completely independent from concrete
implementations. And if you go that way, then where are you going to
draw the line between legal and illegal? What if I write a program that
counts 'unsigned x, y, z;' solutions for 'x^3 + y^3 = z^3' equation - is
the compiler with all that modern technology supposed to recognize the
equation and optimize the code to simply return 0?
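
To make the thought experiment concrete, here is a sketch of such a program. It assumes positive values only and a small bound so that the cubes cannot wrap around; with x = 0 allowed, or with wrapping unsigned arithmetic, the count would not have to be zero. The name `count_cubic_solutions` and the bound are illustrative.

```c
#include <assert.h>

/* Counts triples 1 <= x, y, z <= limit with x^3 + y^3 == z^3.
   limit must be small enough that the cubes fit in unsigned long long. */
unsigned count_cubic_solutions(unsigned limit)
{
    unsigned count = 0;
    for (unsigned x = 1; x <= limit; x++)
        for (unsigned y = 1; y <= limit; y++)
            for (unsigned z = 1; z <= limit; z++) {
                unsigned long long lhs =
                    (unsigned long long)x * x * x +
                    (unsigned long long)y * y * y;
                unsigned long long rhs = (unsigned long long)z * z * z;
                if (lhs == rhs)
                    count++;
            }
    return count;
}

void check_count(void)
{
    assert(count_cubic_solutions(40) == 0);
}
```

Fermat's Last Theorem guarantees a result of 0 for any bound, but nothing obliges a compiler to know that; it is expected to emit the loops.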

--
Best regards,
Andrey Tarasevich

Oct 8 '06 #24
jacob navia wrote:
Ark wrote:
>A function
int foo(struct T *x)
{
return (x+1)-x;
}
should always return a 1, no matter how T is defined. (And int could
be replaced with ptrdiff_t for you pedants.)
It is amusing how stupid machines are.
....
3) If x is unsigned and equal to UINT_MAX, then the result is -x.
Which, of course, is 1.

--
Thad
Oct 9 '06 #25
Ark
Andrey Tarasevich wrote:
Ark wrote:
>>...
If x is a pointer to an incomplete type, then (x + 1) is undefined.

You can't do pointer arithmetic on pointers to incomplete types.
Thanks. But I didn't ask if it is /legal/; I questioned the wisdom of it
being illegal - at today's level of compiler technology.
If
- an expression correctly evaluates to something sensible /regardless/
of the values of some of its terms, and
- these terms are known to be valid (although not known precisely)
then what's wrong with accepting such an expression?
...

What's legal and what's illegal is determined by the language standard.
In order to make something like this legal, you have to describe it in
the standard in a very specific way, so that different compilers behave
consistently in this respect. How are you suggesting this should be
described in the standard? How are you going to describe the variety of
expressions that should become legal for this reason? You cannot just
say "If your compiler technology is good enough to evaluate it, then
consider it legal" because it will make it implementation-dependent and
wildly different between different implementations, which is completely
unacceptable in C. You'll have to describe it in a very specific way so
that it is strictly pre-defined and completely independent from concrete
implementations. And if you go that way, then where are you going to
draw the line between legal and illegal? What if I write a program that
counts 'unsigned x, y, z;' solutions for 'x^3 + y^3 = z^3' equation - is
the compiler with all that modern technology supposed to recognize the
equation and optimize the code to simply return 0?
Ideally, I would wish this behavior:
1. Evaluate the expression /symbolically/; in my example in the OP it is
to assume sizeof(struct T) is some number XXX.
2. During evaluation, make notes of the implied domain (e.g. as in Mr.
Navia's irrelevant reply, that a float term must be a non-NaN)
3. If the expression evaluates (symbolically) to something not
containing XXX (that is, is independent of XXX) and doesn't reduce the
natural domain of terms, accept it.
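
Step 1 can be illustrated at run time rather than at compile time: treat the element size as an unknown XXX, do the arithmetic in bytes, and observe that XXX cancels. This is a sketch of the algebra only, with invented names; it is not how any compiler evaluates the expression.

```c
#include <assert.h>
#include <stddef.h>

/* (x + 1) - x expressed in bytes for an element size of xxx:
   ((base + 1 * xxx) - base) / xxx.  xxx must be nonzero. */
size_t steps_for_size(size_t base, size_t xxx)
{
    size_t next = base + 1 * xxx;
    return (next - base) / xxx;
}

/* The result is 1 for every size tried, i.e. independent of XXX. */
void check_sizes(void)
{
    for (size_t xxx = 1; xxx <= 64; xxx++)
        assert(steps_for_size(0x1000, xxx) == 1);
}
```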

I do take your example to heart though - provided that the language can
express "solutions to an equation".
So I am forced to retract my original complaint. I seem to understand
that proving that something is independent of XXX is very hard even if
restricted to constant integer expressions.
Thank you for pointing out the rationale. (But I feel sad to have to
expose types needlessly.)
- Ark
Oct 9 '06 #26
we******@gmail.com wrote:
Ian Collins wrote:
jacob navia wrote:
Ark wrote:
>A function
>int foo(struct T *x)
>{
> return (x+1)-x;
>}
>should always return a 1, no matter how T is defined. (And int could
>be replaced with ptrdiff_t for you pedants.)
>
It is amusing how stupid machines are.
You know? You forget a semi colon and they get all screwed up.
>
Stupid isn't it?
>
1) If x is double. If x is a NAN or INFinity,
the result is not one but NAN.
But the example is only doing pointer arithmetic..

Setting a pointer beyond its boundaries apparently leads to undefined
behavior.
Not in this case, it doesn't. Look closely: the only pointer that is
computed is the one just beyond x. Presuming x was valid to begin with,
that's a pointer you are allowed to compute (but not to dereference).

Besides, undefined behaviour is _undefined_. The function is allowed to
return anything (or even not-anything) if passed an invalid or null
pointer; and 1 is a reasonable value for "anything".

Richard
Oct 9 '06 #27
rl*@hoekstra-uitgeverij.nl (Richard Bos) writes:
we******@gmail.com wrote:
>Ian Collins wrote:
jacob navia wrote:
Ark wrote:
A function
int foo(struct T *x)
{
return (x+1)-x;
}
should always return a 1, no matter how T is defined. (And int could
be replaced with ptrdiff_t for you pedants.)

It is amusing how stupid machines are.
You know? You forget a semi colon and they get all screwed up.

Stupid isn't it?

1) If x is double. If x is a NAN or INFinity,
the result is not one but NAN.

But the example is only doing pointer arithmetic..

Setting a pointer beyond its boundaries apparently leads to undefined
behavior.

Not in this case, it doesn't. Look closely: the only pointer that is
computed is the one just beyond x. Presuming x was valid to begin with,
that's a pointer you are allowed to compute (but not to dereference).
x is a function parameter; you have no idea what its value is going to
be (unless you're able to analyze the complete program). x could
*already* point just beyond some object; x could be a valid (but not
dereferenceable) pointer, but x+1 could be invalid.
Besides, undefined behaviour is _undefined_. The function is allowed to
return anything (or even not-anything) if passed an invalid or null
pointer; and 1 is a reasonable value for "anything".
Sure, a compiler *could* generate code that always returns 1, since
all possible cases that don't necessarily yield 1 invoke undefined
behavior. But it's not required (nor should it be).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Oct 9 '06 #28
On Sat, 07 Oct 2006 16:12:57 GMT, "Mike Wahler"
<mk******@mkwahler.net> wrote:
"pete" <pf*****@mindspring.com> wrote in message
news:45**********@mindspring.com...
Ark wrote:
<snip>
If x is a pointer to an incomplete type,
then (x + 1) doesn't evaluate to anything.

Right. About the only sensible retort from a compiler would
be what Paul Hogan said in "Crocodile Dundee": "One what?"

:-)
So you're suggesting C programmers are streetwalkers?

:-) :-) :-)

- David.Thompson1 at worldnet.att.net
Oct 23 '06 #29

"Dave Thompson" <da*************@worldnet.att.net> wrote in message
news:jv********************************@4ax.com...
On Sat, 07 Oct 2006 16:12:57 GMT, "Mike Wahler"
<mk******@mkwahler.net> wrote:
>"pete" <pf*****@mindspring.com> wrote in message
news:45**********@mindspring.com...
Ark wrote:
<snip>
If x is a pointer to an incomplete type,
then (x + 1) doesn't evaluate to anything.

Right. About the only sensible retort from a compiler would
be what Paul Hogan said in "Crocodile Dundee": "One what?"

:-)
So you're suggesting C programmers are streetwalkers?

:-) :-) :-)
Wasn't that where he had to check her for a kickstand? EC
Oct 24 '06 #30
