Horrible Visual C Bug!

Oliver Brausch

Hello,

Have you ever heard about this MS Visual C compiler bug?
Look at this small program:
static int x=0;
int bit32() {
return ++x;
}

int bit64() {
return bit32() + (bit32() << 1);
}

void main(int argc, char **argv) {
int i;
for (i = 0; i < 5; i++) printf("%d. %d\n", i, bit64());
}
Ok, the (correct) result is:

0. 2
1. 8
2. 14
3. 20
4. 26

This is what every compiled program prints, including one built by the
MS Visual C compiler with debug options or the /Ot fast option.

But do not dare to switch to the /O2 option of the MS Visual C compiler.
Then all at once your computer can no longer calculate:

0. 1
1. 7
2. 13
3. 19
4. 25

So, according to Microsoft, 0 + 2 = 1 ?????. That's why their OS is so stable....
Try increasing the "<< 1". It gets even worse.

Ever seen this? It cost me hours of debugging. Can I
sue Microsoft for this?

- Oliver Brausch

http://home.arcor.de/dreamlike

Nov 13 '05

Chris Torek

In article <bf**********@juniper.cis.uab.edu>
Robert Hyatt <hy***@crafty.cis.uab.edu> writes:

There should be _no_ difference in how a program runs, whether you use
"int main()" or "void main()". That only tells the compiler that the
program returns a value, that most operating systems use as the "return
or completion code".

Indeed. But what if this changes the calling sequence or function's
linker-level name?

The problem, you see, is that someone else -- over whom you have no
control -- has already written "int main()", in the code that
builds argc and argv and calls main() initially.

On typical boxes today, that code looks in part like this:

extern int main(int, char **);
...
exit(main(argc, argv));

Of course, the ANSI C standard requires that this work even if the
programmer wrote:

int main(void) {

so a less-typical box might actually have TWO startup routines, the
other looking like this:

extern int main(void);
...
exit(main());

Now suppose the C compiler uses a C++-style "name mangling" scheme
in order to distinguish between main(void) and main(int, char **).
The former compiles to 0$main$ and the latter to 2$i$ppc$main$.
At link time, this less-typical box scans the object files to see
whether the "main" symbol is 0$main$ or 2$i$ppc$main$, so as to
choose the correct call into main().

If you write something other than one of these two forms, then, your
program might fail to link.
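
A minimal sketch of the calling relationship described above, assuming a hypothetical startup routine. The names hypothetical_startup and sample_main are illustrative only and are not taken from any real runtime. It shows that the caller, not the programmer, fixes the expected return type: whatever the entry function hands back is what would be passed on to exit().

```c
/* Signature the hypothetical startup code was compiled against. */
typedef int (*entry_fn)(int, char **);

/* Stand-in for a program's main(): reports status 3 when no
 * argument follows the program name, 0 otherwise. */
static int sample_main(int argc, char **argv)
{
    (void)argv;               /* unused in this sketch */
    return argc > 1 ? 0 : 3;
}

/* Stand-in for the vendor's startup routine. Real startup code would
 * call exit() with this value; returning it keeps the sketch testable. */
static int hypothetical_startup(entry_fn entry, int argc, char **argv)
{
    return entry(argc, argv);
}
```

If the entry function were declared to return void instead, the startup routine would still read a return value, and nothing guarantees what it would find there.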

Suppose that, instead of (or in addition to) name-mangling, we have
a stack-oriented machine in which main()'s return value is popped
and pushed (or no-op'ed) from the value stack to be passed to
exit(). If the value stack is empty at the initial call to main(),
and you use "void main()" (and if the compiler is doing name-mangling
you somehow manage to make it link despite not finding the right
name), then the startup routine's call to exit() (or exit() itself)
attempts to pop a value off an empty stack. The result is a runtime
exception when main() returns.

The latter may sound farfetched, but something remarkably similar
does happen on real implementations today if you manage to write code
of the form:

struct S { char a[100]; };

struct S main(int argc, char **argv) {

Typically one writes this accidentally, by mistakenly leaving off
the ";" at the end of the first line, and using "default int"
declarations that turn out not to be default int after all:

struct S {
char a[100];
} /* NOTE MISSING SEMICOLON */

main(int argc, char **argv) {
... code here using argc and/or argv ...
}

In this case, a number of compilers assume you "really meant":

void main(struct S *__secret_arg1, int argc, char **argv) {
...
/* returning a value assigns to *__secret_arg1, then returns */
}

so that argc and argv wind up having "peculiar values". (In fact,
on these systems today, argv seems to be junk or to contain the
environment variable list, and argc is some huge number. Yes,
I have debugged just such a problem. :-) )
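
The hidden-argument convention described above can be sketched by hand. This is an illustration under the assumption that the implementation returns large structs through a caller-supplied pointer; the function names are hypothetical, and real ABIs differ in the details.

```c
#include <string.h>

struct S { char a[100]; };

/* What the programmer writes: a function returning a struct by value. */
static struct S make_filled(char fill)
{
    struct S s;
    memset(s.a, fill, sizeof s.a);
    return s;
}

/* Roughly what some compilers generate instead: the caller passes the
 * address of the result as a hidden first argument. */
static void make_filled_lowered(struct S *hidden_result, char fill)
{
    memset(hidden_result->a, fill, sizeof hidden_result->a);
}
```

Both forms produce the same bytes, but their argument lists differ by one position. That shift is why startup code that passes only argc and argv to a struct-returning main() leaves every parameter holding the wrong value.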

The point of all this is simple enough: if you write "int main()",
a big guy with a large bat is metaphorically standing right behind
your compiler-vendor, ready to whack him in the head if he makes
your code fail because of this. But, if you write "void main()",
the guy with the bat will not do anything to your compiler-vendor,
because you failed to keep *your* part of the bargain.

The "bargain" in question is the C standard, which has lists of
programming constructs and behaviors and says (not quite literally)
"if you do only these things, your vendor must do only those things."
As long as the things you need done are things *every* ANSI-C vendor
*must* do, you have little to lose by sticking to the standard.
You may or may not think you have much to gain, but history suggests
that in fact you *do* have something to gain.

In other words, if you write "int main()", you are quite probably
helping yourself, and if you write "void main()", you are quite
probably hurting yourself.

Note that this is *not* true for *other* places you (as a programmer)
might completely ignore the ANSI C standard. For instance, given
that this thread is (for some mysterious reason) cross-posted to
rec.games.chess.computer, you might have a chess game in which you
want to draw graphics in a window. ANSI C does not give you a way
to do this -- so the cost/benefit equation now swings heavily the
other way: "Cost of not using standard C: unknown, probably not
zero but probably not huge. Benefit: can achieve goal." Compare
that to "void main": "Cost: unknown, probably not zero but probably
not huge. Benefit: none."
--
In-Real-Life: Chris Torek, Wind River Systems (BSD engineering)
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://67.40.109.61/torek/index.html (for the moment)
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 13 '05 #101

Mark McIntyre

On Fri, 25 Jul 2003 20:34:29 +0000 (UTC), in comp.lang.c , Robert
Hyatt <hy***@crafty.cis.uab.edu> wrote:

In rec.games.chess.computer JeffK. <uc*@ftc.gov> wrote:
There are bugs in the compilers, and I've witnessed differences arising
from the use of void main vs int main.

There should be _no_ difference in how a program runs, whether you use
"int main()" or "void main()".

This might be true on the limited set of machines you've used. On some
of the ones I've used, void main() could crash the system.

The OS required a return code to be in a register, and so it took one
out. But your app didn't put it there, so it grabbed some garbage. The
OS then acted on that garbage, and you can imagine what happens if the
garbage happened to be the error code for "the computer room is on
fire, shut down the cluster at once" or "the CPU is overheating,
instigate emergency procedures". Or even if it was only "this program
caused a memory violation and crashed, so please remove all files it
created since they're garbage".
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---

Nov 13 '05 #102

Mark McIntyre

On Fri, 25 Jul 2003 17:03:58 -0400, in comp.lang.c , "Sin"
<br****@hotmail.com> wrote:

> What if the application in question is a Windows GUI app

These don't have a main(). Windows GUI apps are not ISO C, so the
discussion is moot.

> which returns
> control immediately to the calling batch file...? GUI applications usually
> signal error/success by visual and interactive means...

What, they signal to a batch visually? Amazing !!

> The return code of such applications often is of absolutely no importance.

Depends, doesn't it?

> Of course the ansi standard says you should write int main... But some
> compilers (hence programmers) do not abide by it... Does it make those
> compilers and programmers bad?

If the compiler documents the effect of the behaviour, then no
problem. If it does /not/ then the programmer who uses it is bad.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>

Nov 13 '05 #103

Ben Taylor

can you smoke beef?
how?

"Charlie Gibbs" <cg****@kltpzyxm.invalid> wrote in message
news:12*******************@kltpzyxm.invalid...

In article <bf**********@oravannahka.helsinki.fi> pa*****@cc.helsinki.fi
(Joona I Palaste) writes:
Emmanuel Delahaye <em**********@noos.fr> scribbled the following
on comp.lang.c:
In 'comp.lang.c', ob***@web.de (Oliver Brausch) wrote:

have you ever heard about this MS-visual c compiler bug?
look at the small prog:
<snip>
return ++x;

No, there is nothing wrong with "return ++x;" by itself. Not a missing
sequence point, anyway. Others have already explained the OP's problems
to him: (1) He's using void main(), undefined behaviour. (2) He's
calling bit32() twice without an intervening sequence point,
unspecified behaviour.

A little voice inside my head starts screaming "Side effects!"
whenever I see something like this. My solution is simple:
don't do it.
"You can pick your friends, you can pick your nose, but you can't pick
your relatives."
- MAD Magazine

"You can smoke beef, and you can smoke hash, but you can't smoke
corned beef hash." -- National Lampoon

--
/~\ cg****@kltpzyxm.invalid (Charlie Gibbs)
\ / I'm really at ac.dekanfrus if you read it the right way.
X Top-posted messages will probably be ignored. See RFC1855.
/ \ HTML will DEFINITELY be ignored. Join the ASCII ribbon campaign!

Nov 13 '05 #104

SenderX

> can you smoke beef?

how?

Try using, a smoker.

--
The designer of the experimental, SMP and HyperThread friendly, AppCore
library.

http://AppCore.home.comcast.net

Nov 13 '05 #105

Morris Dovey

Ben Taylor wrote:

can you smoke beef?

You can; but they're really difficult to light.

--
Morris Dovey
West Des Moines, Iowa USA
C links at http://www.iedu.com/c

Nov 13 '05 #106

Falcon Kirtarania

The relevance is intact, because the MS Visual Studio C++ .NET compiler will
also compile C.

"Bruce Wheeler" <bs*********@hotmail.com> wrote in message
news:3f*************@news.siemens.at...

On Fri, 25 Jul 2003 00:32:01 GMT, "Carsten Hansen"
<ha******@worldnet.att.net> wrote:

"Mark McIntyre" <ma**********@spamcop.net> wrote in message
news:vd********************************@4ax.com...
On Thu, 24 Jul 2003 16:02:09 GMT, in comp.lang.c , "Carsten Hansen"
<ha******@worldnet.att.net> wrote:

>"Mark McIntyre" <ma**********@spamcop.net> wrote in message
>news:ph********************************@4ax.com...
>> On Wed, 23 Jul 2003 18:07:45 -0400, in comp.lang.c
>.
>>MS products do NOT document it, in fact the
>> reverse, they state that main must return an int.

>I'm not in favor of void main. However, I don't think your statement is
>correct.
>Here is one page of Microsoft's documentation:
>
>---------------
>A special function called main is the starting point of execution for all
>C and C++ programs.

I'd agree with you, if the documentation said that. Instead it says "A
special function called main is the entry point to all C++ programs".

I'm quoting from the MSVC 6 docs.
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>

I'm quoting from the online documentation to Microsoft Visual Studio .NET
2003. So, apparently Microsoft has improved their documentation.

Carsten Hansen

Maybe someday Microsoft will get their documentation correct :-(

That section is in the VS.NET C++ Language Reference, so its relevance
to C programs is questionable.

However, it appears that they really intended the section to apply to
both C and C++, since the corresponding section in the C Language
Reference from VC6 has been removed for VS.NET.

Going back to what Mark McIntyre wrote
and secondly, if void main() /is/ defined, the compiler writer is
obliged to document it. MS products do NOT document it, in fact the
reverse, they state that main must return an int. The fact that their
sample code doesn't do that is merely bad quality control.

The change appears to be another case of bad quality control. If the
section is meant to apply to both languages, it should be in a section
common to both languages (the infamous C/C++), or the section should be
repeated in the C Language Reference.

Regards,
Bruce Wheeler

Nov 13 '05 #107

Falcon Kirtarania

"Christian Bau" <ch***********@cbau.freeserve.co.uk> wrote in message
news:ch*********************************@slb-newsm1.svr.pol.co.uk...

In article <3f******@news2.power.net.uk>,
Richard Heathfield <in*****@address.co.uk.invalid> wrote:
I occasionally conduct recruitment interviews for C programmers. At such
interviews, I always ask the programmer what main() returns. If he gets the
answer wrong, which does sometimes happen, it is unlikely that he will get
the job.
On the other hand, imagine you are in a job interview and you are asked:
What will this statement do?

i = 3;
a [i++] = i;

I recommend that you answer: It could store the number 3 or the number 4
into a [3]. Anything beyond that and you might confuse the interviewer.

Theoretically, it would execute:

LINE 1 WATCH: i == 3 true
LINE 2: a[3]==3, and at the end i=4

because i++ is a postincrement and takes place at the end of the line,
doesn't it? Or is it the statement?

(These two possibilities alone would be enough to make you avoid a
statement like this, obviously. But don't mention that your program
could crash or worse because of this).

Nov 13 '05 #108

Falcon Kirtarania

It said anyway, that it was in fact good code to have void and it would
simply return 0 or something like that. Which is another example of how
Bill Gates fucked up the computing industry yet again.

"Mark McIntyre" <ma**********@spamcop.net> wrote in message
news:c5********************************@4ax.com...

On Thu, 24 Jul 2003 08:03:37 +0000, in comp.lang.c , Richard
Heathfield <in*****@address.co.uk.invalid> wrote:
Is it really? Quite a few people have looked for Microsoft's C documentation
of the effect of void main. I don't think anyone has found it yet.
(and someone responded with a quote from the MSVC 6 helpfile,
mentioning void main() but I managed to lose the post)

I was about to say "interesting, so MS finally documented it" but then
I noticed that this section begins "A special function called main is
the entry point to all C++ programs".
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>

Nov 13 '05 #109

Falcon Kirtarania

Theoretically, any undefined behavior should return an error (as in, the
standard should change). This might help solve the troubles of awful
programming.
"Christian Bau" <ch***********@cbau.freeserve.co.uk> wrote in message
news:ch*********************************@slb-newsm1.svr.pol.co.uk...

In article <el***********************@news1.calgary.shaw.ca>,
"Falcon Kirtarania" <cm****@shaw.ca> wrote:
Which would explain exactly why Microsoft programs are so poor: their
compiler accepts a program return type that is theoretically seen as
nonsense to the OS and therefore the program should raise many errors when
run. However, their own programmers are not aware of that standard or their
sample code would involve no void main(). However, it must be legal in
Windows because the compiler somehow manages to compile it on that
platform. I wonder, if it was compiling for unix, would it be allowed to
compile void main()?

Good question. Having a main () function declared as "void main ()"
invokes undefined behavior. That means that according to the C Standard,
anything could happen. And when I say anything, I mean _absolutely
anything_. If you compile and run this program:

void main () { printf ("Hello, world\n"); }

then it could happen that your computer explodes, or your harddisk gets
formatted, and you can't complain that your compiler is not a Standard C
compiler. (You still can complain that a common mistake like that
shouldn't explode your computer, but you can't complain that the
compiler is not conforming to the C Standard. )

You can check the documentation for your compiler. Maybe it defines what
will happen; if the C Standard leaves something undefined then any
compiler is allowed to define it. Maybe your compiler refuses to compile
the program; I would say that would be a very sensible approach. Maybe
your program crashes as soon as you start it, maybe it crashes just when
it finishes. Maybe the operating system puts up an alert that says:
"Warning: Program xxxx seems to be broken. Please contact the
manufacturer of this program for further advice. ". Anything could
happen.

Nov 13 '05 #110

Falcon Kirtarania

"Robert Hyatt" <hy***@crafty.cis.uab.edu> wrote in message
news:bf**********@juniper.cis.uab.edu...

In rec.games.chess.computer JeffK. <uc*@ftc.gov> wrote:

Oliver Brausch wrote:

I would rather fire somebody who always loses focus over such
really unimportant points. BTW, I always use int main() as you
can see in my programs. But for this example it is absolutely
irrelevant.
There are bugs in the compilers, and I've witnessed differences arising
from the use of void main vs int main.

There should be _no_ difference in how a program runs, whether you use
"int main()" or "void main()". That only tells the compiler that the
program returns a value, that most operating systems use as the "return
or completion code". IE exit(0); returns a zero completion code. As
does "return 0" in the main program. int or void should have _no_ effect
on how the program executes, however, it just affects whether the system
can determine whether it executed normally or not.

That is why such code is substandard. Perhaps, what the standard is getting
at is it is far better to return an error code than exit "normally" (or
assume that) regardless of what happened.
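
As a sketch of what returning a meaningful completion code can look like, the functions below map the outcome of a small task to the standard EXIT_SUCCESS and EXIT_FAILURE values. The task itself (parsing a number) and the names parse_number and status_for are assumptions chosen for illustration, not code from this thread.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical unit of work: parse a decimal number from text.
 * Returns 0 on success, nonzero on failure. */
static int parse_number(const char *text, long *out)
{
    char *end;
    errno = 0;
    *out = strtol(text, &end, 10);
    return (errno != 0 || end == text || *end != '\0') ? 1 : 0;
}

/* The status an int main() would return for this input. */
static int status_for(const char *text)
{
    long value;
    return parse_number(text, &value) == 0 ? EXIT_SUCCESS : EXIT_FAILURE;
}
```

A conforming int main(int argc, char **argv) could check argc and then finish with "return status_for(argv[1]);", so that whatever launched the program can tell success from failure.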

I've seen code compile on one compiler that would not on another. A lot of
the problems lie in how the compiler is designed.
Not all problems arise from deviation from language specs. Memory
allocation and error handling can be extremely different from one system to
another, such that the problem of portability, as noted above, can arise.
For those who always blame the programmer, my advice is to start using
other products. Someday you'll need employment, and there's lots of
older code that needs upgrading or, you'll use it as is and need to
figure out how to get it to work on a newer computer. When you work on such
projects, you become aware of vendor product shortcomings.

The issue I see with your project is that a more informative warning
routine would be helpful, such that you would be put on notice that
what you are doing is treading on dangerous ground. If you did turn on
all warnings and were unaware of what was there, a vendor criticism is
in order.
If you did not turn on your warnings, then doing so is good practice.

I don't think any vendor criticism is in order. In C, you can do _many_
things that a compiler can't verify whether it is right or wrong. IE you
can pass an integer to a function that requires a float. If the compiler
doesn't see both the caller and the callee, or a prototype for the callee,
then it _must_ assume you know what you are doing. If it does see it,
but you re-cast the int to a float, it is not going to work, and the
compiler should not complain since the cast operator indicates you know
what you are doing.
Such casts should generate a level 4 (highest level) warning anyway, just to
mention it, unless in very special cases such as when the types are like
char to int, AND the (int)char is not being written to above 127.
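
The point about prototypes can be illustrated with a short sketch. It assumes nothing beyond standard C; the function names half and half_of_three are made up for the example. With a prototype in scope, the compiler knows the parameter type and converts the argument itself.

```c
/* Prototype visible before the call: the compiler knows the parameter
 * type and converts the int argument 3 to the double 3.0. */
static double half(double value);

static double half_of_three(void)
{
    return half(3);
}

static double half(double value)
{
    return value / 2.0;
}
```

Without a visible prototype, the same call would pass the int unconverted, and the callee would read it as a double. The Standard does not define what happens then, and the compiler is not required to say anything about it.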

If you fail to follow normal programming practices, and it blows up when
you run the thing, that's hardly something to whack the vendor about. After
all you _can_ put your foot under a running lawnmower, but should you do
so, you don't have much reason to complain about the result. This is the
same kind of thing. You _can_ do some things, but the question is _should_
you do them and if you do, who is responsible?
Where the standard is not specified, a decent compiler will logically extend
the syntax and normal operation to include a special case.

The bottom line in programming though is to keep the code simple. The
surprise you note is not really that uncommon when you start
experimenting and not something you want to have to deal with on a project
that has a deadline to meet.

But keep experimenting, and seeing what's out there!

JeffK.

Some people claim that the result I wrote is not correct at all.
Of course, it was "return x++" instead; I forgot that in the
posting. Another point that is not important to the problem at all.

> > int bit64() {
> > return bit32() + (bit32() << 1);
> > }
>
> You try to change a variable twice, which is invalid, ie. undefined
> behaviour. Microsoft's compiler is absolutely right.

Ok, this is correct. I was of the opinion that such statements are
read from left to right. But that seems to be wrong, even if most compilers
would do it like this. Ok, my fault, I have learned it now (even
before reading all these flames.)
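
Both sets of results from the original post can be reproduced by making the evaluation order explicit. The sketch below assumes the post-increment version of bit32() mentioned above; the names bit64_left_first and bit64_right_first are invented for the illustration. Storing each call in a named temporary is also the usual way to get one fixed order on every compiler.

```c
static int x = 0;

/* Post-increment, matching the printed results in the original post. */
static int bit32(void)
{
    return x++;
}

/* Left operand evaluated first: the order the original poster expected. */
static int bit64_left_first(void)
{
    int lo = bit32();
    int hi = bit32();
    return lo + (hi << 1);
}

/* Right operand evaluated first: equally permitted for the
 * single-expression form bit32() + (bit32() << 1). */
static int bit64_right_first(void)
{
    int hi = bit32();
    int lo = bit32();
    return lo + (hi << 1);
}
```

Starting from x == 0, the first function yields 2, 8, 14 and so on, and the second yields 1, 7, 13, which are the two columns shown in the original post.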

So thanks for the support.

Oliver

--
Robert Hyatt Computer and Information Sciences
hy***@cis.uab.edu University of Alabama at Birmingham
(205) 934-2213 115A Campbell Hall, UAB Station
(205) 934-5473 FAX Birmingham, AL 35294-1170

Nov 13 '05 #111

Falcon Kirtarania

Really, all it comes down to is that unless you really don't give a shit
what is in EAX after your program executes, you damn well better return int.
Theoretically, within standard, it could simply not set EAX on returning
void as it does for everything else. Then you would end up with return
codes that are pseudorandom numbers.

"Chris Torek" <no****@elf.eng.bsdi.com> wrote in message
news:bf**********@elf.eng.bsdi.com...

In article <bf**********@juniper.cis.uab.edu>
Robert Hyatt <hy***@crafty.cis.uab.edu> writes:
There should be _no_ difference in how a program runs, whether you use
"int main()" or "void main()". That only tells the compiler that the
program returns a value, that most operating systems use as the "return
or completion code".
Indeed. But what if this changes the calling sequence or function's
linker-level name?

<snip>

Nov 13 '05 #112

Joona I Palaste

Falcon Kirtarania <cm****@shaw.ca> scribbled the following
on comp.lang.c:

Really, all it comes down to is that unless you really don't give a shit
what is in EAX after your program executes, you damn well better return int.
Theoretically, within standard, it could simply not set EAX on returning
void as it does for everything else. Then you would end up with return
codes that are pseudorandom numbers.

Do you really still think that all the world's a Wintel box?

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ---------------------------\
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++|
| http://www.helsinki.fi/~palaste W++ B OP+ |
\----------------------------------------- Finland rules! ------------/
"As we all know, the hardware for the PC is great, but the software sucks."
- Petro Tyschtschenko

Nov 13 '05 #113

Richard Heathfield

[Followups set to comp.lang.c]

Falcon Kirtarania wrote:

"Christian Bau" <ch***********@cbau.freeserve.co.uk> wrote in message
news:ch*********************************@slb-newsm1.svr.pol.co.uk...

On the other hand, imagine you are in a job interview and you are asked:
What will this statement do?

i = 3;
a [i++] = i;

I recommend that you answer: It could store the number 3 or the number 4
into a [3]. Anything beyond that and you might confuse the interviewer.
Theoretically, it would execute:

LINE 1 WATCH: i == 3 true
LINE 2: a[3]==3, and at the end i=4

Theoretically, it /could/ do that, at least partly on the grounds that,
theoretically, it could do anything at all.
because i++ is a postincrement and takes place at the end of the line,
doesn't it? Or is it the statement?

The behaviour is undefined because it violates a "shall" outside a
constraint, just like void main. The relevant Standard text (3.3 in C89,
6.5(2) in C99), is: "Between the previous and next sequence point an object
shall have its stored value modified at most once by the evaluation of an
expression. Furthermore, the prior value shall be accessed only to
determine the value to be stored."

The code

a[i++] = i; /* bug */

violates that "shall" clause, invoking UB. No diagnostic is required.
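
For comparison, here is a sketch of two well-defined ways to write whichever of the two results was intended. Each one modifies i in a separate statement, so there is a sequence point between the increment and the store. The function names are hypothetical.

```c
/* Store the old value of i at the old index, then advance i. */
static void store_then_advance(int *a, int *i)
{
    a[*i] = *i;
    ++*i;
}

/* Advance i first, then store the new value at the old index. */
static void advance_then_store(int *a, int *i)
{
    int old = *i;
    ++*i;
    a[old] = *i;
}
```

With i == 3 on entry, the first leaves a[3] == 3 and the second leaves a[3] == 4. Both leave i == 4, and neither depends on how a particular compiler orders the side effect.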

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #114

Richard Heathfield

[Followups set to comp.lang.c]

Falcon Kirtarania wrote:

> Really, all it comes down to is that unless you really don't give a shit
> what is in EAX after your program executes, you damn well better return
> int.

If you really understood, you'd know that not all machines /have/ a register
called EAX. Furthermore, the Standard doesn't mandate that a return value
be stored in a register at all.

> Theoretically, within standard, it could simply not set EAX on
> returning void as it does for everything else. Then you would end up with
> return codes that are pseudorandom numbers.

Theoretically, it could do anything at all, as far as the Standard is
concerned. That's what "undefined behaviour" means. Returning pseudorandom
numbers would be a relatively harmless outcome compared to some I can
imagine.

<snip>

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #115

Martien Verbruggen

On Sun, 27 Jul 2003 04:31:57 GMT,
Falcon Kirtarania <cm****@shaw.ca> wrote:

"Christian Bau" <ch***********@cbau.freeserve.co.uk> wrote in message
news:ch*********************************@slb-newsm1.svr.pol.co.uk...

On the other hand, imagine you are in a job interview and you are asked:
What will this statement do?

i = 3;
a [i++] = i;

I recommend that you answer: It could store the number 3 or the number 4
into a [3]. Anything beyond that and you might confuse the interviewer.

Theoretically, it would execute:

LINE 1 WATCH: i == 3 true
LINE 2: a[3]==3, and at the end i=4

because i++ is a postincrement and takes place at the end of the line,
doesn't it? Or is it the statement?

No, it doesn't, at least not necessarily. This subject is discussed at
least once per week in comp.lang.c, so I suggest you use
group.google.com to find a few of those discussions, and also that you
have a look at the C FAQ, questions 3.1 to 3.3, where this is discussed.

Martien
--
|
Martien Verbruggen | Failure is not an option. It comes bundled
| with your Microsoft product.
|

Nov 13 '05 #116

Richard Heathfield

Joona I Palaste wrote:

<snip>

BTW, aren't all implementation-specific extensions such as the Windows
API also undefined behaviour?

Depends on your definition of "undefined". Personally, I agree that
extensions effectively invoke undefined behaviour. I seem to recall that
some people disagree with me. The name "Doug Gwyn" springs to mind, but I
may have disremembered.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #117

Bruce Wheeler

A: Because it makes your article more difficult to read.

Q: Why shouldn't I top-post?

On Sun, 27 Jul 2003 04:28:47 GMT, "Falcon Kirtarania"
<cm****@shaw.ca> wrote:

The relevance is intact, because the MS Visual Studio C++ .NET compiler will
also compile C.

The relevance of what to what is intact?

VS.NET will compile C and C++, depending on the settings you
provide.

However, C is not C++, just as C++ is not C.

Regards,
Bruce Wheeler

Nov 13 '05 #118

Kevin Easton

Joona I Palaste <pa*****@cc.helsinki.fi> wrote:

Richard Heathfield <do******@address.co.uk.invalid> scribbled the following
on comp.lang.c:
[Followups set to comp.lang.c]

Falcon Kirtarania wrote:

Theoretically, any undefined behavior should return an error (as in, the
standard should change). This might help solve the troubles of awful
programming.

Feel free to write a C compiler which diagnoses any and all instances of
undefined behaviour. Consider whether it should generate a diagnostic for
this code. If so, why? If not, why not?

int foo(int *a, int *b)
{
return *a = *b++;
}

<snip>

Well, I believe that the problem here could be solved by using run-time
checks for undefined behaviour. That would make the generated code quite
slow, but it would still work correctly.

Many instances of undefined behaviour are there specifically to absolve
the implementation of the responsibility of detecting them - forcing the
implementation to detect them anyway makes the reason they were
undefined in the first place moot.
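
A small sketch of why foo() above is awkward to diagnose, assuming nothing beyond the function as posted: the body is the same in every case, and whether the behaviour is defined depends entirely on what the caller passes. The caller below is hypothetical and only shows a valid use.

```c
/* The function from the quoted post, unchanged. */
int foo(int *a, int *b)
{
    return *a = *b++;
}

/* Hypothetical caller: both pointers refer to live, distinct objects,
   so this particular call is well defined. */
int call_foo_with_valid_pointers(void)
{
    int src[2] = {7, 9};
    int dst = 0;
    return foo(&dst, src);
}
```

Pass a null or dangling pointer to the same foo() and the behaviour is undefined, yet a compiler translating foo() on its own has no way to see which kind of caller it will get.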

- Kevin.

Nov 13 '05 #119

jacob navia

"Richard Heathfield" <do******@address.co.uk.invalid> wrote in message
news:bf**********@titan.btinternet.com...

Joona I Palaste wrote:

<snip>
BTW, aren't all implementation-specific extensions such as the Windows
API also undefined behaviour?

Depends on your definition of "undefined". Personally, I agree that
extensions effectively invoke undefined behaviour.

Yes. When adding extensions to lcc-win32 I have followed explicitly undefined behaviour to avoid
being incompatible with existing code.

For instance this extension is undefined behavior in standard C:

/* Define a new operator addition for a user defined type of numbers */
int operator+(Number a,Number b) { ... }

No legal program can use that, so it is a compatible extension. I took pains that

int operator = 5;

still works, of course.

As far as the standard goes, extensions are not forbidden. They just should not introduce new
keywords in contexts where they would invalidate existing code. Existing code that writes standard C
like:

int operator = 5;

should always compile what is intended.

Syntax extensions should avoid the user name space. Microsoft proposed under windows
__try { /* guarded code block */ }
__except( /* integer expression */ ) {
/* exception code block */
}
Since no legal C program can be written like this, this is a compatible extension. Lcc-win32
followed that proposal.

Another Microsoft extension that is necessary under windows is
__declspec(dllexport)
to indicate to the linker to export that symbol in the export table of the DLL being compiled.
Again, the user name space was preserved.

More difficult to follow was __int64 for long long. In this case lcc-win32 added
#define __int64 long long
automatically at startup to be able to compile code that uses that symbol. Since it wasn't in the
legal namespace anyway I think it is OK.

More problematic were the windows API headers. They are huge, and lcc-win32 was forced to provide an
ANSI C version, avoiding the dreaded

asm { /* a lot of assembly in microsoft syntax */}

that polluted so many headers. This has gotten better now, and many SDK headers are compliant.
Still, sometimes I wonder what does it mean:
typedef struct tagWindowsStruct {
...
DWORD bitfield:2;
}

a DWORD is an unsigned long, which really doesn't fit into 2 bits... Why not use
unsigned bitfield:2;

instead ???

Nov 13 '05 #120

Arthur J. O'Dwyer

On Sun, 27 Jul 2003, Joona I Palaste wrote:

Kevin Easton <kevin@-nospam-pcug.org.au> scribbled the following:
Joona I Palaste wrote:

Well, I believe that the problem here could be solved by using run-time
checks for undefined behaviour. That would make the generated code quite
slow, but it would still work correctly.

Many instances of undefined behaviour are there specifically to absolve
the implementation of the responsibility of detecting them - forcing the
implementation to detect them anyway makes the reason they were
undefined in the first place moot.

Then, if we want to achieve what Falcon is suggesting, let's leave the
definition of undefined behaviour as it is, and instead change the
standard from saying "such and such results in undefined behaviour" to
"such and such results in an error". Whether we *do* want to achieve it
is another question.

Simple answer: Of course we don't want to achieve it!

Complex answer: Detecting all UB at runtime (or any time) is
computationally equivalent to solving the halting problem, which
isn't something we want to foist on compiler vendors. ;)

Dumb answer: Are there any classes of UB that *are* feasible
to detect at compile-time or runtime? 'foo main', where 'foo'!='int',
looks easy to me. I can't think of any others off the top of my
head.

-Arthur

Nov 13 '05 #121

Arthur J. O'Dwyer

On Sun, 27 Jul 2003, jacob navia wrote:

When adding extensions to lcc-win32 I have followed explicitly
undefined behaviour to avoid being incompatible with existing code.

For instance this extension is undefined behavior in standard C:

/* Define a new operator addition for a user defined type of numbers */
int operator+(Number a,Number b) { ... }

No legal program can use that, so it is a compatible extension. I took
pains that

int operator = 5;

still works, of course.

And, presumably,

int foo, *bar;
....
void baz()
{
int operator=(foo *bar[5]);
...
}

does the Right Thing(tm) as well. I'm glad it's you implementing that
sort of thing, and not me. :)

As far as the standard goes, extensions are not forbidden. They just
should not introduce new keywords in contexts where they would
invalidate existing code. Existing code that writes standard C like:

int operator = 5;

or even more pathological cases,

should always compile what is intended.

More difficult to follow was __int64 for long long. In this case
lcc-win32 added
#define __int64 long long
automatically at startup to be able to compile code that uses that symbol.
Since it wasn't in the legal namespace anyway I think it is OK.

It is okay, AFAICT. But why would __int64 be any harder to add than
any other extension, if you don't mind my asking?

More problematic were the windows API headers. They are huge, and
lcc-win32 was forced to provide an ANSI C version, avoiding the dreaded

asm { /* a lot of assembly in microsoft syntax */}

that polluted so many headers. This has gotten better now, and many
SDK headers are compliant. Still, sometimes I wonder what does it mean:

typedef struct tagWindowsStruct {
...
DWORD bitfield:2;
}

a DWORD is an unsigned long, which really doesn't fit into 2 bits...
Why not use
unsigned bitfield:2;

No reason. This looks like yet another MS extension, and I *think*
that the code as it stands invokes undefined behavior, so it's fair
game for an extension. Presumably this allows

DWORD wide_bitfield:20;

(or any large number of bits), so the implementors just used DWORD
everywhere for consistency.

-Arthur

Nov 13 '05 #122

jacob navia

Dumb answer: Are there any classes of UB that *are* feasible
to detect at compile-time or runtime? 'foo main', where 'foo'!='int',
looks easy to me. I can't think of any others off the top of my
head.

Pointers

The set of all pointers in the program is initialized at startup. They are either NULL or they point
to valid addresses, established in the raw data of the program. For instance:

int a,*pint = &a;

The set of valid addresses is established by the compiler at startup. When control arrives at main()
all pointers are valid.

Undefined behavior (as far as pointers are concerned) is when any pointer is used that
1) Has not been initialized to point to an existing valid object.
or
2) Is NULL or points somewhere else than

Object start address <= p < (start address)+sizeof(Object)

We can distinguish two types of pointers:

A) Unbounded pointers, i.e. pointers where the calculation of sizeof(Object) is impossible
B) Bounded pointers where sizeof(Object) is known and can be checked at run time.

Detecting this class of UB is called bounds checking and is done in many languages.
C is notoriously lacking this facility. Worse, the machine is not used to automatically test the
programmer's assumptions and all pointers are considered unbounded.

Lisp, APL, and many other languages check array accesses and avoid memory corruption. C doesn't, and
we are plagued by memory corruption and obscure bugs.

An improvement would be to encourage the automatic checking of object accesses and discouraging the
usage of unbounded pointers. Instead of writing:

void matmult(int n,int m, double *pmat)

we would write:

void matmult(int n, int m, double mat[n][m]);

Such a prototype would allow checking in the calling program that the buffer passed has enough space as
declared, and in the called function it would be possible to check that no index is being misused.
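
A rough sketch of what the second prototype already gives you in C99, assuming a compiler that supports variable length arrays. The helper names row_length and mat_get are invented for this example. Note that only the inner dimension m becomes part of the parameter's type; the outer dimension n is still just a number the caller promises is right.

```c
#include <stddef.h>

/* Recovers the row length from the parameter's own type:
   mat[0] has type double[m], so sizeof is computed at run time. */
size_t row_length(int n, int m, double mat[n][m])
{
    (void)n; /* the outer dimension is not part of the adjusted type */
    return sizeof mat[0] / sizeof mat[0][0];
}

/* Checked read: returns 1 and stores mat[i][j] in *out when both
   indexes are in range, returns 0 and leaves *out alone otherwise. */
int mat_get(int n, int m, double mat[n][m], int i, int j, double *out)
{
    if (i < 0 || i >= n || j < 0 || j >= m)
        return 0;
    *out = mat[i][j];
    return 1;
}
```

The check in mat_get is still written by hand; nothing here makes the compiler insert it automatically, which is the facility being asked for.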

Most of this attitude comes because the Pascal language has this facility, and many C people see
Pascal as something quite horrible.

I think that was a good feature of Pascal. I miss this in C and I see each day the consequences in
array overruns, obscure bugs, and many other problems. Encouraging the use of bounded pointers would
introduce some hygiene, wouldn't it?

Nobody is proposing banning unbounded pointers. They should remain for special uses or in old
software. Encouraging the use of bounded pointers will make them slowly less and less frequent,
that's all.

The sizeof calculation is very problematic in C because the standards committee refused to pass this
information in array prototypes. Of course this has historical reasons, but I just do
not understand why in 2003 we still want to save us the few machine cycles that that would cost, and
spare the users and the programmers the stack overruns, memory corruption and other problems!

In C, an array decays to an unbounded pointer when passed to a subroutine. None of the sizeof
information is passed along. This is (maybe) efficient but it is a problem for checking the bounds
of array indexes!
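
To make the idea concrete, here is a minimal sketch of a bounded pointer done by hand in today's C. The struct and function names are hypothetical, not part of lcc-win32 or any standard library; the point is just that the size travels with the pointer so the callee can check it.

```c
#include <stddef.h>

/* Hypothetical "bounded pointer": a pointer bundled with its element count. */
struct bounded_doubles {
    double *data;
    size_t  count;
};

/* Checked read: returns 1 and stores data[index] in *out when the index
   is in range, returns 0 when the pointer is NULL or the index is not. */
int bounded_get(struct bounded_doubles b, size_t index, double *out)
{
    if (b.data == NULL || index >= b.count)
        return 0;
    *out = b.data[index];
    return 1;
}
```

The cost is one comparison per access, which is the few machine cycles mentioned above.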

jacob

Nov 13 '05 #123

pete

Richard Heathfield wrote:

Joona I Palaste wrote:

<snip>
BTW, aren't all implementation-specific
extensions such as the Windows
API also undefined behaviour?

Depends on your definition of "undefined". Personally, I agree that
extensions effectively invoke undefined behaviour.
I seem to recall that some people disagree with me.
The name "Doug Gwyn" springs to mind, but I
may have disremembered.

The consensus of comp.std.c,
was that a program which calls a clear screen function
exhibits behavior which is not defined by the standard,
and which is also not considered to be undefined behavior.
I don't understand the importance of the distinction.

--
pete

Nov 13 '05 #124

Richard Heathfield

pete wrote:

Richard Heathfield wrote:
<snip>
[...] Personally, I agree that
extensions effectively invoke undefined behaviour.
I seem to recall that some people disagree with me.
The name "Doug Gwyn" springs to mind, but I
may have disremembered.

The consensus of comp.std.c,
was that a program which calls a clear screen function
exhibits behavior which is not defined by the standard,
and which is also not considered to be undefined behavior.
I don't understand the importance of the distinction.

I believe you're right about the csc consensus. I'm afraid I am just as much
in the dark over the difference between undefined behaviour and behaviour
which is not defined. I think I'm correct in saying that the committee sees
the two words "undefined behaviour" as being a key term with a particular
meaning, the meaning being, of course, the "this Standard imposes no
requirements" thing. But since the Standard imposes no requirements on
extensions, either, I continue to fail to see the distinction.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #125

Steve Zimmerman

Falcon Kirtarania wrote:

Theoretically, any undefined behavior should return an error (as in, the
standard should change). This might help solve the troubles of awful
programming.
"Christian Bau" <ch***********@cbau.freeserve.co.uk> wrote in message
news:ch*********************************@slb-newsm1.svr.pol.co.uk...
In article <el***********************@news1.calgary.shaw.ca >,
"Falcon Kirtarania" <cm****@shaw.ca> wrote:

Which would explain exactly why Microsoft programs are so poor: their
compiler accepts a program return type that is theoretically seen as
nonsense to the OS and therefore the program should raise many errors when
run. However, their own programmers are not aware of that standard or their
sample code would involve no void main(). However, it must be legal in
Windows because the compiler somehow manages to compile it on that platform.
I wonder, if it was compiling for unix, would it be allowed to compile void
main()?

Good question. Having a main () function declared as "void main ()"
invokes undefined behavior. That means that according to the C Standard,
anything could happen. And when I say anything, I mean _absolutely
anything_. If you compile and run this program:

void main () { printf ("Hello, world\n"); }

then it could happen that your computer explodes, or your harddisk gets
formatted, and you can't complain that your compiler is not a Standard C
compiler. (You still can complain that a common mistake like that
shouldn't explode your computer, but you can't complain that the
compiler is not conforming to the C Standard. )

You can check the documentation for your compiler. Maybe it defines what
will happen; if the C Standard leaves something undefined then any
compiler is allowed to define it. Maybe your compiler refuses to compile
the program; I would say that would be a very sensible approach. Maybe
your program crashes as soon as you start it, maybe it crashes just when
it finishes. Maybe the operating system puts up an alert that says:
"Warning: Program xxxx seems to be broken. Please contact the
manufacturer of this program for further advice. ". Anything could
happen.

So my question is this: You have Microsoft code (not conforming to the
C standard); you have Linux code (which shitheads on this place say
doesn't conform to the C standard); what _does_ conform to the C
standard? The fucking document itself? There's standard and there's
real world. Standards are important and I like them, but the standards
inform the real world code _and_ real world code informs standards.
Standards are meant to be helpful.

Nov 13 '05 #126

Arthur J. O'Dwyer

On Sun, 27 Jul 2003, jacob navia wrote:

"Arthur J. O'Dwyer" wrote...
Why not use
unsigned bitfield:2;

No reason. This looks like yet another MS extension, and I *think*
that the code as it stands invokes undefined behavior, so it's fair
game for an extension.

The standard just says int/unsigned int as possible types for a bit field.
I added long/unsigned long, short, and even char. This makes the compiler
more usable but strictly speaking the standard says int/unsigned.

Presumably this allows

DWORD wide_bitfield:20;

(or any large number of bits), so the implementors just used DWORD
everywhere for consistency.

Yes, but "unsigned" would do the trick too... And since you are specifying
the number of bits, there are no 64 bit portability issues!

I don't know, but I always assumed that the number following the colon in
a bit-field had to be less than or equal to the number of bits in an int.
20 isn't less than that on some implementations. Make that example

DWORD wide_bitfield:48;

if you like; then the same thing applies (and making it
'int wide_bitfield:48' wouldn't work, I think).
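
A small sketch of the two facts in play here, using only plain unsigned int so that it stays within what the standard guarantees. The names are invented for illustration, and the width calculation assumes unsigned int has no padding bits, which holds on common desktop implementations but is not promised by the standard.

```c
#include <limits.h>
#include <stddef.h>

/* A 2-bit unsigned bit-field, as in the struct quoted above. */
struct flags {
    unsigned int two_bits : 2;
};

/* Values stored in an unsigned bit-field are reduced modulo 2^width,
   so only the low two bits of 'value' survive the round trip. */
unsigned int store_in_two_bits(unsigned int value)
{
    struct flags f;
    f.two_bits = value;
    return f.two_bits;
}

/* Largest width usable for an unsigned int bit-field, in bits
   (assuming no padding bits in unsigned int). */
size_t max_unsigned_bitfield_width(void)
{
    return sizeof(unsigned int) * CHAR_BIT;
}
```

On an implementation with 16-bit int the second function returns 16, which is why a 20-bit field declared as plain unsigned would be rejected there.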

-Arthur

Nov 13 '05 #128

Richard Heathfield

Ron Natalie wrote:

>> Firstly they're not obliged to. Its worth noting that the C Standard
>> only requires compilers to complain about syntax errors and a couple
>> of other pretty major things.
>
>This is, in fact, one of them.
Sure about that?

Yep. The compiler must complain about the syntax and diagnosable
semantic errors unless the standard says no diagnostic is required.

That's not what the Standard says, though.

"5.1.1.3 Diagnostics
1 A conforming implementation shall produce at least one diagnostic message
(identified in an implementation-defined manner) if a preprocessing
translation unit or translation unit contains a violation of any syntax
rule or constraint, even if the behavior is also explicitly specified as
undefined or implementation-defined. Diagnostic messages need not be
produced in other circumstances.8)"

Nothing in there about diagnosable semantic errors.

The standard says main "SHALL" return int. This means the program
is not well-formed if it doesn't, and since there's no exemption clause,
it needs a diagnostic.

No, it doesn't. It's wrong, but the compiler is under no obligation to
diagnose it.

(BTW the rules are different for C++, and that might be what's thrown you.)

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #129

Richard Heathfield

Christian Bau wrote:

In article <3f******@news2.power.net.uk>,
Richard Heathfield <in*****@address.co.uk.invalid> wrote:
I occasionally conduct recruitment interviews for C programmers. At such
interviews, I always ask the programmer what main() returns. If he gets
the answer wrong, which does sometimes happen, it is unlikely that he
will get the job.

On the other hand, imagine you are in a job interview and you are asked:
What will this statement do?

i = 3;
a [i++] = i;

I recommend that you answer: It could store the number 3 or the number 4
into a [3]. Anything beyond that and you might confuse the interviewer.

I recommend that you answer: the second statement violates a "shall" outside
a constraint clause (specifically 6.5(2)), so no diagnostic is required,
but the behaviour is undefined, and the outcome is therefore unpredictable.
If that confuses the interviewer, so be it.

(These two possibilities alone would be enough to make you avoid a
statement like this, obviously. But don't mention that your program
could crash or worse because of this).

That could work against you, if the interviewer knows the language. And if
he doesn't know the language, he shouldn't be interviewing you anyway.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #130

Richard Heathfield

Mark McIntyre wrote:

On Thu, 24 Jul 2003 08:03:37 +0000, in comp.lang.c , Richard
Heathfield <in*****@address.co.uk.invalid> wrote:
Is it really? Quite a few people have looked for Microsoft's C
documentation of the effect of void main. I don't think anyone has found
it yet.

(and someone responded with a quote from the MSVC 6 helpfile,
mentioning void main() but I managed to lose the post)

I was about to say "interesting, so MS finally documented it" but then
I noticed that this section begins "A special function called main is
the entry point to all C++ programs".

They've "fixed" this in Version 7, but it matters not, since Version 7
doesn't claim to be C99-conforming anyway.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #131

Horst von Brand

"Falcon Kirtarania" <cm****@shaw.ca> writes:

Really, all it comes down to is that unless you really don't give a shit
what is in EAX after your program executes,

My usual work machine has no EAX of any sort...

you damn well better return int.

_You_ (the programmer) might not care, the implementation of C you are
using (or will be using some day in the future, after you thoroughly
forgot about the issue, or (even worse) you will be using to run
programs written by some moron 15 years back) might very well care a lot.

Theoretically, within standard, it could simply not set EAX on returning
void as it does for everything else. Then you would end up with return
codes that are pseudorandom numbers.

There you are talking about _one_ way a _particular_ compiler for a
_certain_ architecture _might_ do things today within the standard.
Fine as long as you don't care about ever changing anything in this
equation. But that is true only of throwaway programs, why bother with
C then? Do it in Perl, Python, ...
--
Dr. Horst H. von Brand User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria +56 32 654239
Casilla 110-V, Valparaiso, Chile Fax: +56 32 797513

Nov 13 '05 #132

Louis DeFiore

And what? Make these errors on gcc instead? Not that I have any great
love for Microsoft, but that's not the problem here.

Mr. Berserker wrote:

<snip>

Just stop using shit M$ products and grab gcc...

Nov 13 '05 #133

goose

"Alexander Grigoriev" <al***@earthlink.net> wrote in message news:<O5**************@TK2MSFTNGP11.phx.gbl>...

My favorite example of a crappy compiler is GNU ARM C/C++ compiler. We've
had so many problems with it, and it's slow as molasses.
One time, for example, it just didn't complain about a global pointer,
_defined_ in multiple compilation units, and _twice_ defined in one of them,
for example:

struct A * pa;

struct A * pa=& a;

I don't think that that is an error.
-----------------------------------
[LManickum@lee] Tue Jul 29 10:46:42 [1 bg] /usr/src/hw.c
36 ok cat hw.c
#include <stdio.h>
#include <stdlib.h>

struct A {
char *a;
};

static struct A a;
struct A *pa;
struct A *pa = &a;

int main (void) {
a.a = "Hello World\n";
printf ("%s\n", pa->a);
return EXIT_SUCCESS;
}
[LManickum@lee] Tue Jul 29 10:46:44 [1 bg] /usr/src/hw.c
37 ok make
gcc.exe -c -ansi -W -Wall -pedantic -ggdb -c -o hw.o hw.c
gcc.exe -o hw.exe hw.o
[LManickum@lee] Tue Jul 29 10:46:49 [1 bg] /usr/src/hw.c
38 ok ./hw.exe
Hello World
-----------------------------------

see ?

Second definition just didn't have any effect,

that is just plain wrong.

without any diagnostics. Of
course, the first definition should have been a declaration, with 'extern',
but the compiler didn't give any help to find it.

When I've tried MS eVC ARM C, I thought it's godsent.

goose,
hth

Nov 13 '05 #134

goose

Richard Heathfield <do******@address.co.uk.invalid> wrote in message news:<bf**********@sparta.btinternet.com>...

[Followups set to comp.lang.c]

Falcon Kirtarania wrote:
Theoretically, any undefined behavior should return an error (as in, the
standard should change). This might help solve the troubles of awful
programming.

Feel free to write a C compiler which diagnoses any and all instances of
undefined behaviour. Consider whether it should generate a diagnostic for
this code. If so, why? If not, why not?

int foo(int *a, int *b)
{
return *a = *b++;
}

<snip>

could it be that the reason for the addition of restrict in c99
was so that programmers can hint to the compiler that in the
above case there will definitely be no UB ?

in the foo() example a helpful c99 compiler might generate a warning
along the lines of
warning: in function foo: possible UB on line xx.

and adding in the "restrict" to the arguments then makes this
warning go away ?

goose,
just speculating, flame away ;-)

Nov 13 '05 #135

Falcon Kirtarania

You might try putting it over a Z80 running on a 233MHz clock.

"Ben Taylor" <be***********@yahoo.co.uk> wrote in message
news:OU**************@TK2MSFTNGP12.phx.gbl...

can you smoke beef?
how?

"Charlie Gibbs" <cg****@kltpzyxm.invalid> wrote in message
news:12*******************@kltpzyxm.invalid...
In article <bf**********@oravannahka.helsinki.fi> pa*****@cc.helsinki.fi
(Joona I Palaste) writes:
Emmanuel Delahaye <em**********@noos.fr> scribbled the following
on comp.lang.c:

> In 'comp.lang.c', ob***@web.de (Oliver Brausch) wrote:
>
>> have you ever heard about this MS-visual c compiler bug?
>> look at the small prog:

<snip>
>> return ++x;

No, there is nothing wrong with "return ++x;" by itself. Not a missing
sequence point, anyway. Others have already explained the OP's problems
to him: (1) He's using void main(), undefined behaviour. (2) He's
calling bit32() twice without an intervening sequence point,
unspecified behaviour.

A little voice inside my head starts screaming "Side effects!"
whenever I see something like this. My solution is simple:
don't do it.
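
A sketch of what "don't do it" can look like for this program, assuming bit32() is written with `return ++x;` as quoted above. The name bit64_ordered is made up here. Each call sits in its own statement, so the order of the two calls no longer depends on how the compiler chooses to evaluate the operands of +.

```c
static int x = 0;

int bit32(void)
{
    return ++x;
}

/* Hypothetical rewrite of bit64(): the calls are sequenced explicitly,
   first the low part, then the part that gets shifted. */
int bit64_ordered(void)
{
    int first = bit32();
    int second = bit32();
    return first + (second << 1);
}
```

With x starting at 0, the first call works out to 1 + (2 << 1) = 5 and the second to 3 + (4 << 1) = 11, whatever the optimisation level.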
"You can pick your friends, you can pick your nose, but you can't pick
your relatives."
- MAD Magazine

"You can smoke beef, and you can smoke hash, but you can't smoke
corned beef hash." -- National Lampoon

--
/~\ cg****@kltpzyxm.invalid (Charlie Gibbs)
\ / I'm really at ac.dekanfrus if you read it the right way.
X Top-posted messages will probably be ignored. See RFC1855.
/ \ HTML will DEFINITELY be ignored. Join the ASCII ribbon campaign!

Nov 13 '05 #136

Kelsey Bjarnason

[snips]

On Mon, 28 Jul 2003 06:35:21 +0000, Richard Heathfield wrote:

The consensus of comp.std.c,
was that a program which calls a clear screen function
exhibits behavior which is not defined by the standard,
and which is also not considered to be undefined behavior.
I don't understand the importance of the distinction.

I believe you're right about the csc consensus. I'm afraid I am just as much
in the dark over the difference between undefined behaviour and behaviour
which is not defined. I think I'm correct in saying that the committee sees
the two words "undefined behaviour" as being a key term with a particular
meaning, the meaning being, of course, the "this Standard imposes no
requirements" thing. But since the Standard imposes no requirements on
extensions, either, I continue to fail to see the distinction.

Apart from the obvious: one involves presumably correct operation outside
the scope of the standard, the other denotes very likely incorrect
behaviour.

int x = SomeWinAPICall(); is quite possibly correct, but beyond C's
ability to say much; i = ++i + i++; on the other hand...

Nov 13 '05 #137

Kevin Easton

Kelsey Bjarnason <ke*****@xxnospamyy.lightspeed.bc.ca> wrote:

[snips]

On Mon, 28 Jul 2003 06:35:21 +0000, Richard Heathfield wrote:
The consensus of comp.std.c,
was that a program which calls a clear screen function
exhibits behavior which is not defined by the standard,
and which is also not considered to be undefined behavior.
I don't understand the importance of the distinction.

I believe you're right about the csc consensus. I'm afraid I am just as much
in the dark over the difference between undefined behaviour and behaviour
which is not defined. I think I'm correct in saying that the committee sees
the two words "undefined behaviour" as being a key term with a particular
meaning, the meaning being, of course, the "this Standard imposes no
requirements" thing. But since the Standard imposes no requirements on
extensions, either, I continue to fail to see the distinction.

Apart from the obvious: one involves presumably correct operation outside
the scope of the standard, the other denotes very likely incorrect
behaviour.

int x = SomeWinAPICall(); is quite possibly correct, but beyond C's
ability to say much; i = ++i + i++; on the other hand...

An implementation could describe some "correct" behaviour of i = ++i + i++,
if it wanted to, in the same way it can describe the "correct" behaviour
of SomeWinAPICall(), couldn't it?

- Kevin.

Nov 13 '05 #138

Richard Heathfield

Kelsey Bjarnason wrote:

[snips]

On Mon, 28 Jul 2003 06:35:21 +0000, Richard Heathfield wrote:
The consensus of comp.std.c,
was that a program which calls a clear screen function
exhibits behavior which is not defined by the standard,
and which is also not considered to be undefined behavior.
I don't understand the importance of the distinction.

I believe you're right about the csc consensus. I'm afraid I am just as
much in the dark over the difference between undefined behaviour and
behaviour which is not defined. I think I'm correct in saying that the
committee sees the two words "undefined behaviour" as being a key term
with a particular meaning, the meaning being, of course, the "this
Standard imposes no requirements" thing. But since the Standard imposes
no requirements on extensions, either, I continue to fail to see the
distinction.

Apart from the obvious: one involves presumably correct operation outside
the scope of the standard, the other denotes very likely incorrect
behaviour.

It is not at all obvious to me that the distinction exists.

int x = SomeWinAPICall(); is quite possibly correct, but beyond C's
ability to say much; i = ++i + i++; on the other hand...

i = ++i + i++ is exactly as "quite possibly correct" as SomeWinAPICall(), in
terms of what the Standard says about them both. Any implementation is free
to make either of them do a certain thing, and document that thing. And of
course any implementation is free to do the nasal demon thing with either
of them.

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton

Nov 13 '05 #139

Joona I Palaste

Richard Heathfield <do******@address.co.uk.invalid> scribbled the following:

Kelsey Bjarnason wrote:
int x = SomeWinAPICall(); is quite possibly correct, but beyond C's
ability to say much; i = ++i + i++; on the other hand...
i = ++i + i++ is exactly as "quite possibly correct" as SomeWinAPICall(), in
terms of what the Standard says about them both. Any implementation is free
to make either of them do a certain thing, and document that thing. And of
course any implementation is free to do the nasal demon thing with either
of them.

Does the C standard anywhere say that i = ++i + i++; may do something
that SomeWinAPICall() may not, or the other way around? If not, then I
agree with Richard Heathfield, both are equally undefined behaviour.

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ---------------------------\
| Kingpriest of "The Flying Lemon Tree" G++ FR FW+ M- #108 D+ ADA N+++|
| http://www.helsinki.fi/~palaste W++ B OP+ |
\----------------------------------------- Finland rules! ------------/
"The trouble with the French is they don't have a word for entrepreneur."
- George Bush

Nov 13 '05 #140

Dave Thompson

On Sun, 27 Jul 2003 23:28:34 +0200, "jacob navia"
<ja*********@jacob.remcomp.fr> wrote:

Dumb answer: Are there any classes of UB that *are* feasible
to detect at compile-time or runtime? 'foo main', where 'foo'!='int',
looks easy to me. I can't think of any others off the top of my
head.
Pointers

The set of all pointers in the program is initialized at startup. They are either NULL or they point
to valid addresses, established in the raw data of the program. For instance:

int a,*pint = &a;

The set of valid addresses is established by the compiler at startup. When control arrives at main()
all pointers are valid.

Not startup. Pointer variables (objects) can be static (including all
file scope) in which case the objects exist at startup; or local,
created when the block (usually function) is entered, even
recursively, and "vanish" when it exits; or dynamic, created and
released as directed by program code. Static pointer variables are
initialized, to null by default or to the stated initializer, but it
is not necessarily obvious if the value is valid, consider:
extern int a [ /* bound not given, defined in other t.u. */ ];
int * p = &a[37]; /* have to look there to validate this */
Automatic pointer variables are indeterminate if not initialized, and
dynamic pointer variables always indeterminate until assigned unless
calloc is used and all-bits-zero is valid, as it is on many machines.
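
The three cases can be sketched in a few lines of C. This is only an illustration; target, p_default, p_init, read_through_local and make_node are names invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int target;             /* static object: exists for the whole run */
static int *p_default;         /* static pointer, no initializer: starts out null */
static int *p_init = &target;  /* static pointer with an explicit initializer */

struct node {
    struct node *next;
    int value;
};

/* Static case: checks the two values the language guarantees at startup. */
void check_static_pointers(void)
{
    assert(p_default == NULL);
    assert(p_init == &target);
}

/* Automatic case: a local pointer is indeterminate unless it is
   initialized, so it is given a value before it is ever read. */
int read_through_local(void)
{
    int local = 42;
    int *p = &local;
    return *p;
}

/* Dynamic case: malloc'ed storage is indeterminate, so every member is
   assigned before anyone reads it. Returns NULL if allocation fails. */
struct node *make_node(int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->next = NULL;   /* explicit null, no reliance on all-bits-zero */
    n->value = value;
    return n;
}
```

Assigning NULL explicitly in make_node avoids depending on calloc and on all-bits-zero being a null pointer on the machine at hand.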

The set of valid addresses = pointer values, or equivalently of
objects that can validly be pointed to, similarly varies: it
includes static variables, local auto but not register variables, and
dynamically allocated aka heap space; the latter two have lifetimes
shorter (often much shorter) than the whole program execution.
Undefined behavior (as far as pointers are concerned) occurs when any pointer is used that
1) Has not been initialized to point to an existing valid object.
or
2) Is NULL or points somewhere else than

Object start address <= p < (start address)+sizeof(Object)
Assuming by 'used' you mean dereferenced, almost; in the case of
pointing within a composite (array or struct) object, we further need
the pointer to be to a valid subobject (element or field), or any
character=byte. For a pointer (value) that is only computed, stored,
passed, returned, etc., we also allow == startaddr + size.

Note that 'existing' requires the object existed when the address was
taken, and the *same* object *still* exists when the pointer is used;
this is trivial for a static target, but not for the other durations.
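
The "== startaddr + size" allowance is what makes the usual begin/end loop legal. A minimal sketch, with sum_ints as a made-up name: the end pointer is computed and compared but never dereferenced.

```c
#include <assert.h>
#include <stddef.h>

/* Sums the n ints starting at a. */
long sum_ints(const int *a, size_t n)
{
    assert(a != NULL || n == 0);

    /* One past the last element: valid to compute and to compare
       against, but it must not be dereferenced. */
    const int *end = a + n;
    long total = 0;

    for (const int *p = a; p != end; ++p)
        total += *p;   /* p is strictly inside the array here */
    return total;
}
```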
We can distinguish two types of pointers:

A) Unbounded pointers, i.e. pointers where the calculation of sizeof(Object) is impossible
B) Bounded pointers where sizeof(Object) is known and can be checked at run time.

Detecting this class of UB is called bounds checking and is done in many languages.
C is notoriously lacking this facility. Worse, the machine is not used to automatically test the
programmer's assumptions and all pointers are considered unbounded.
'Bounds checking' usually means checking of subscripts for arrays,
which are in (most) compiled languages the only actual objects that
can have differing sizes at runtime. C99 'flexible' structs are
stored in, and can access, differing space, but the type itself has
the fixed size, and layout, of the fixed part, only the array part is
variable. Various polymorphic pointers and references can point to
different types of object with different sizes, but each actual type
has a fixed size. Pascal variant records can have different sizes,
but each variant still has a fixed layout. C does not actually
prohibit runtime checking, but it does largely prevent the
optimizations that make it (mostly) practical for other languages.
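
To make the 'flexible' struct point concrete, here is a sketch in which the program stores the element count itself and checks against it by hand, since the language will not. The names packet, packet_new and packet_get are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct packet {
    size_t count;   /* fixed part: number of elements allocated below */
    double data[];  /* C99 flexible array member: the variable part */
};

/* Allocates a packet with room for count elements, all set to 0.0.
   Returns NULL if allocation fails. */
struct packet *packet_new(size_t count)
{
    struct packet *p = malloc(sizeof *p + count * sizeof p->data[0]);
    if (p == NULL)
        return NULL;
    p->count = count;
    for (size_t i = 0; i < count; ++i)
        p->data[i] = 0.0;
    return p;
}

/* Checked read: stores element i in *out and returns 0,
   or returns -1 when i is out of range. */
int packet_get(const struct packet *p, size_t i, double *out)
{
    assert(p != NULL && out != NULL);
    if (i >= p->count)
        return -1;
    *out = p->data[i];
    return 0;
}
```

The check in packet_get is ordinary program code; nothing in the language ties count to the storage actually allocated.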
Lisp, APL, and many other languages check array accesses and avoid memory corruption. C doesn't, and
we are plagued by memory corruption and obscure bugs.

An improvement would be to encourage the automatic checking of object accesses and to discourage the
use of unbounded pointers. Instead of writing:

void matmult(int n, int m, double *pmat);

we would write:

void matmult(int n, int m, double mat[n][m]);

Such a prototype would make it possible to check in the calling program that the buffer passed has
as much space as declared, and in the called function to check that no index is being misused.
Yes. And I believe you realize, but other readers may not, that this
syntax is standard in C99, but bounds checking (still) isn't.

(I would expect a function named matmult to multiply matrices, and
that requires three matrix arguments, or two and one return "value",
which of course in C must be done by returning a pointer to allocated
or reused memory or by wrapping a fixed-size array in a struct.)
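
A three-matrix version in the C99 syntax might look like the sketch below. The signature is an assumption made for illustration, and it needs a compiler that supports variable length arrays. The declared dimensions only tell the compiler how to index; no access is checked.

```c
#include <assert.h>
#include <stddef.h>

/* Computes c = a * b, where a is n x m, b is m x p and c is n x p.
   The caller supplies the result matrix, so nothing is allocated here. */
void matmult(size_t n, size_t m, size_t p,
             double a[n][m], double b[m][p], double c[n][p])
{
    /* Sanity check only: array dimensions must be nonzero. */
    assert(n > 0 && m > 0 && p > 0);

    for (size_t i = 0; i < n; ++i) {
        for (size_t k = 0; k < p; ++k) {
            double sum = 0.0;
            for (size_t j = 0; j < m; ++j)
                sum += a[i][j] * b[j][k];
            c[i][k] = sum;
        }
    }
}
```

The shared m and p express the interdependence between the shapes, but the caller remains free to pass values that do not match the arrays actually supplied.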

In APL it is conventional to do whole-array operations whenever
reasonable, so fewer subscripts need actually be checked. This is
also supported and encouraged in HPF/Fortran>=90 and PL/I.
Most of this attitude comes because the Pascal language has this facility, and many C people see
Pascal as something quite horrible.
Pascal does not have quite the semantics you show; it (optionally) has
a form which binds(!) each bound (and in Pascal both lower and upper
bounds are specified, not just upper as in C) to an array formal:
procedure mat_something (mat: array [a:b, c:d] of real);
which can then be called with any 2-D array of real, and a,b,c,d used
(if necessary) to obtain the actual bound values. I'm not sure if you
can have multiple formals with the same variable-size (or as they call
it 'conformant') array type, and I'm pretty sure you can't have
multiple formals with interdependent types, as e.g. actual matmult
needs: if A is [Xrange, Yrange], B must be [Yrange, Zrange] for some
Zrange, and result is [Xrange, Zrange].

Fortran has (since F77 IIRC) the form you describe, except slightly
different syntax, and since F90 also the equivalent of the Pascal form
(assumed-shape) except that the bounds are unnamed and instead
obtained using special builtin functions. I will call that 'attached'
since the bounds are carried with the array argument not written
separately, and your form 'separate'. I'm pretty sure PL/I has both.
Ada has attached (unconstrained array) and AFAICT prevents separate
except by instantiating a generic (with a different syntax).
I think that was a good feature of Pascal. I miss it in C, and every day I see the consequences:
array overruns, obscure bugs, and many other problems. Encouraging the use of bounded pointers would
introduce some hygiene, wouldn't it?

Nobody is proposing to ban unbounded pointers. They should remain for special uses or in old
software. Encouraging the use of bounded pointers will slowly make them less and less frequent,
that's all.

The sizeof calculation is very problematic in C because the standards committee refuses to pass this
information in array prototypes. Of course this has historical reasons, but I just do not understand
why, in 2003, we still want to save the few machine cycles that it would cost, instead of sparing
users and programmers the stack overruns, memory corruption and other problems!

In C, an array decays to an unbounded pointer when passed to a subroutine. None of the sizeof
information is passed along. This is (maybe) efficient, but it is a problem for checking the bounds
of array indexes!

You can't pass bounds for arrays only and not pointers, because array
parameters *are* pointers; this is fundamental to the language, and
cannot be changed while still calling the result C. An implementation
can compute, store and pass (and check) bounds for *all* pointers, at
extra cost which most people apparently consider too high. You might
also want to store and check bounds for C99 'flexible' structs, either
hidden in the struct or in (all!) pointer-to-struct.
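
What such an implementation would do behind the scenes can be approximated by hand in plain C with a small struct. This is only a sketch: bounded_dptr, BOUNDED_FROM_ARRAY, bounded_at and bounded_slice are hypothetical names, not part of any standard or existing library.

```c
#include <assert.h>
#include <stddef.h>

/* A "fat" pointer: the address plus the number of elements behind it. */
typedef struct {
    double *base;   /* first element */
    size_t  count;  /* number of elements reachable from base */
} bounded_dptr;

/* Builds a bounded pointer from a true array (not from a plain pointer,
   for which sizeof would give the size of the pointer itself). */
#define BOUNDED_FROM_ARRAY(arr) \
    ((bounded_dptr){ (arr), sizeof (arr) / sizeof (arr)[0] })

/* Returns a pointer to element i, or NULL when i is out of bounds. */
double *bounded_at(bounded_dptr p, size_t i)
{
    return (i < p.count) ? p.base + i : NULL;
}

/* Narrows p to the sub-range [first, first + n); the bound travels along. */
bounded_dptr bounded_slice(bounded_dptr p, size_t first, size_t n)
{
    assert(first <= p.count && n <= p.count - first);
    bounded_dptr s = { p.base + first, n };
    return s;
}
```

Every access pays for a comparison and every pointer doubles in size, which is the extra cost referred to above.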

Or, you could design a calling sequence which passes bounds for
'array' arguments, similar to the way most Fortrans pass a 'hidden'
length of (there fixed-size) character-string arguments, but not keep
bounds in explicit pointers, so where a pointer has passed through a
variable or out-of-line function, and probably certain kinds of
conversions/casts, the bound information would become 'unknown --
can't check this one'. And I suspect so much C code does (and would
still do) this as to make the feature worth little, though legal.

In C++ you can create a bounded_ptr (probably template) which
keeps/passes (and checks) a bound in exactly the cases you specify a
bounded_ptr, and no others, and I think, without actually working out
the details, that you can have the (visible) code optimize away in
cases determined at compile time. Or you can write a template
function that deduces the bound of an array parameter where known, or
uses 'unknown' for a pointer; to avoid code bloat you could (and I
would) have it inline into code that passes the bound explicitly to a
shared implementation, but then AFAICT you can't prevent a caller from
cheating and directly specifying a wrong bound.

- David.Thompson1 at worldnet.att.net

Nov 13 '05 #141
