By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
425,625 Members | 1,283 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 425,625 IT Pros & Developers. It's quick & easy.

Does C implement the first C compiler itself?

P: n/a
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?

Feb 14 '07 #1
Share this Question
Share on Google+
30 Replies


P: n/a
lovecreatesbea...@gmail.com wrote:
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
Put some deductive reasoning/logic into it.
Feb 14 '07 #2

P: n/a
Christopher Layne wrote:
>lovecreatesbea...@gmail.com wrote:
>K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
Read "The Development of the C Language" at Dennis Ritchie home page (
http://cm.bell-labs.com/who/dmr )
>Put some deductive reasoning/logic into it.
[OT] That may lead you to logically correct, but nevertheless false
conclusions. The first version of an Algol-like system programming
language for Burroughs mainframes (B5000?) was written in itself.
It was "hand compiled" into assembler to get the first running
version, and it was self-sustained from there.
Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Feb 14 '07 #3

P: n/a
Roberto Waltman wrote:
[OT] That may lead you to logically correct, but nevertheless false
conclusions. The first version of an Algol-like system programming
language for Burroughs mainframes (B5000?) was written in itself.
It was "hand compiled" into assembler to get the first running
version, and it was self-sustained from there.
You just contradicted yourself. That was my point.

The first version was written in another language.
Feb 14 '07 #4

P: n/a
On Feb 14, 10:09 am, Christopher Layne <cla...@com.anodizedwrote:
Roberto Waltman wrote:
[OT] That may lead you to logically correct, but nevertheless false
conclusions. The first version of an Algol-like system programming
language for Burroughs mainframes (B5000?) was written in itself.
It was "hand compiled" into assembler to get the first running
version, and it was self-sustained from there.

You just contradicted yourself. That was my point.

The first version was written in another language.

Not so.

The compiler was _written_ in the target language. It was
_translated_ to assembly.

This is a very common process referred to as "bootstrapping" a
compiler. It's a very useful exercise, in that it proves (or
disproves) the utility of the language with a non-trivial application,
and provides an immediate base of test code. Compilers tend to
require a wide range of algorithmic techniques and a variety of data
structures. If your language can implement a compiler, it's likely to
be useful for most other tasks as well.

Though these days, the intermediate language would more likely be C
than assembly, the process is essentially unchanged.

I guess you could argue that the first version of the compiler was
wetware writ by the finger of God, but that probably isn't very
helpful.

Regards,

-=Dave

Feb 14 '07 #5

P: n/a
"lovecreatesbea...@gmail.com" <lo***************@gmail.comwrote in message
news:11*********************@k78g2000cwa.googlegro ups.com...
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
Surprisingly, I understand the thought process behind this frivolous
question.

Here is another related question:

If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?

(This is no more and no less frivolous than the original.)
--
David T. Ashley (dt*@e3ft.com)
http://www.e3ft.com (Consulting Home Page)
http://www.dtashley.com (Personal Home Page)
http://gpl.e3ft.com (GPL Publications and Projects)
Feb 14 '07 #6

P: n/a
"David T. Ashley" <dt*@e3ft.comwrites:
>If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?
But there's no guarantees that there's optimizations to be made...

--
Chris.
Feb 14 '07 #7

P: n/a

lovecreatesbea...@gmail.com wrote:
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
You may be interested in the following link:

<http://cm.bell-labs.com/cm/cs/who/dmr/chist.html>

Feb 14 '07 #8

P: n/a
"David T. Ashley" <dt*@e3ft.comwrites:
"lovecreatesbea...@gmail.com" <lo***************@gmail.comwrote in message
news:11*********************@k78g2000cwa.googlegro ups.com...
>K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?

Surprisingly, I understand the thought process behind this frivolous
question.

Here is another related question:

If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?
No. An optimizer (if it's working properly) makes the target code run
more quickly; it doesn't change its behavior.

Suppose you have C sources for two C compilers, compiler1 and
compiler2. compiler1 includes a very clever optimizer; compiler2
doesn't. Use compiler2 to compile itself, yielding compiler2a.exe,
which compiles slowly. Use compiler1 to compile compiler2, yielding
compiler2b.exe, which compiles more quickly. But compiler2a.exe and
compiler2b.exe work the same way; the code they generate is identical.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 14 '07 #9

P: n/a
"David T. Ashley" wrote:
>"lovecreatesbea...@gmail.com" wrote in message
>K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?

Surprisingly, I understand the thought process behind this frivolous
question.
Why frivolous?
>Here is another related question:

If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?
[OT] No, (at least not as worded) A simple test that every
self-compiling compiler should pass, is that when compiling its own
sources it should produce exact duplicates of itself.

If you then change the sources improving the optimization algorithms,
the compiler should produce a faster and/or smaller version of itself
reaching again a steady state after a few iterations.

To expect the cycle to continue for ever is akin to expect a file
compression algorithm to always reduce a file's size, which leads to
the conclusion that all files can be compressed into a single bit.

I could not find a reference now, but I recall reading about Niklaus
Wirth using this process to evaluate optimizations for his Pascal
compilers.

An optimization that would result in a faster or smaller compiler was
integrated in the code, but increasing the complexity of the compiler
with additional optimization steps that could improve code in some
marginal cases, but would not have a significant effect in the Pascal
compiler itself, was not considered a worthwhile trade-off.

Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Feb 14 '07 #10

P: n/a
On Feb 14, 11:38 am, "David T. Ashley" <d...@e3ft.comwrote:
[...]
If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?
Umm. The generated code should always be the same. Unless there's a
bug in the optimizer. Optimization is easy if you don't have to get
the right answer.

But you reminded me of something I heard years ago: It is said that
every program contains at least one bug, and can be shortened by one
instruction. Therefore, by induction, every program can be reduced to
a single instruction, but it will be wrong.

Regards,

-=Dave

Feb 14 '07 #11

P: n/a
"lovecreatesbea...@gmail.com" <lo***************@gmail.comwrote in message
news:11*********************@k78g2000cwa.googlegro ups.com...
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
That would leave you with a rather serious chicken-and-egg problem.

The very, very first C compiler was written in B. It was subsequently
rewritten in C (which wasn't a major change) and recompiled with the first
compiler.

DMR has a web site that explains the history of how C (and its original
compiler) evolved into existence.

S

--
Stephen Sprunk "Those people who think they know everything
CCIE #3723 are a great annoyance to those of us who do."
K5SSS --Isaac Asimov

--
Posted via a free Usenet account from http://www.teranews.com

Feb 14 '07 #12

P: n/a
Dave Hansen wrote, On 14/02/07 19:51:
On Feb 14, 11:38 am, "David T. Ashley" <d...@e3ft.comwrote:
[...]
>If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?

Umm. The generated code should always be the same. Unless there's a
bug in the optimizer. Optimization is easy if you don't have to get
the right answer.
Indeed.
But you reminded me of something I heard years ago: It is said that
every program contains at least one bug, and can be shortened by one
instruction. Therefore, by induction, every program can be reduced to
a single instruction, but it will be wrong.
The versions I have heard have an additional clause which removes the
problem. The additional clause being something like, "of significant
complexity." It is important because of useful programs such as:

int main()
{
return 0;
}

Note, I've deliberately not specified a void parameter list to make it
compatible with what I know of pre-standard C as well as C89 & C99.
--
Flash Gordon
Feb 14 '07 #13

P: n/a
On Feb 14, 6:45 pm, Roberto Waltman <use...@rwaltman.netwrote:
"David T. Ashley" wrote:
[OT] No, (at least not as worded) A simple test that every
self-compiling compiler should pass, is that when compiling its own
sources it should produce exact duplicates of itself.
Not necessarily.

The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.

Now assume that your compiler A has a function codegen (), and when
the compiler wishes to generate code for two function calls in
unspecified order, with the results discarded, it calls

codegen (right_tree) + codegen (left_tree)

Lets say we compile this with a compiler X, which calls the left
function first. So the assembler code in the compiled code is

codegen (right_tree)
codegen (left_tree)

Compile the compiler using the compiler compiled by X. The assembler
code will now be

codegen (left_tree)
codegen (right_tree).

Compile the compiler with the generated code. The assembler code will
be

codegen (right_tree)
codegen (left_tree)

and so on. All the even versions are different from all the odd
versions.

Feb 14 '07 #14

P: n/a
"christian.bau" wrote:
>Roberto Waltman wrote:
>[OT] No, (at least not as worded) A simple test that every
self-compiling compiler should pass, is that when compiling its own
sources it should produce exact duplicates of itself.

Not necessarily.

The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.

Now assume that your compiler A has a function codegen (), and when
the compiler wishes to generate code for two function calls in
unspecified order, with the results discarded, it calls

codegen (right_tree) + codegen (left_tree)

Lets say we compile this with a compiler X, which calls the left
function first. So the assembler code in the compiled code is

codegen (right_tree)
codegen (left_tree)

Compile the compiler using the compiler compiled by X. The assembler
code will now be

codegen (left_tree)
codegen (right_tree).

Compile the compiler with the generated code. The assembler code will
be

codegen (right_tree)
codegen (left_tree)

and so on. All the even versions are different from all the odd
versions.
OK, I stand corrected. So the generated compilers fall into a cycle,
of length 2 in this case, (easily expanded to more.)
Is this always the case? (To have cyclic versions, that is.)

Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Feb 14 '07 #15

P: n/a

"lovecreatesbea...@gmail.com" <lo***************@gmail.comwrote in message
news
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
Yes.
So how did it compile itself? The usual way is to write a little
interpreter. Interpreters are far easier to write than compilers. See my
homepage for one in a single reasonable-length file.

The interpreter runs the compiler source on itself and creates the first
native compiler.
Feb 14 '07 #16

P: n/a
"Malcolm McLean" <re*******@btinternet.comwrites:
"lovecreatesbea...@gmail.com" <lo***************@gmail.comwrote in message
news
>K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
Yes.
No, it doesn't say that. Somebody in this thread said that the first
C compiler was written in B. I just scanned
<http://cm.bell-labs.com/cm/cs/who/dmr/chist.htmlfor the word
"compiler"; I didn't see a clear statement to that effect. It appears
that there was an evolutionary path from B to C, with "NB" (New B)
somewhere in the middle. I think that the compiler and the language
evolved in parallel, with new language features being implemented by
the compiler, then used in the next version of the compiler.
So how did it compile itself? The usual way is to write a little
interpreter. Interpreters are far easier to write than compilers. See my
homepage for one in a single reasonable-length file.

The interpreter runs the compiler source on itself and creates the first
native compiler.
It could have been done that way, but as far as I can tell it wasn't.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 15 '07 #17

P: n/a
"Dave Hansen" <id**@hotmail.comwrote in message
news:11*********************@p10g2000cwp.googlegro ups.com...
On Feb 14, 11:38 am, "David T. Ashley" <d...@e3ft.comwrote:

Optimization is easy if you don't have to get
the right answer.
Regrettably, some compiler vendors have this integrated into their business
model. I was once told by tech support "Don't use optimizaton level 5 until
we get the bugs fixed".
--
David T. Ashley (dt*@e3ft.com)
http://www.e3ft.com (Consulting Home Page)
http://www.dtashley.com (Personal Home Page)
http://gpl.e3ft.com (GPL Publications and Projects)
Feb 15 '07 #18

P: n/a
Roberto Waltman wrote:
"christian.bau" wrote:
Roberto Waltman wrote:
[OT] No, (at least not as worded) A simple test that every
self-compiling compiler should pass, is that when compiling its own
sources it should produce exact duplicates of itself.
Not necessarily.

The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.

Now assume that your compiler A has a function codegen (), and when
the compiler wishes to generate code for two function calls in
unspecified order, with the results discarded, it calls

codegen (right_tree) + codegen (left_tree)

Lets say we compile this with a compiler X, which calls the left
function first. So the assembler code in the compiled code is

codegen (right_tree)
codegen (left_tree)

Compile the compiler using the compiler compiled by X. The assembler
code will now be

codegen (left_tree)
codegen (right_tree).

Compile the compiler with the generated code. The assembler code will
be

codegen (right_tree)
codegen (left_tree)

and so on. All the even versions are different from all the odd
versions.

OK, I stand corrected. So the generated compilers fall into a cycle,
of length 2 in this case, (easily expanded to more.)
Is this always the case? (To have cyclic versions, that is.)
No, this is not always the case. tinycc, for example, puts garbage in
its generated output. Its x86 long doubles are 96 bits, with only 80
value bits. When storing a long double in the generated object file,
all 96 bits are copied, but the 16 padding bits are never explicitly
initialised, and are for practical purposes completely random. You
can't even assume that one specific binary of the compiler with one
specific set of compiler options will always generate bitwise-
identical object code.

(I hope what I mean by "value bit" and "padding bit" is clear enough,
even though it does not exactly match the meanings of the same terms
as used by the standard.)

Feb 15 '07 #19

P: n/a
"Harald van D?k" wrote:
>...
No, this is not always the case. tinycc, for example, puts garbage in
its generated output. Its x86 long doubles are 96 bits, with only 80
value bits. When storing a long double in the generated object file,
all 96 bits are copied, but the 16 padding bits are never explicitly
initialised, and are for practical purposes completely random. You
can't even assume that one specific binary of the compiler with one
specific set of compiler options will always generate bitwise-
identical object code.
[OT] Thanks, after I posted my last question I realized this is
possible, (thinking of uninitialized structs padding.)
There was a recent thread in comp.software-eng about verifying that
two source files are equivalent after "pretty-printing" one.
The OP resorted to compiling them and comparing the resulting object
files, it is clear now this is not guaranteed to work.

Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Feb 15 '07 #20

P: n/a
christian.bau wrote:
The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.
I guess addition always associates from left to right in C...

But perhaps it's only relevant when you have more than two numbers being summed...
For example, a + b + c is always (a + b) + b, but the order in which a and b are evaluated is
compiler dependent... Did I understand correctly?

JJ
Feb 17 '07 #21

P: n/a
João Jerónimo wrote:
christian.bau wrote:
The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.

I guess addition always associates from left to right in C...

But perhaps it's only relevant when you have more than two numbers being summed...
For example, a + b + c is always (a + b) + b, but the order in which a and b are evaluated is
compiler dependent... Did I understand correctly?
It's a bit stronger than that.

a + b + c is always (a + b) + c, but the order in which not only a and
b but also c are evaluated is unspecified. It's possible for c to be
evaluated before, between, or after a and b.

Feb 17 '07 #22

P: n/a
Harald van Dijk wrote:
a + b + c is always (a + b) + c, but the order in which not only a and
b but also c are evaluated is unspecified. It's possible for c to be
evaluated before, between, or after a and b.
Very well... But so... it's always added after a and b are added... Right?
It wasn't what I said, anyway... But I understand that the compiler is free to generate all the code
the way it wants as far as the code behaves as expected...

JJ
Feb 17 '07 #23

P: n/a
João Jerónimo wrote:
Harald van Dijk wrote:
a + b + c is always (a + b) + c, but the order in which not only a and
b but also c are evaluated is unspecified. It's possible for c to be
evaluated before, between, or after a and b.

Very well... But so... it's always added after a and b are added... Right?
Yes, unless...
It wasn't what I said, anyway... But I understand that the compiler is free to generate all the code
the way it wants as far as the code behaves as expected...
....the compiler can prove that adding c earlier has no effect on the
result.

But -1 + INT_MAX + 1 cannot ever cause unexpected results due to
overflow, if that's what you mean.

Feb 17 '07 #24

P: n/a
João Jerónimo <j_*******@yahoo.com.brwrites:
Harald van Dijk wrote:
>a + b + c is always (a + b) + c, but the order in which not only a and
b but also c are evaluated is unspecified. It's possible for c to be
evaluated before, between, or after a and b.

Very well... But so... it's always added after a and b are added... Right?
The standard imposes a partial order on the evaluation of the
subexpressions.

The operands can be evaluated in any of 6 possible orders:

a,b,c a,c,b b,a,c b,c,a c,a,b c,b,a

The first addition must be evaluated after both a and b have been
evaluated, and the second addition must be evaluated after both the
first addition and c have been evaluated.

Or the compiler can re-order the operations arbitrarily *if* the
resulting execution always gives the same result as what's required.
It wasn't what I said, anyway... But I understand that the compiler
is free to generate all the code the way it wants as far as the code
behaves as expected...
Only if "as expected" matches the standard's requirements. Otherwise,
the result can easily violate your expectations. For example, if you
*expect*

printf("hello, ") + printf("world\n")

to print "hello, world", the compiler is free to disappoint you.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 18 '07 #25

P: n/a
In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>"David T. Ashley" <dt*@e3ft.comwrites:
>"lovecreatesbea...@gmail.com" <lo***************@gmail.comwrote in message
news:11*********************@k78g2000cwa.googlegr oups.com...
>>K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?

Surprisingly, I understand the thought process behind this frivolous
question.

Here is another related question:

If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?

No. An optimizer (if it's working properly) makes the target code run
more quickly; it doesn't change its behavior.

Suppose you have C sources for two C compilers, compiler1 and
compiler2. compiler1 includes a very clever optimizer; compiler2
doesn't. Use compiler2 to compile itself, yielding compiler2a.exe,
which compiles slowly. Use compiler1 to compile compiler2, yielding
compiler2b.exe, which compiles more quickly. But compiler2a.exe and
compiler2b.exe work the same way; the code they generate is identical.
That is incorrect. There are compilers I know that will compile for size
or speed. They generate different code depending on the flags.

Thus whilst the external functionality of both is the same the internal
behaviour is different. In some cases if you want faster code the
executable is larger because it will unwrap some calls and loops.
Register and memory usage is different.

Ps Keith I hadn't forgotten I owe you an answer to the other post on the
Safe C lib. I have to dig out the file and read it with some thought
Things are VERY hectic here including builders removing walls.
(Everything is covered in dust or in boxes or in store)
--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
/\/\/ ch***@phaedsys.org www.phaedsys.org \/\/\
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

Feb 22 '07 #26

P: n/a
Chris Hills <ch***@phaedsys.orgwrites:
In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>>"David T. Ashley" <dt*@e3ft.comwrites:
[...]
>>If one has the C source code for an optimizing compiler, the
compiles the compiler, creating a new compiler, and repeats this
process over many iterations, will the compiler executable get
smaller and smaller each time?

No. An optimizer (if it's working properly) makes the target code run
more quickly; it doesn't change its behavior.

Suppose you have C sources for two C compilers, compiler1 and
compiler2. compiler1 includes a very clever optimizer; compiler2
doesn't. Use compiler2 to compile itself, yielding compiler2a.exe,
which compiles slowly. Use compiler1 to compile compiler2, yielding
compiler2b.exe, which compiles more quickly. But compiler2a.exe and
compiler2b.exe work the same way; the code they generate is identical.
That is incorrect. There are compilers I know that will compile for
size or speed. They generate different code depending on the flags.

Thus whilst the external functionality of both is the same the
internal behaviour is different. In some cases if you want faster code
the executable is larger because it will unwrap some calls and
loops. Register and memory usage is different.
My point is that the code generated by a compiler is not (or should
not be) affected by any optimizations performed by the compiler used
to compile the compiler. The compiler itself may run more quickly
and/or use less memory, but it will generate the same code.

Let me try to make the point more clearly. Suppose I have a Pascal
compiler written in C. For simplicity, assume the Pascal compiler
source is a single C source file, "pas.c". Suppose I have a Pascal
program source file "foo.p".

(1) gcc -O0 pas.c -o pas_slow
(2) ./pas_slow foo.p -o foo0
(3) ./foo0

(4) gcc -O3 pas.c -o pas_fast
(5) ./pas_fast foo.p -o foo1
(6) ./foo1

Line 4 will likely take longer to execute than line 1, because gcc is
doing extra work. Line 5 will likely execute more quickly than line
2, because in line 5 I'm executing optimized code. But the "pas_slow"
and "pas_fast" programs (both Pascal compilers) should behave
identically (apart from their speed), which means they should produce
essentially identical code from the same Pascal source, and lines 3
and 6 should take the same time to execute (ignoring random
variations).

If pas_slow and pas_fast behave differently, in the sense of producing
different output, then the optimizer is broken, or possibly pas.c
depends on undefined behavior.
Ps Keith I hadn't forgotten I owe you an answer to the other post on
the Safe C lib. I have to dig out the file and read it with some
thought Things are VERY hectic here including builders removing
walls. (Everything is covered in dust or in boxes or in store)
Can you remind me which thread this is about (the subject header will
do)? Thanks.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 22 '07 #27

P: n/a
In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>Chris Hills <ch***@phaedsys.orgwrites:
>In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>>>"David T. Ashley" <dt*@e3ft.comwrites:
[...]
>>>If one has the C source code for an optimizing compiler, the
compiles the compiler, creating a new compiler, and repeats this
process over many iterations, will the compiler executable get
smaller and smaller each time?

No. An optimizer (if it's working properly) makes the target code run
more quickly; it doesn't change its behavior.

Suppose you have C sources for two C compilers, compiler1 and
compiler2. compiler1 includes a very clever optimizer; compiler2
doesn't. Use compiler2 to compile itself, yielding compiler2a.exe,
which compiles slowly. Use compiler1 to compile compiler2, yielding
compiler2b.exe, which compiles more quickly. But compiler2a.exe and
compiler2b.exe work the same way; the code they generate is identical.
That is incorrect. There are compilers I know that will compile for
size or speed. They generate different code depending on the flags.

Thus whilst the external functionality of both is the same the
internal behaviour is different. In some cases if you want faster code
the executable is larger because it will unwrap some calls and
loops. Register and memory usage is different.

My point is that the code generated by a compiler is not (or should
not be) affected by any optimizations performed by the compiler used
to compile the compiler. The compiler itself may run more quickly
and/or use less memory, but it will generate the same code.

Let me try to make the point more clearly. Suppose I have a Pascal
compiler written in C. For simplicity, assume the Pascal compiler
source is a single C source file, "pas.c". Suppose I have a Pascal
program source file "foo.p".

(1) gcc -O0 pas.c -o pas_slow
(2) ./pas_slow foo.p -o foo0
(3) ./foo0

(4) gcc -O3 pas.c -o pas_fast
(5) ./pas_fast foo.p -o foo1
(6) ./foo1

Line 4 will likely take longer to execute than line 1, because gcc is
doing extra work. Line 5 will likely execute more quickly than line
2, because in line 5 I'm executing optimized code. But the "pas_slow"
and "pas_fast" programs (both Pascal compilers) should behave
identically (apart from their speed), which means they should produce
essentially identical code from the same Pascal source, and lines 3
and 6 should take the same time to execute (ignoring random
variations).

If pas_slow and pas_fast behave differently, in the sense of producing
different output, then the optimizer is broken, or possibly pas.c
depends on undefined behavior.
The functionality of the code generated by the compiler will be the same
no matter what the optimisation. The size of the code and it's
composition will be different .

IF I have a C compiler and an application APP that I compile and it
generates for example 64 K of executable by compiling using
optimisation for size I might get a 54K executable. If I compiler for
speed I might get a 70K executable.

In all three cases execution times of the running app will be different
except in the places where hardware timers and interrupts are used.

Also not only will the application size be different the amount of
memory will be different.

For example some optimisations are not doing integer promotion rules,
using 8 bit enums , data overlaying
I have a feeling we are arguing at cross purposes here though.
--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
/\/\/ ch***@phaedsys.org www.phaedsys.org \/\/\
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

Feb 23 '07 #28

P: n/a
Chris Hills <ch***@phaedsys.orgwrites:
In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>>Chris Hills <ch***@phaedsys.orgwrites:
>>In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
"David T. Ashley" <dt*@e3ft.comwrites:
[...]
>>>>If one has the C source code for an optimizing compiler, the
compiles the compiler, creating a new compiler, and repeats this
process over many iterations, will the compiler executable get
smaller and smaller each time?

No. An optimizer (if it's working properly) makes the target code run
more quickly; it doesn't change its behavior.

Suppose you have C sources for two C compilers, compiler1 and
compiler2. compiler1 includes a very clever optimizer; compiler2
doesn't. Use compiler2 to compile itself, yielding compiler2a.exe,
which compiles slowly. Use compiler1 to compile compiler2, yielding
compiler2b.exe, which compiles more quickly. But compiler2a.exe and
compiler2b.exe work the same way; the code they generate is identical.

That is incorrect. There are compilers I know that will compile for
size or speed. They generate different code depending on the flags.

Thus whilst the external functionality of both is the same the
internal behaviour is different. In some cases if you want faster code
the executable is larger because it will unwrap some calls and
loops. Register and memory usage is different.

My point is that the code generated by a compiler is not (or should
not be) affected by any optimizations performed by the compiler used
to compile the compiler. The compiler itself may run more quickly
and/or use less memory, but it will generate the same code.
[snip]
The functionality of the code generated by the compiler will be the
same no matter what the optimisation. The size of the code and it's
composition will be different .
Of course.
IF I have a C compiler and an application APP that I compile and it
generates for example 64 K of executable by compiling using
optimisation for size I might get a 54K executable. If I compiler
for speed I might get a 70K executable.

In all three cases execution times of the running app will be
different except in the places where hardware timers and interrupts
are used.

Also not only will the application size be different the amount of
memory will be different.

For example some optimisations are not doing integer promotion rules,
using 8 bit enums , data overlaying
Yes, that's what optimization is all about.
I have a feeling we are arguing at cross purposes here though.
I don't know what we're arguing about at all.

The original question, which you can find up at the top of this
article, was whether repeatedly self-compiling an optimizating
compiler will cause the compiler executable to get smaller at each
iteration. The answer is no; it will get smaller the first time, but
not thereafter. I think that's what I've been consistently saying all
along. You said I was incorrect, but I think we're in agreement.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 23 '07 #29

P: n/a
In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>Chris Hills <ch***@phaedsys.orgwrites:
>In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>>>Chris Hills <ch***@phaedsys.orgwrites:
In article <ln************@nuthaus.mib.org>, Keith Thompson
<ks***@mib.orgwrites
>"David T. Ashley" <dt*@e3ft.comwrites:
[...]
>If one has the C source code for an optimizing compiler, the
>compiles the compiler, creating a new compiler, and repeats this
>process over many iterations, will the compiler executable get
>smaller and smaller each time?
>
>No. An optimizer (if it's working properly) makes the target code run
>more quickly; it doesn't change its behavior.
>
>Suppose you have C sources for two C compilers, compiler1 and
>compiler2. compiler1 includes a very clever optimizer; compiler2
>doesn't. Use compiler2 to compile itself, yielding compiler2a.exe,
>which compiles slowly. Use compiler1 to compile compiler2, yielding
>compiler2b.exe, which compiles more quickly. But compiler2a.exe and
>compiler2b.exe work the same way; the code they generate is identical.
>
That is incorrect. There are compilers I know that will compile for
size or speed. They generate different code depending on the flags.

Thus whilst the external functionality of both is the same the
internal behaviour is different. In some cases if you want faster code
the executable is larger because it will unwrap some calls and
loops. Register and memory usage is different.

My point is that the code generated by a compiler is not (or should
not be) affected by any optimizations performed by the compiler used
to compile the compiler. The compiler itself may run more quickly
and/or use less memory, but it will generate the same code.

[snip]
>The functionality of the code generated by the compiler will be the
same no matter what the optimisation. The size of the code and it's
composition will be different .

Of course.
>IF I have a C compiler and an application APP that I compile and it
generates for example 64 K of executable by compiling using
optimisation for size I might get a 54K executable. If I compiler
for speed I might get a 70K executable.

In all three cases execution times of the running app will be
different except in the places where hardware timers and interrupts
are used.

Also not only will the application size be different the amount of
memory will be different.

For example some optimisations are not doing integer promotion rules,
using 8 bit enums , data overlaying

Yes, that's what optimization is all about.
>I have a feeling we are arguing at cross purposes here though.

I don't know what we're arguing about at all.

The original question, which you can find up at the top of this
article, was whether repeatedly self-compiling an optimizating
compiler will cause the compiler executable to get smaller at each
iteration. The answer is no; it will get smaller the first time, but
not thereafter. I think that's what I've been consistently saying all
along. You said I was incorrect, but I think we're in agreement.
We are.... it's been a heavy week. I have builders extending the office
and it is almost impossible to work sensibly. I had the feeling I was
arguing at cross purposes
--
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
\/\/\/\/\ Chris Hills Staffs England /\/\/\/\/
/\/\/ ch***@phaedsys.org www.phaedsys.org \/\/\
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/

Feb 23 '07 #30

P: n/a
Keith Thompson wrote:
>
.... snip ...
>
The original question, which you can find up at the top of this
article, was whether repeatedly self-compiling an optimizating
compiler will cause the compiler executable to get smaller at
each iteration. The answer is no; it will get smaller the first
time, but not thereafter. I think that's what I've been
consistently saying all along. You said I was incorrect, but I
>think we're in agreement.
It takes at least two compiles, one with the old to produce the
more efficient output with a less efficient compiler, and then
another to produce the more efficient compiler. This assumes the
ancilliary code (libraries, etc) are not changed. The path is not
necessarily easy. I speak from experience, having been forced to
implement changes in multiple phases many times.

--
Chuck F (cbfalconer at maineline dot net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net>
Feb 23 '07 #31

This discussion thread is closed

Replies have been disabled for this discussion.