473,805 Members | 2,119 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Does C implement the first C compiler itself?

K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?

Feb 14 '07
30 2982
On Feb 14, 11:38 am, "David T. Ashley" <d...@e3ft.comw rote:
[...]
If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?
Umm. The generated code should always be the same. Unless there's a
bug in the optimizer. Optimization is easy if you don't have to get
the right answer.

But you reminded me of something I heard years ago: It is said that
every program contains at least one bug, and can be shortened by one
instruction. Therefore, by induction, every program can be reduced to
a single instruction, but it will be wrong.

Regards,

-=Dave

Feb 14 '07 #11
"lovecreatesbea ...@gmail.com" <lo************ ***@gmail.comwr ote in message
news:11******** *************@k 78g2000cwa.goog legroups.com...
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
That would leave you with a rather serious chicken-and-egg problem.

The very, very first C compiler was written in B. It was subsequently
rewritten in C (which wasn't a major change) and recompiled with the first
compiler.

DMR has a web site that explains the history of how C (and its original
compiler) evolved into existence.

S

--
Stephen Sprunk "Those people who think they know everything
CCIE #3723 are a great annoyance to those of us who do."
K5SSS --Isaac Asimov

--
Posted via a free Usenet account from http://www.teranews.com

Feb 14 '07 #12
Dave Hansen wrote, On 14/02/07 19:51:
On Feb 14, 11:38 am, "David T. Ashley" <d...@e3ft.comw rote:
[...]
>If one has the C source code for an optimizing compiler, the compiles the
compiler, creating a new compiler, and repeats this process over many
iterations, will the compiler executable get smaller and smaller each time?

Umm. The generated code should always be the same. Unless there's a
bug in the optimizer. Optimization is easy if you don't have to get
the right answer.
Indeed.
But you reminded me of something I heard years ago: It is said that
every program contains at least one bug, and can be shortened by one
instruction. Therefore, by induction, every program can be reduced to
a single instruction, but it will be wrong.
The versions I have heard have an additional clause which removes the
problem. The additional clause being something like, "of significant
complexity." It is important because of useful programs such as:

int main()
{
return 0;
}

Note, I've deliberately not specified a void parameter list to make it
compatible with what I know of pre-standard C as well as C89 & C99.
--
Flash Gordon
Feb 14 '07 #13
On Feb 14, 6:45 pm, Roberto Waltman <use...@rwaltma n.netwrote:
"David T. Ashley" wrote:
[OT] No, (at least not as worded) A simple test that every
self-compiling compiler should pass, is that when compiling its own
sources it should produce exact duplicates of itself.
Not necessarily.

The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.

Now assume that your compiler A has a function codegen (), and when
the compiler wishes to generate code for two function calls in
unspecified order, with the results discarded, it calls

codegen (right_tree) + codegen (left_tree)

Lets say we compile this with a compiler X, which calls the left
function first. So the assembler code in the compiled code is

codegen (right_tree)
codegen (left_tree)

Compile the compiler using the compiler compiled by X. The assembler
code will now be

codegen (left_tree)
codegen (right_tree).

Compile the compiler with the generated code. The assembler code will
be

codegen (right_tree)
codegen (left_tree)

and so on. All the even versions are different from all the odd
versions.

Feb 14 '07 #14
"christian. bau" wrote:
>Roberto Waltman wrote:
>[OT] No, (at least not as worded) A simple test that every
self-compiling compiler should pass, is that when compiling its own
sources it should produce exact duplicates of itself.

Not necessarily.

The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.

Now assume that your compiler A has a function codegen (), and when
the compiler wishes to generate code for two function calls in
unspecified order, with the results discarded, it calls

codegen (right_tree) + codegen (left_tree)

Lets say we compile this with a compiler X, which calls the left
function first. So the assembler code in the compiled code is

codegen (right_tree)
codegen (left_tree)

Compile the compiler using the compiler compiled by X. The assembler
code will now be

codegen (left_tree)
codegen (right_tree).

Compile the compiler with the generated code. The assembler code will
be

codegen (right_tree)
codegen (left_tree)

and so on. All the even versions are different from all the odd
versions.
OK, I stand corrected. So the generated compilers fall into a cycle,
of length 2 in this case, (easily expanded to more.)
Is this always the case? (To have cyclic versions, that is.)

Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Feb 14 '07 #15

"lovecreatesbea ...@gmail.com" <lo************ ***@gmail.comwr ote in message
news
K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
Yes.
So how did it compile itself? The usual way is to write a little
interpreter. Interpreters are far easier to write than compilers. See my
homepage for one in a single reasonable-length file.

The interpreter runs the compiler source on itself and creates the first
native compiler.
Feb 14 '07 #16
"Malcolm McLean" <re*******@btin ternet.comwrite s:
"lovecreatesbea ...@gmail.com" <lo************ ***@gmail.comwr ote in message
news
>K&R says the following in the preface to the first edition,

"... the C compiler, and ... are written in C."

I'm wondering, does it say even the first / original C compiler was
written in C?
Yes.
No, it doesn't say that. Somebody in this thread said that the first
C compiler was written in B. I just scanned
<http://cm.bell-labs.com/cm/cs/who/dmr/chist.htmlfor the word
"compiler"; I didn't see a clear statement to that effect. It appears
that there was an evolutionary path from B to C, with "NB" (New B)
somewhere in the middle. I think that the compiler and the language
evolved in parallel, with new language features being implemented by
the compiler, then used in the next version of the compiler.
So how did it compile itself? The usual way is to write a little
interpreter. Interpreters are far easier to write than compilers. See my
homepage for one in a single reasonable-length file.

The interpreter runs the compiler source on itself and creates the first
native compiler.
It could have been done that way, but as far as I can tell it wasn't.

--
Keith Thompson (The_Other_Keit h) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <* <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Feb 15 '07 #17
"Dave Hansen" <id**@hotmail.c omwrote in message
news:11******** *************@p 10g2000cwp.goog legroups.com...
On Feb 14, 11:38 am, "David T. Ashley" <d...@e3ft.comw rote:

Optimization is easy if you don't have to get
the right answer.
Regrettably, some compiler vendors have this integrated into their business
model. I was once told by tech support "Don't use optimizaton level 5 until
we get the bugs fixed".
--
David T. Ashley (dt*@e3ft.com)
http://www.e3ft.com (Consulting Home Page)
http://www.dtashley.com (Personal Home Page)
http://gpl.e3ft.com (GPL Publications and Projects)
Feb 15 '07 #18
Roberto Waltman wrote:
"christian. bau" wrote:
Roberto Waltman wrote:
[OT] No, (at least not as worded) A simple test that every
self-compiling compiler should pass, is that when compiling its own
sources it should produce exact duplicates of itself.
Not necessarily.

The order in which subexpressions are evaluated is compiler-dependent.
For example, if we have a function

int f (void* p);

then it is compiler-dependent whether in the expression

f (right) + f (left)

f (right) is called first or f (left) is called first.

Now assume that your compiler A has a function codegen (), and when
the compiler wishes to generate code for two function calls in
unspecified order, with the results discarded, it calls

codegen (right_tree) + codegen (left_tree)

Lets say we compile this with a compiler X, which calls the left
function first. So the assembler code in the compiled code is

codegen (right_tree)
codegen (left_tree)

Compile the compiler using the compiler compiled by X. The assembler
code will now be

codegen (left_tree)
codegen (right_tree).

Compile the compiler with the generated code. The assembler code will
be

codegen (right_tree)
codegen (left_tree)

and so on. All the even versions are different from all the odd
versions.

OK, I stand corrected. So the generated compilers fall into a cycle,
of length 2 in this case, (easily expanded to more.)
Is this always the case? (To have cyclic versions, that is.)
No, this is not always the case. tinycc, for example, puts garbage in
its generated output. Its x86 long doubles are 96 bits, with only 80
value bits. When storing a long double in the generated object file,
all 96 bits are copied, but the 16 padding bits are never explicitly
initialised, and are for practical purposes completely random. You
can't even assume that one specific binary of the compiler with one
specific set of compiler options will always generate bitwise-
identical object code.

(I hope what I mean by "value bit" and "padding bit" is clear enough,
even though it does not exactly match the meanings of the same terms
as used by the standard.)

Feb 15 '07 #19
"Harald van D?k" wrote:
>...
No, this is not always the case. tinycc, for example, puts garbage in
its generated output. Its x86 long doubles are 96 bits, with only 80
value bits. When storing a long double in the generated object file,
all 96 bits are copied, but the 16 padding bits are never explicitly
initialised, and are for practical purposes completely random. You
can't even assume that one specific binary of the compiler with one
specific set of compiler options will always generate bitwise-
identical object code.
[OT] Thanks, after I posted my last question I realized this is
possible, (thinking of uninitialized structs padding.)
There was a recent thread in comp.software-eng about verifying that
two source files are equivalent after "pretty-printing" one.
The OP resorted to compiling them and comparing the resulting object
files, it is clear now this is not guaranteed to work.

Roberto Waltman

[ Please reply to the group,
return address is invalid ]
Feb 15 '07 #20

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
21131
by: Evan Smith | last post by:
During a routine performance check using an event monitor, I discovered a class of query whose performance has me baffled. The monitor captured: SELECT * FROM EWM_CASE fetch first 1 rows only It took 14 seconds of CPU time to execute. After looking up the documentation on the FETCH FIRST notation I find "Limiting the result table to the first integer rows can improve performance. The database
15
2067
by: Sam Sungshik Kong | last post by:
Hello! A disposable object's Dispose() method can be called either explicitly by the programmer or implicitly during finalization. If you call Dispose, the unmanaged resources are released earlier. Thus, if you think the unmanaged resources are important, you call Dispose explicitly. My question is what's the criteria to decide the unmanaged resources are important.
6
2034
by: bramdoornbos | last post by:
Hello, I am looking for a solution to interface with C++ classes implemented in a dll compiled by gcc. This dll will be however accessed by a visual c++ compiled host (not made by me). Both implementations will share headers that define virtual c++ class interfaces.
34
3973
by: Ben Sizer | last post by:
I've installed several different versions of Python across several different versions of MS Windows, and not a single time was the Python directory or the Scripts subdirectory added to the PATH environment variable. Every time, I've had to go through and add this by hand, to have something resembling a usable Python installation. No such problems on Linux, whether it be Mandrake/Mandriva, Fedora Core, or Kubuntu. So why is the Windows...
1
2213
MrPickle
by: MrPickle | last post by:
Does a stream flush itself when it is destroyed? If so, is it best just to let the stream flush itself and only flush it when you want it's buffer synchronizing?
0
9718
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
10613
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers, it seems that the internal comparison operator "<=>" tries to promote arguments from unsigned to signed. This is as boiled down as I can make it. Here is my compilation command: g++-12 -std=c++20 -Wnarrowing bit_field.cpp Here is the code in...
0
10363
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
10107
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
9186
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7649
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6876
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5544
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
3
3008
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.