
size of a sizeof(pointer)

P: n/a
what is the size of a pointer?

suppose i am writing,
datatype *ptr;
sizeof(ptr);
now what will this sizeof(ptr) give? will it give the size of the
data the pointer is pointing to?

if not, can you give a counterexample?

basically, i want to know what the size of a pointer means.

as you know

sizeof(int)=4;

sizeof(char)= 2;

but what does sizeof(ptr) mean??

can anybody explain?
Nov 14 '05 #1
79 Replies


P: n/a
On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
> what is the size of a pointer?
>
> suppose i am writing,
> datatype *ptr;
> sizeof(ptr);
> now what will this sizeof(ptr) give? will it give the size of the
> data the pointer is pointing to?
>
> if not, can you give a counterexample?
>
> basically, i want to know what the size of a pointer means.
>
> as you know
>
> sizeof(int)=4;

Maybe. It must be >= 2.

> sizeof(char)= 2;

sizeof(char) is, by definition, 1.

> but what does sizeof(ptr) mean??

It's the amount of space the pointer itself takes up. Not the data pointed
to, but the pointer itself. Often, it's == sizeof(int).

Josh
Nov 14 '05 #2

P: n/a

"syntax" <sa*****@yahoo.com.hk> wrote in message
> what is the size of a pointer?

A pointer is a variable that holds an address. The size of a pointer is the
size of this address.
For instance, most computers have an address space of 4GB. 32 bits allows
you 4GB, so the size of a pointer will be 32 bits, or 4 bytes (char is usually 8
bits). On some microcomputers the address space is only 64K, so 16-bit
pointers are used.

> datatype *ptr;
> sizeof(ptr);
>
> now what will this sizeof(ptr) give? will it give the size of the
> data the pointer is pointing to?

No, it gives the size of the pointer, probably 4.

> if not, can you give a counterexample?

One confusing thing about C is that arrays and pointers have array/pointer
equivalence.

char string[32];

printf("sizeof string %d\n", (int) sizeof(string));

will give you 32.

char *string = malloc(32);

printf(" sizeof string %d\n", (int) sizeof(string));

will give you the size of a pointer on your system, probably 4.
> basically, i want to know what the size of a pointer means.
>
> as you know
>
> sizeof(int)=4;
>
> sizeof(char)= 2;

sizeof(char) is always 1, one of the little quirks of the C language.
sizeof(int) is very commonly 4, but it can be any size. It is meant to be
the natural size for the machine to use, which means the width of the
register.
For technical reasons pointers are usually the same size as ints, but again
they can be any size.
but what does sizeof(ptr) means??


Nov 14 '05 #3

P: n/a
Josh Sebastian wrote:
> On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
>> as you know
>>
>> sizeof(int)=4;
>
> Maybe. It must be >= 2.

Wrong. It must, however, be an exact multiple of 1.

>> sizeof(char)= 2;
>
> sizeof(char) is, by definition, 1.

Right.

>> but what does sizeof(ptr) means??
>
> It's the amount of space the pointer itself takes up. Not the data pointed
> to, but the pointer itself. Often, it's == sizeof(int).

But, of course, it doesn't have to be (as you know).

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton
Nov 14 '05 #4

P: n/a
On Sun, 08 Feb 2004 19:58:20 +0000, Richard Heathfield wrote:
> Josh Sebastian wrote:
>> On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
>>> as you know
>>>
>>> sizeof(int)=4;
>>
>> Maybe. It must be >= 2.
>
> Wrong. It must, however, be an exact multiple of 1.

Jeez... yeah, thanks.

Nov 14 '05 #5

P: n/a
Josh Sebastian <cu****@cox.net> writes:
> On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote:
>
> [...]
>> but what does sizeof(ptr) means??
>
> It's the amount of space the pointer itself takes up. Not the data pointed
> to, but the pointer itself. Often, it's == sizeof(int).

It's true that the size of a pointer is often equal to sizeof(int),
but it's dangerous (and unnecessary) to assume that it always is.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
Nov 14 '05 #6

P: n/a
Richard Heathfield wrote:
> Josh Sebastian wrote:
>> syntax wrote:
>>> sizeof(int)=4;
>>
>> Maybe. It must be >= 2.
>
> Wrong. It must, however, be an exact multiple of 1.


An implementation cannot have 16-bit chars and 24-bit ints?

How about 16-bit chars and 24-bit pointers?

Nov 14 '05 #7

P: n/a
Grumble wrote:
> Richard Heathfield wrote:
>> Josh Sebastian wrote:
>>> syntax wrote:
>>>> sizeof(int)=4;
>>>
>>> Maybe. It must be >= 2.
>>
>> Wrong. It must, however, be an exact multiple of 1.

It must be greater than 1, on hosted implementations.

> An implementation cannot have 16-bit chars and 24-bit ints?

The sum of the numbers of padding bits,
value bits and the sign bit, is a multiple of CHAR_BIT.

> How about 16-bit chars and 24-bit pointers?

The bit representation of pointers is not specified.

--
pete
Nov 14 '05 #8

P: n/a
On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete
<pf*****@mindspring.com> wrote:
> Grumble wrote:
>> Richard Heathfield wrote:
>>> Josh Sebastian wrote:
>>>> syntax wrote:
>>>>> sizeof(int)=4;
>>>>
>>>> Maybe. It must be >= 2.
>>>
>>> Wrong. It must, however, be an exact multiple of 1.
>
> It must be greater than 1, on hosted implementations.


Not if a char were 16 bits wide.
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==----
http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups
---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =---
Nov 14 '05 #9

P: n/a
pete <pf*****@mindspring.com> wrote:
> Grumble wrote:
>> Richard Heathfield wrote:
>>> Josh Sebastian wrote:
>>>> syntax wrote:
>>>>> sizeof(int)=4;
>>>>
>>>> Maybe. It must be >= 2.
>>>
>>> Wrong. It must, however, be an exact multiple of 1.
>
> It must be greater than 1, on hosted implementations.

Chapter and verse, please.

Of course, it's exceedingly awkward for a hosted implementation to have
sizeof(int)==1, but it isn't illegal.

>> An implementation cannot have 16-bit chars and 24-bit ints?
>
> The sum of the numbers of padding bits,
> value bits and the sign bit, is a multiple of CHAR_BIT.
>
>> How about 16-bit chars and 24-bit pointers?
>
> The bit representation of pointers is not specified.

Even so, all types have sizes measurable in whole chars; look up the
definition of sizeof.

Richard
Nov 14 '05 #10

P: n/a
"Mike Wahler" <mk******@mkwahler.net> wrote in
news:qJ*******************@newsread1.news.pas.earthlink.net:
>>> to, but the pointer itself. Often, it's == sizeof(int).
>>
>> It's true that the size of a pointer is often equal to sizeof(int),
>> but it's dangerous (and unnecessary) to assume that it always is.
>
> Or for that matter, to assume that all pointer types have the same size.

Indeed. For example, Keil C51 has 1 byte, 2 byte, and 3 byte pointer sizes
depending upon which memory space the pointer points to.
Nov 14 '05 #11

P: n/a
Richard Bos wrote:
> Of course, it's exceedingly awkward for a hosted implementation
> to have sizeof(int)==1 [...]


Is it awkward because getc() can return either a char or EOF?

Nov 14 '05 #12

P: n/a
Grumble <in*****@kma.eu.org> wrote:
> Richard Bos wrote:
>> Of course, it's exceedingly awkward for a hosted implementation
>> to have sizeof(int)==1 [...]
>
> Is it awkward because getc() can return either a char or EOF?

That, and related problems, yes. If you need to take these legal-but-
unlikely implementations into account (i.e., if you really want to be as
anal-retentive about ISO-conformance as your common huff-throwing newbie
(and uncommon troll) makes us out to be), you need to check for feof()
and ferror() after every read operation, instead of simply for EOF.
Personally, I never do.

Richard
Nov 14 '05 #13

P: n/a

On Mon, 9 Feb 2004, Richard Bos wrote:
> pete <pf*****@mindspring.com> wrote:
>> Grumble wrote:
>>> Richard Heathfield wrote:
>>>> Josh Sebastian wrote:
>>>>> [sizeof(int)] must be >= 2.
>>>>
>>>> Wrong. It must, however, be an exact multiple of 1.
>>
>> It must be greater than 1, on hosted implementations.
>
> Chapter and verse, please.


<bu**********@sunnews.cern.ch>
and subsequent posts. This should be a FAQ.

-Arthur
Nov 14 '05 #14

P: n/a
Mark McIntyre wrote:
> On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete
> <pf*****@mindspring.com> wrote:
>
>> Grumble wrote:
>>> Richard Heathfield wrote:
>>>> Josh Sebastian wrote:
>>>>> syntax wrote:
>>>>>> sizeof(int)=4;
>>>>>
>>>>> Maybe. It must be >= 2.
>>>>
>>>> Wrong. It must, however, be an exact multiple of 1.
>>
>> It must be greater than 1, on hosted implementations.
>
> Not if a char were 16 bits wide.

Is there any live implementation that uses 16-bit chars?? (I know of the
existence of a machine where a byte is 6 bits)

--
#include <stdio.h>
#define p(s) printf(#s" endian")
int main(void){int v=1;*(char*)&v?p(Little):p(Big);return 0;}

Giannis Papadopoulos
http://dop.users.uth.gr/
University of Thessaly
Computer & Communications Engineering dept.
Nov 14 '05 #15

P: n/a
On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis
<ip******@inf.uth.gr> wrote:

> Is there any live implementation that uses 16-bit chars?? (I know of the
> existence of a machine where a byte is 6 bits)


Unicode springs to mind.

I suspect that quite a few DSPs do, tho typically they're freestanding
implementations.

That aside, I'd be unsurprised to see future implementations using 16 bits
for chars.

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #16

P: n/a

"Grumble" <in*****@kma.eu.org> wrote in message

> An implementation cannot have 16-bit chars and 24-bit ints?
>
> How about 16-bit chars and 24-bit pointers?

Not allowed. chars and bytes, or to be pedantic unsigned chars and bytes,
are the same thing in C. An unfortunate hangover from the early days.
All types have to be a whole multiple of char.

Nov 14 '05 #17

P: n/a

"CBFalconer" <cb********@yahoo.com> wrote in message
>> For instance, most computers have an address space of 4GB. 32
>> bits allows you 4GB, so the size of a pointer will be 32 bits,
>> or 4 (char is usually 8 bits). On some microcomputers the
>> address space is only 64K, so 16-bit pointers are used.
>
> Nope. A pointer points. What information it needs to hold to do
> that is up to the implementation. It could consist of a URL and
> other information, just as a not too wild example. Another might
> be "Malcolms house, under the bed beside the dirty socks, last
> Tuesday". The amount of information needed is usually constrained
> by limiting the things that the pointer is allowed to point to.
> Clear now?

Don't patronise.
You and I both know that perverse implementations are allowed. Since
pointers have to be a fixed size then using a URL would be grossly
inefficient.
Since the OP needs to understand how pointers are represented in memory on a
typical system such as the one he will certainly be using, telling him that
32 bit pointers are needed to address 4GB gets across the message clearly.
Talk about URL pointers is liable to confuse.

> You should neither know nor care, unless you are implementing the
> system.

Well you very often need to break the bounds of ANSI C and go to a lower
level. An example would be if you have a custom memory scheme. How do you
know if a pointer comes from your arena or from elsewhere?
Another example would be using a debugger. Invalid pointers are often set to
some defined bit pattern. You need to know something about addressing to
detect these bad pointers.
Programming is practical. It doesn't make sense to hand someone a copy of
the standard and expect them to be able to write fully-conforming ANSI C.
You need to play with a real implementation on a real machine to have any
hope of understanding what is going on.
Nov 14 '05 #18

P: n/a
On Sun, 8 Feb 2004 22:04:50 -0000, "Malcolm"
<ma*****@55bank.freeserve.co.uk> wrote:

"Keith Thompson" <ks***@mib.org> wrote in message

>> No, there is no array/pointer equivalence (or rather, "equivalence" is
>> a misleading term for what's really going on). Array names are
>> implicitly converted to pointer values in many contexts.
>>
>> See the C FAQ at <http://www.eskimo.com/~scs/C-faq/faq.html>,
>> particularly section 6, particularly question 6.3.
>
> Exactly. "Equivalence" is the accepted term for what is going on, which is
> confusing.

I've never heard the term before starting to read this newsgroup. I've
always called it "array/pointer duality"
-leor



Leor Zolman
BD Software
le**@bdsoft.com
www.bdsoft.com -- On-Site Training in C/C++, Java, Perl & Unix
C++ users: Download BD Software's free STL Error Message
Decryptor at www.bdsoft.com/tools/stlfilt.html
Nov 14 '05 #19

P: n/a
Mark McIntyre wrote:
> On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis
> <ip******@inf.uth.gr> wrote:
>
> That aside, I'd be unsurprised to see future implementations using 16 bits
> for chars.

If we use 16-bit values as char, then the new C0x spec must define
something like "byte" (java's char is unicode and it has an 8-bit type)..

There is of course wchar_t so there is definitely no need for 16-bit
chars.. Or so I think... Comments?

--
#include <stdio.h>
#define p(s) printf(#s" endian")
int main(void){int v=1;*(char*)&v?p(Little):p(Big);return 0;}

Giannis Papadopoulos
http://dop.users.uth.gr/
University of Thessaly
Computer & Communications Engineering dept.
Nov 14 '05 #20

P: n/a
Papadopoulos Giannis <ip******@inf.uth.gr> writes:
> Mark McIntyre wrote:
>> On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis
>> <ip******@inf.uth.gr> wrote:
>> That aside, I'd be unsurprised to see future implementations using
>> 16 bits for chars.
>
> If we use 16-bit values as char, then the new C0x spec must define
> something like "byte" (java's char is unicode and it has an 8-bit
> type)..
>
> There is of course wchar_t so there is definitely no need for 16-bit
> chars.. Or so I think... Comments?

I think C will always define a char as being one byte (sizeof(char)==1).
There's too much code that would break if that were changed. The
process that led to the 1989 ANSI standard was probably the last real
opportunity to change this.

I'd greatly prefer the concepts of "character" and "uniquely
addressable storage unit" to be separate, but it's too late to fix it.

It just might be possible to deprecate the use of the word "byte"
(which is part of the description of the language, not part of the
language itself) while continuing to guarantee that sizeof(char)==1,
but I doubt that even that will be done.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
Nov 14 '05 #21

P: n/a
Mark McIntyre wrote:

> On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete
> <pf*****@mindspring.com> wrote:
>> Grumble wrote:
>>> Richard Heathfield wrote:
>>>> Josh Sebastian wrote:
>>>>> syntax wrote:
>>>>>> sizeof(int)=4;
>>>>>
>>>>> Maybe. It must be >= 2.
>>>>
>>>> Wrong. It must, however, be an exact multiple of 1.
>>
>> It must be greater than 1, on hosted implementations.
>
> Not if a char were 16 bits wide.


You can't implement the whole standard library,
if sizeof(int) is one.

putchar(EOF) has to be able to return EOF
converted to an unsigned char value,
converted back to a nonnegative int.

http://groups.google.com/groups?selm...andrew.cmu.edu

--
pete
Nov 14 '05 #22

P: n/a
"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:c0**********@newsg1.svr.pol.co.uk...

"CBFalconer" <cb********@yahoo.com> wrote in message
>>> For instance, most computers have an address space of 4GB. 32
>>> bits allows you 4GB, so the size of a pointer will be 32 bits,
>>> or 4 (char is usually 8 bits). On some microcomputers the
>>> address space is only 64K, so 16-bit pointers are used.
>>
>> Nope. A pointer points. What information it needs to hold to do
>> that is up to the implementation. It could consist of a URL and
>> other information, just as a not too wild example. Another might
>> be "Malcolms house, under the bed beside the dirty socks, last
>> Tuesday". The amount of information needed is usually constrained
>> by limiting the things that the pointer is allowed to point to.
>> Clear now?
>
> Don't patronise.
> You and I both know that perverse implementations are allowed.

For suitable definitions of 'perverse'.

> Since
> pointers have to be a fixed size

C & V please.

> then using a URL would be grossly
> inefficient.
> Since the OP needs to understand how pointers are represented in memory

That's platform/implementation dependent.

> on a
> typical system

Whose definition of 'typical'?

> such as the one he will certainly be using,

Doesn't matter which one. The answers will be platform-specific,
not applicable to standard C.

> telling him that
> 32 bit pointers are needed to address 4GB gets across the message clearly.

That's one of many possible ways to represent such an address space.

> Talk about URL pointers is liable to confuse.

It's intended to clarify (and imo it did) that a pointer is
an *abstraction*, and as such, one need not (should not) be
concerned about its physical implementation.

>> You should neither know nor care, unless you are implementing the
>> system.
>
> Well you very often need to break the bounds of ANSI C and go to a lower
> level.

In which case the discussion needs to depart from clc.

> An example would be if you have a custom memory scheme. How do you
> know if a pointer comes from your arena or from elsewhere?

Then one would need to ask/read about it where such things are discussed.
Not here.

> Another example would be using a debugger. Invalid pointers are often set
> to some defined bit pattern. You need to know something about addressing to
> detect these bad pointers.

Then one would need to ask/read about it where debuggers are discussed. Not
here.

> Programming is practical.

The subject of clc is not programming.

> It doesn't make sense to hand someone a copy of
> the standard and expect them to be able to write fully-conforming ANSI C.

That's why we have books, schools, instructors, etc.

> You need to play with a real implementation on a real machine to have any
> hope of understanding what is going on.

Not at the abstract level of ISO C. 'Way' back when, I got a decent
understanding of how COBOL worked, before I ever laid eyes on any hardware.
This was proven when I actually coded, compiled, and successfully ran
programs when we did get access to a computer.

-Mike
Nov 14 '05 #23

P: n/a
[snips]

On Tue, 10 Feb 2004 06:23:04 +0000, Mike Wahler wrote:
>> Since
>> pointers have to be a fixed size
>
> C & V please.
>
>> then using a URL would be grossly
>> inefficient.
>> Since the OP needs to understand how pointers are represented in memory
>
> That's platform/implementation dependent.

I've always favored SQL queries. Store all the values in a database and
the pointers are all just queries to retrieve them.

>> telling him that
>> 32 bit pointers are needed to address 4GB gets across the message
>> clearly.
>
> That's one of many possible ways to represent such an address space.

Anyone who ever used older DOS compilers will appreciate the clarity of
not assuming pointers make any sort of inherent sense. :)
Nov 14 '05 #24

P: n/a
"Mike Wahler" <mk******@mkwahler.net> wrote:
"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:c0**********@newsg1.svr.pol.co.uk...
>> Programming is practical.
>
> The subject of clc is not programming.


Well, yes, it is. Where Malcolm goes wrong is in believing that locking
yourself into the Wintel platform is part of that practicality.

Richard
Nov 14 '05 #25

P: n/a

"Mike Wahler" <mk******@mkwahler.net> wrote in message
>> Since
>> pointers have to be a fixed size
>
> C & V please.

Uggle *ptr = 0;

Uggle **uptr = malloc(sizeof(Uggle *));

*uptr = ptr;

*uptr now must be set to NULL. How is this achieved if an Uggle * is of
variable width?
> Whose definition of 'typical'?

Natural language definition of "typical".

>> such as the one he will certainly be using,
>
> Doesn't matter which one. The answers will be platform-specific,
> not applicable to standard C.

But standard C is deeply dependent on the types of architectures that exist
in the real world. That's why it has pointers, rather than the "advance"
commands that would be expected of Turing machines.

> That's one of many possible ways to represent such an address
> space.

Use of 32 bit pointers to address a 4GB memory space is not just one of many
possible ways to represent such a space. It's the most obvious, natural way
to do so.

>> Talk about URL pointers is liable to confuse.
>
> It's intended to clarify (and imo it did) that a pointer is
> an *abstraction*, and as such, one need not (should not) be
> concerned about its physical implementation.

You need to understand the physical representation to understand how the
ANSI committee made their decisions. Or else why not say that a pointer is
held in a variable size memory entity?

>> Well you very often need to break the bounds of ANSI C and go
>> to a lower level.
>
> In which case the discussion needs to depart from clc.

NO, because clc is not cl.ansic. The newsgroup precedes the ANSI standard,
which is proving itself to be an ephemeral chapter in the history of the
language. The C99 standard seems to have failed.

>> An example would be if you have a custom memory scheme. How
>> do you know if a pointer comes from your arena or from elsewhere?
>
> Then one would need to ask/read about it where such things are
> discussed. Not here.

It's a perfectly on-topic question. I have implemented a mymalloc() using a
static arena, when a pointer is passed to myfree(), how can I verify that it
is from the arena. The ANSI answer is that you can't, but that's not good
enough.

>> [ debuggers ]
>
> Then one would need to ask/read about it where debuggers are
> discussed. Not here.

You need to understand the sorts of ways pointers are represented in memory
before you can understand debuggers, or indeed the (ANSI) %p format
specifier to the printf() family of functions. Perfectly on topic, but
nothing to do with ANSI.

>> Programming is practical.
>
> The subject of clc is not programming.

It's C programming. Not ANSI C programming, portable C programming i.e.
compiler-specific questions are off-topic, but not, for example, "how does a
typical implementation provide malloc()".

>> It doesn't make sense to hand someone a copy of
>> the standard and expect them to be able to write fully-conforming
>> ANSI C.
>
> That's why we have books, schools, instructors, etc.

And also comp.lang.c. Otherwise one could simply post the standard in answer
to every query.

> Not at the abstract level of ISO C. 'Way' back when, I got a decent
> understanding of how COBOL worked, before I ever laid eyes on
> any hardware. This was proven when I actually coded, compiled,
> and successfully ran programs when we did get access to a computer.

Well done but that's unusual, and an inefficient way of learning. Basically
you are using the tutor to dry run code, and he will do so several million
times slower than a processor.
Programming is a practical skill, which means that you need to understand
your implementation. Otherwise we could simply hand a copy of the standard
to every newbie and expect them to become proficient C programmers. It
doesn't work like that.

Basically engage brain before trying to obfuscate my explanations with
references to URL pointers and other such rubbish.
Nov 14 '05 #26

P: n/a

"Richard Bos" <rl*@hoekstra-uitgeverij.nl> wrote in message
> Well, yes, it is. Where Malcolm goes wrong is in believing that
> locking yourself into the Wintel platform is part of that practicality.

So you think that Wintel is the only platform that uses 32-bit pointers to
address a 4GB memory space?
Nov 14 '05 #27

P: n/a
On Tue, 10 Feb 2004 21:08:17 -0000, in comp.lang.c , "Malcolm"
<ma*****@55bank.freeserve.co.uk> wrote:

"Mike Wahler" <mk******@mkwahler.net> wrote in message
>>> Since
>>> pointers have to be a fixed size
>>
>> C & V please.
>
> Uggle *ptr = 0;
>
> Uggle **uptr = malloc(sizeof(Uggle *));
>
> *uptr = ptr;
>
> *uptr now must be set to NULL. How is this achieved if an Uggle * is of
> variable width?


Mike meant that different types' pointers might be different widths. Thus
an Uggle** might be wider (or narrower) than an Uggle*, which might in turn
be wider (or narrower) than an int*.
--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #28

P: n/a

"Malcolm" <ma*****@55bank.freeserve.co.uk> wrote in message
news:c0**********@newsg1.svr.pol.co.uk...

"Mike Wahler" <mk******@mkwahler.net> wrote in message
>>> Since
>>> pointers have to be a fixed size
>>
>> C & V please.

In case you didn't know, that acronym means "Chapter & Verse".
I'm asking you to support your claim with a citation from
the standard.

> Uggle *ptr = 0;
>
> Uggle **uptr = malloc(sizeof(Uggle *));
>
> *uptr = ptr;
>
> *uptr now must be set to NULL. How is this achieved if an Uggle * is of
> variable width?

Doesn't matter "how". It must simply 'work correctly'. That's
all the standard requires.

>> Whose definition of 'typical'?

Please don't omit context. Restored:

Malcolm: on a typical system

Mike: Whose definition of 'typical'?

> Natural language definition of "typical".

OK I suppose I have to spell it out. Whose definition of
'typical *system*'. In some contexts a 'typical system'
is a PC. In others, it's a cell phone. In the widest
(computer system) context, if 'typical' is the most
widely used, it's certainly not a PC, but more likely
some embedded system I've probably never heard of.

>>> such as the one he will certainly be using,
>>
>> Doesn't matter which one. The answers will be platform-specific,
>> not applicable to standard C.
>
> But standard C is deeply dependent on the types of architectures that
> exist in the real world.

Not at all. The standard makes requirements that an implementation
must meet. If a platform cannot provide support sufficient for
such an implementation (either directly or via e.g. software emulation,
etc.) (perhaps it only has 6 bit bytes) then it's simply not possible to
create a conforming C implementation for it. Period. So you have the
'dependency' issue exactly backwards.

> That's why it has pointers,

I'd have to ask Mr. Ritchie for the 'real' answer, but imo
it has pointers because they allow one to do the useful things
they can do. They implement an abstraction: indirection.

> rather than the "advance"
> commands that would be expected of Turing machines.
>
>> That's one of many possible ways to represent such an address
>> space.
>
> Use of 32 bit pointers to address a 4GB memory space is not just one of
> many possible ways to represent such a space. It's the most obvious,
> natural way to do so.
>
>>> Talk about URL pointers is liable to confuse.
>>
>> It's intended to clarify (and imo it did) that a pointer is
>> an *abstraction*, and as such, one need not (should not) be
>> concerned about its physical implementation.
>
> You need to understand the physical representation to understand how the
> ANSI committee made their decisions.

I need to understand neither physical representation, nor know (or care)
why the committee decided what they did, in order to successfully write
standard C. All I need is a conforming implementation, and access to
the rules (the standard). Of course textbooks written in a more 'prose'
like form are a huge help.

> Or else why not say that a pointer is
> held in a variable size memory entity?

Because either one would be acceptable with regard to the standard.
It's called flexibility, which I suspect the committee allowed for
when possible. For example why do you suppose there's no hard
definition for the exact representation of '\n'?

>>> Well you very often need to break the bounds of ANSI C and go
>>> to a lower level.
>>
>> In which case the discussion needs to depart from clc.
>
> NO, because clc is not cl.ansic.

For the zillionth time that I've stated this here, the name of a newsgroup
does *not* define its exact nature. It's only a general guideline.

The nature and guidelines of clc are stated in the 'welcome message',
which has by consensus of the regulars become the defining document.

> The newsgroup precedes the ANSI standard,

Irrelevant.

> which is proving itself to be an ephemeral chapter in the history of the
> language. The C99 standard seems to have failed.

Your opinion. And you seem to have imposed some arbitrary
time limit for C99 to 'succeed'.

>>> An example would be if you have a custom memory scheme. How
>>> do you know if a pointer comes from your arena or from elsewhere?
>>
>> Then one would need to ask/read about it where such things are
>> discussed. Not here.
>
> It's a perfectly on-topic question. I have implemented a mymalloc() using a
> static arena, when a pointer is passed to myfree(), how can I verify that
> it is from the arena. The ANSI answer is that you can't, but that's not
> good enough.

Tough.

>>> [ debuggers ]
>>
>> Then one would need to ask/read about it where debuggers are
>> discussed. Not here.
>
> You need to understand the sorts of ways pointers are represented in
> memory before you can understand debuggers,

Debuggers are not topical here.

> or indeed the (ANSI) %p format

All one need know is that it will print the value of a type 'void*'
object. The exact display format used is left up to the implementation.

> specifier to the printf() family of functions. Perfectly on topic, but
> nothing to do with ANSI.

%p (the ISO specification of it) is indeed topical. Its implementation
is not.
Programming is practical.


The subject of clc is not programming.

It's C programming.


It's the C programming *language* and how to *use* it.
Not ANSI C programming, portable C programming i.e.
compiler-specfic questions are off-topic, but not, for example, "how does a typical implemetation provide malloc()".
That's an implementation specific issue. The language only
specifies 'malloc()'s *behavior*.
It doesn't make sense to hand someone a copy of
the standard and expect them to be able to write fully-conforming
ANSI C.


That's why we have books, schools, intructors, etc.

And also comp.lang,c. Otherwise one could simply post the standard in

answer to every query.
So here you are at comp.lang.c where so many experts graciously share
their knowledge and skill, gratis. So instead of desperately trying
to prove yourself "right", why not *listen* and learn? I did.
When I first came to clc, I considered myself, if not 'expert',
at least very knowledgable about C. A couple days here proved
me wrong. I did not allow my ego to obscure or deny this fact.
Not at the abstract level of ISO C. 'Way' back when, I got a decent
understanding of how COBOL worked, before I ever laid eyes on
any hardware. This was proven when I actually coded, compiled,
and successfully ran programs when we did get access to a computer.

Well done but that's unusual,


I suppose one might call it "unusual". I found my instructor's
methods to be brilliant.
and an inefficient way of learning.
I suppose that depends upon what you mean by "efficient". Fast?
Fast just means fast, not necessarily "good".

I found it a very *effective* way to learn.
Basically
you are using the tutor to dry run code,
Actually the students all used one another to represent
system components, one of which was the CPU, who was
given a sequence of predefined instructions. Others
represented data objects, peripheral devices, etc.
We 'executed' a 'program' according to a strict
formal set of rules (analogous to a standard
language specification). But these rules did *not*
mandate implementation methods. E.g. the person
representing an 'accumulator' was only required
to 'reset', 'accumulate', and report a value.
It was not mandated *how* to do so. He was free
to rely on his memory, or he could write things
down, or use a handheld calculator, etc.
and he will do so several million
times slower than a processor.
Speed was not the objective. Learning was.
And after the students all having participated
in the 'execution' of a 'program' we all had a
much better appreciation for the true power
of a computer, and the discipline required to
effectively program one.
Programming is a practical skill,
Yes, and a programming language is only a small part of it.
This newsgroup provides only a small part of the knowledge
necessary. Other learning resources exist for the other
issues.

which means that you need to understand
your implementation.
Not to use C you don't.
Otherwise we could simply hand a copy of the standard
to every newbie and expect them to become proficient C programmers. It
doesn't work like that.
As I already said, that's why we have schools, books, instructors, etc.

Basically engage brain before trying to obfuscate my explanations
I have in no way tried to obfuscate anything you've 'explained'.
I've only debated your opinions.
with
references to URL pointers and other such rubbish.


I made no reference to a URL pointer.

-Mike
Nov 14 '05 #29

P: n/a

"Mike Wahler" <mk******@mkwahler.net> wrote in message
Whose definition of 'typical'?
Please don't omit context. Restored:

Malcolm: on a typical system

Mike: Whose definition of 'typical'?
Well every system I know uses fixed-size pointers. There is one main
exception to the rule that the size of the pointer represents the size of
the address space, and that's quite an important one, old x86 compilers with
their segmented architecture.
I think we can call the x86 "non-typical" because the natural thing to do is
to have one pointer value equalling one address, and because virtually every
other system works that way.
But standard C is deeply dependent on the types of architectures
that exist in the real world.

Not at all. The standard makes requirements that an implementation
must meet. If a platform cannot provide support sufficient for
such an implementation (either directly or via e.g. software emulation,
etc.)
(perhaps it only has 6 bit bytes) then it's simply not possible to create
a conforming C implementation for it. Period. So you have the
'dependency' issue exactly backwards.
C is not an abstract language for specifying the behaviour of Turing
machines, but one that is deeply-dependent on the types of architectures
that exist. You can incidentally provide a conforming C implementation for
any Turing-compatible machine, even if it uses 6-bit bytes internally, as
long as you are prepared to accept gross inefficiency.
It is precisely because 6-bit byte general-purpose processors are rare that
C doesn't easily support them.
I need to understand neither physical representation, nor know (or
care) why the committee decided what they did, in order to
successfully write standard C. All I need is a conforming
implementation, and access to the rules (the standard). Of course
textbooks written in a more 'prose' like form are a huge help.
This is nonsense. People are not machines. You can't learn French from a
dictionary and grammar, nor is it possible to learn C from the standard. And
over-literal explanations, such as "pointers can be URLs", obfuscate rather
than illuminate.
The newsgroup precedes the ANSI standard,
Irrelevant.

No, highly relevant. And ANSI has shot itself in the foot by proposing a
standard that has not been widely adopted, which means that now C will
probably spread into several dialects. The newsgroup precedes ANSI, and will
survive when ANSI is just a memory.
Your opinion. And you seem to have imposed some arbitrary
time limit for C99 to 'succeed'.
It's only five years, and obviously I cannot foretell the future, but it
seems likely that C99 will never be widely implemented. I think that what
will happen is that people will increasingly run C code through a C++
compiler to use useful C99 features such as single line comments and inline
functions.
The ANSI answer is that you can't, but that's not good
enough.
Tough.

Tough for you but you're being unnecessarily restrictive. How about
explaining how this can be done in C on some platforms, but not portably?
Debuggers are not topical here.

The details of a specific debugger are not topical, debuggers generally (for
instance we had a thread recently about whether or not they were time
wasters) are topical.
%p (the ISO specification of it) is indeed topical. Its implementation
is not.
Implementation of standard library functions is topical.
So here you are at comp.lang.c where so many experts graciously
share their knowledge and skill, gratis. So instead of desperately
trying to prove yourself "right", why not *listen* and learn? I did.
When I first came to clc, I considered myself, if not 'expert',
at least very knowledgable about C. A couple days here proved
me wrong. I did not allow my ego to obscure or deny this fact.
It doesn't take more than a couple of days to learn all the C you need to
know, unless you want to write a compiler, if you already know another
language. That is one of the great strengths of C.
To know the answer to exotica takes a bit longer, but you don't actually
need to know this to write successful C. How about learning from someone who
knows a great deal about programming, without claiming to be at the leading
edge?
Actually the students all used one another to represent
system components, one of which was the CPU, who was
given a sequence of predefined instructions. Others
represented data objects, peripheral devices, etc.
We 'executed' a 'program' according to a strict
formal set of rules (analogous to a standard
language specification). But these rules did *not*
mandate implementation methods. E.g. the person
representing an 'accumulator' was only required
to 'reset', 'accumulate', and report a value.
It was not mandated *how* to do so. He was free
to rely on his memory, or he could write things
down, or use a handheld calculator, etc.
If you don't have a computer then you can use these sorts of devices to
teach programming. It sounds highly creative and I wouldn't want to knock
your tutor. However if you just hand someone a computer and let them play
with it, they can very quickly pick up programming if they have a natural
aptitude for it.
Programming is a practical skill,
Yes, and a programming language is only a small part of it.
This newsgroup provides only a small part of the knowledge
necessary. Other learning resources exist for the other
issues.

Yes sure, knowing C is only a small part of knowing "how to program", which
is a bit like "knowing how to cook", there are a few basics everyone has to
learn, but you can be perfectly competent at meat and 2 veg without being a
cordon bleu chef.
which means that you need to understand
your implementation.
Not to use C you don't.

Yes you do, because you make mistakes and funny things happen. Formally we
could just post a copy of the standard in response to every query, in
practice humans aren't built like that.
I made no reference to a URL pointer.

No, you've defended someone who corrected my statement that typically a
pointer has enough bits to address the memory space of the computer by
pointing out that the implementation could use a URL pointer. Formally he's
right of course, in the same way that it could use decimal ten-state memory
instead of binary.

In fact a non-perverse use of pointers would be to store the bounds of the
data item pointed to in every pointer. Then an attempt to address memory
illegally could be caught. To my knowledge not a single implementation
actually uses safe pointers. The reason of course is that C programmers
expect pointer dereferences to compile to single machine instructions -
something again not mentioned in the standard but highly relevant to anyone
who programs in C.
Nov 14 '05 #30

P: n/a

In article <c0**********@news5.svr.pol.co.uk>, "Malcolm" <ma*****@55bank.freeserve.co.uk> writes:
In fact a non-perverse use of pointers would be to store the bounds of the
data item pointed to in every pointer. Then an attempt to address memory
illegally could be caught. To my knowledge not a single implementation
actually uses safe pointers.
Your knowledge is incomplete. At least three C implementations for the
AS/400 - EPM C, System C, and ILE C - use 16-byte / 128-bit pointers
(CHAR_BIT is 8) which are not simple addresses but descriptors, and
which include a reference to a memory space, an offset in that memory
space, and a validity flag which can only be set by a privileged-mode
instruction. Mucking about with a pointer's internals resets the
flag, rendering the pointer invalid.

All three implementations will immediately trap on invalid pointer
access.

I believe ILE C (the current one) is a fully conforming C94 hosted
implementation, and System C was a fully conforming C90 hosted
implementation. I suspect EPM C wasn't a conforming hosted
implementation, though it probably came fairly close, and may have
been a conforming freestanding implementation.
The reason of course is that C programmers
expect pointer dereferences to compile to single machine instructions -
something again not mentioned in the standard but highly relevant to anyone
who programs in C.


C programmers working on the AS/400 will find that expectation is
incorrect. In C on the AS/400, *nothing* compiles to machine
instructions, single or otherwise. It compiles to a pseudoassembly
language called "MI". And that's a good thing, for AS/400 software,
since it's one of the qualities that allowed IBM to completely change
the machine's architecture without breaking working programs. (That's
*binaries*, with no recompilation required, in many cases.)

On the AS/400, robustness trumps performance. That was the design
decision for the whole architecture, and C needed to fall in line.
One of the nice things about the C standard was that it could
accommodate that.

More C programmers should do some work on the AS/400. (For one thing,
it'd make them appreciate their other development environments all
the more, if they use IBM's awful Program Development Manager and
Source Entry Utility.) You can learn a lot about what a conforming
hosted implementation can do. And if you're using a real 5250
terminal, you can also learn those swell trigraph sequences (or the
EBCDIC code points for various C punctuation characters).
--
Michael Wojcik mi************@microfocus.com

Pseudoscientific Nonsense Quote o' the Day:
From the scientific standpoint, until these energies are directly
sensed by the evolving perceptions of the individual, via the right
brain, inner-conscious, intuitive faculties, scientists will never
grasp the true workings of the universe's ubiquitous computer system.
-- Noel Huntley
Nov 14 '05 #31

P: n/a

"Michael Wojcik" <mw*****@newsguy.com> wrote in message

C programmers working on the AS/400 will find that expectation
[that pointer dereferences compile to single machine instructions ] is
incorrect. In C on the AS/400, *nothing* compiles to machine
instructions, single or otherwise. It compiles to a pseudoassembly
language called "MI".

This really is the exception that proves the point. A platform that
disallows native machine language programs cannot really be said to have a
compiler. Nor is C the ideal language for such an environment - you need
something which does memory management for you.
Nov 14 '05 #32

P: n/a
"Michael Wojcik" <mw*****@newsguy.com> wrote in message

C programmers working on the AS/400 will find that expectation
[that pointer dereferences compile to single machine instructions ] is
incorrect. In C on the AS/400, *nothing* compiles to machine
instructions, single or otherwise. It compiles to a pseudoassembly
language called "MI".

In article <news:c0**********@newsg4.svr.pol.co.uk>
Malcolm <ma*****@55bank.freeserve.co.uk> writes:
This really is the exception that proves the point. A platform that
disallows native machine language programs cannot really be said to have a
compiler. Nor is C the ideal language for such an environment - you need
something which does memory management for you.


But if you believe that C on this machine is not "compiled", then
you must believe that *nothing* on the AS/400 is *ever* compiled --
not COBOL, not RPG, not Modula-2. Yet IBM will sell you "compilers"
for all of these, as well as for C and C++. There are even AS/400
assemblers that read "MI" source and produce "machine code":
<http://www-1.ibm.com/servers/eserver/iseries/whpapr/translator.html>.

Would you also claim that any machine on which the machine's "opcodes"
are interpreted by microcode has no compilers? If not, why do you
distinguish between OMI opcodes and microcoded-machine opcodes?
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (4039.22'N, 11150.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #33

P: n/a
"Malcolm" <ma*****@55bank.freeserve.co.uk> writes:
"Michael Wojcik" <mw*****@newsguy.com> wrote in message
C programmers working on the AS/400 will find that expectation
[that pointer dereferences compile to single machine instructions ] is
incorrect. In C on the AS/400, *nothing* compiles to machine
instructions, single or otherwise. It compiles to a pseudoassembly
language called "MI".

This really is the exception that proves the point. A platform that
disallows native machine language programs cannot really be said to have a
compiler. Nor is C the ideal language for such an environment - you need
something which does memory management for you.


Exceptions don't prove points, at least not in the sense you mean.

There are plenty of compilers that generate something other than
machine code. I'm not familiar with the AS/400, but I haven't seen
anything to suggest that C is a poor language for it.

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
Nov 14 '05 #34

P: n/a
Keith Thompson wrote:

"Malcolm" <ma*****@55bank.freeserve.co.uk> writes:
This really is the exception that proves the point.


If you ever see me using sophistry like that, here,
it will be the first of April.
Exceptions don't prove points, at least not in the sense you mean.


--
pete
Nov 14 '05 #35

P: n/a
In <ln************@nuthaus.mib.org> Keith Thompson <ks***@mib.org> writes:
machine code. I'm not familiar with the AS/400, but I haven't seen
anything to suggest that C is a poor language for it.


It depends on how you define the notion of poor language.

It is a fact that C is not the language of choice for the primary
application domain of this machine (small business server) and that very
little (if any) of the open source C code available on the Internet
has been ported to that platform (or written with portability to this
platform in mind).

It is possible to program in C on this machine, but apparently few of
those who did it actually enjoyed the experience. And this has precious
little to do with the unusual pointer size/representation.

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #36

P: n/a

"Chris Torek" <no****@torek.net> wrote in message

Would you also claim that any machine on which the machine's
"opcodes" are interpreted by microcode has no compilers? If not,
why do you distinguish between OMI opcodes and microcoded-
machine opcodes?

Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
explicit casts of void * etc. Would you describe such a program as a C
compiler? If not, why not?

Microcode creates a grey area. I would say that the difference is between an
intermediate bytecode that is designed for arbitrary hardware, and a program
which is hardware-specific, although it relies on some microcode to support
that hardware. Of course a really good optimising C compiler should build
and load microcode itself :-).

Ultimately it's just a question of definition - how far can we extend the
term "compiler" until we're talking about two totally different things? The
AS/400 almost certainly contains a substantial amount of C code which is
compiled to native machine code and runs the OS. How are we to distinguish
this compiler from the "compiler" shipped to customers?
Nov 14 '05 #37

P: n/a

"pete" <pf*****@mindspring.com> wrote in message
This really is the exception that proves the point.


If you ever see me using sophistry like that, here,
it will be the first of April.
Exceptions don't prove points, at least not in the sense you mean.

"The exception proves the rule" is a famous proverb. "Prove" means "tests",
not "demonstrates the point".

Now I claimed that not a single compiler, to my knowledge, implemented safe
pointers. An exception was raised. However on examination we see that the
"compiler" isn't really a compiler at all, if we define "compiler" as
"something that translates source code to machine code". So the exception
actually demonstrates that the point is valid.
Nov 14 '05 #38

P: n/a

In article <c0********@enews2.newsguy.com>, Chris Torek <no****@torek.net> writes:

But if you believe that C on this machine is not "compiled", then
you must believe that *nothing* on the AS/400 is *ever* compiled --
not COBOL, not RPG, not Modula-2. Yet IBM will sell you "compilers"
for all of these, as well as for C and C++.
Indeed, though I suppose we shouldn't in general allow IBM to define
"compiler" for us. Still, I think the consensus among AS/400
programmers is that we are, indeed, compiling our programs, and I
defy Malcolm to prove otherwise.
There are even AS/400
assemblers that read "MI" source and produce "machine code":
<http://www-1.ibm.com/servers/eserver/iseries/whpapr/translator.html>.


In fact, there used to be (and probably still is) a C API supplied by
IBM for this purpose; IIRC, it was just a function that took a
FILE* referring to a file open for writing and a string containing
MI source, assembled the latter, and wrote it into the former. Which
made the AS/400 the easiest machine I knew of to write an assembler
for...

(MI is a nicely CISCy pseudo-assembly, with opcodes like "translate
byte using table". Not as CISCy as VAX assembly, as I recall, but
pretty rich.)

--
Michael Wojcik mi************@microfocus.com

This record comes with a coupon that wins you a trip around the world.
-- Pizzicato Five
Nov 14 '05 #39

P: n/a

In article <c0**********@newsg4.svr.pol.co.uk>, "Malcolm" <ma*****@55bank.freeserve.co.uk> writes:

"Michael Wojcik" <mw*****@newsguy.com> wrote in message

C programmers working on the AS/400 will find that expectation
[that pointer dereferences compile to single machine instructions ] is
incorrect. In C on the AS/400, *nothing* compiles to machine
instructions, single or otherwise. It compiles to a pseudoassembly
language called "MI".
This really is the exception that proves the point.
That's not what that idiom means. "The exception proves the rule"
is a partial vernacular translation of a Latin legal principle which
means that when an exception is explicit in the law ("No parking
between 9AM and 5PM"), it implies a general rule where the exception
does not apply ("You may park between 5PM and 9AM").

In what logical system does the existence of an exception prove that
the general thesis is true? In fact, what we have here is an
exception which disproves the thesis. See [1].
A platform that
disallows native machine language programs cannot really be said to have a
compiler.
Oh yes it can. Observe: There are compiled languages on the AS/400.
Perhaps you need to review what a "compiler" is. Hint: it's not a
system for translating some source language into "native machine
language". That's why Java, for example, is still a compiled
language.

A compiler *compiles*. It collects multiple source statements and
processes them as a whole into some form more amenable for execution.
Contrast that with an interpreter, which is incremental - it processes
and executes one "statement" (however defined by the language) at a
time.

In any case, the C standard says nothing about compilation. There is
an implementation, which acts upon translation units. A program is
composed of one or more translation units, which undergo the various
translation stages specified by the standard.
Nor is C the ideal language for such an environment - you need
something which does memory management for you.


Really. Care to expand upon this rather bizarre thesis? In what
way do the characteristics of the AS/400 1) make C any less "ideal"
there than on any other platform, or 2) require automatic memory
management?
1. http://alt-usage-english.org/excerpts/fxtheexc.html

--
Michael Wojcik mi************@microfocus.com

Is it any wonder the world's gone insane, with information come to be
the only real medium of exchange? -- Thomas Pynchon
Nov 14 '05 #40

P: n/a
In article <c0**********@newsg3.svr.pol.co.uk>
Malcolm <ma*****@55bank.freeserve.co.uk> writes:
Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
explicit casts of void * etc. Would you describe such a program as a C
compiler? If not, why not?


Generally, I *would* call it a compiler (provided it produced an
executable image in the process, perhaps by later invoking the
"assembler" that translates the C++ to machine code). But if this
particular translator depended on the C++-to-machine-code step to
find certain fundamental errors, that is a -- perhaps even the only
-- condition under which I would not call it a compiler.

I am not sure I can define it very well, so consider the following
as an example, before I go on to an attempt at a definition:

% cat bug.c
int main(void] { return *42; }
% ctocxx -C bug.c

(Here, please assume the -C option means "leave the C++ `assembly'
visible for inspection", and that no diagnostics occur.)

% cat bug.c++
int main(] { return *42; }
%

This fails the "compiler" criterion by missing the obvious syntax
error ("]" should be ")") and semantic error (unary "*" cannot be
applied to an integer constant). (And of course, if main() were
to call itself recursively in the C version, the C++ code would
have to use some other function, or depend on that particular C++
implementation to allow recursive calls to main() -- either would
be acceptable, provided the "C compiler" comes *with* the C++
compiler portion. If the C compiler is meant to work with *any*
C++ compiler, depending on implementation-defined characteristics
would be at best a bug.)

The difference is basically one of responsibility: to be called a
"compiler", the program must make a complete syntactic and semantic
analysis of the source code, determine its "intended meaning" (or
one of several meanings, in cases where the source language has
various freedoms), and generate as its output code that is intended
to pass cleanly through any (required and/or supplied) intermediate
stages before it produces the final "executable". If something
fails to "assemble" without the "compiler" stage first pointing out
an error, this indicates a bug in the compiler.

A preprocessor, macro-processor, or textual-substitution system, on
the other hand, does not need to make complete analyses -- if the
input is erroneous, its output can be arbitrarily malformed without
this necessarily being a bug. Diagnostics from later passes are
acceptable and expected.

Of course, escape hatches (as commonly found in C compilers with
__asm__ keywords and the like) can muddy things up a bit. If you
use __asm__ to insert invalid assembly code, while the compiler
assumes that you know what you are doing, this is probably "your
fault". Likewise, a C-via-C++-to-executable compiler might provide
an escape hatch to "raw C++", and if you muck that up, it would be
your fault, rather than a compiler bug or disqualifier.

(Note that a clever implementor might even use the C++ stage to
find [some of the] required-diagnostic bugs in incorrect C code.
I consider this "OK" and "not a disqualifier" *if* the C compiler
actually reads and digests the C++ stage's diagnostics, and re-forms
them back to refer to the original C code, so that the process is
invisible to the C programmer.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (4039.22'N, 11150.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #41

P: n/a
On 13 Feb 2004 20:49:18 GMT, in comp.lang.c , mw*****@newsguy.com (Michael
Wojcik) wrote:

In article <c0**********@newsg4.svr.pol.co.uk>, "Malcolm" <ma*****@55bank.freeserve.co.uk> writes:

"Michael Wojcik" <mw*****@newsguy.com> wrote in message

C programmers working on the AS/400 will find that expectation
[that pointer dereferences compile to single machine instructions ] is
incorrect. In C on the AS/400, *nothing* compiles to machine
instructions, single or otherwise. It compiles to a pseudoassembly
language called "MI".

This really is the exception that proves the point.


That's not what that idiom means. "The exception proves the rule"
is a partial vernacular translation of a Latin legal principle


Possibly. It's probably more likely that the saying uses the alternate
meaning of "prove" which is "test". As in the "proof of the pudding is in
the eating".

--
Mark McIntyre
CLC FAQ <http://www.eskimo.com/~scs/C-faq/top.html>
CLC readme: <http://www.angelfire.com/ms3/bchambless0/welcome_to_clc.html>
Nov 14 '05 #42

P: n/a

On Fri, 13 Feb 2004, Malcolm wrote:

"Chris Torek" <no****@torek.net> wrote in message

Would you also claim that any machine on which the machine's
"opcodes" are interpreted by microcode has no compilers? If not,
why do you distinguish between OMI opcodes and microcoded-
machine opcodes?
Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
explicit casts of void * etc. Would you describe such a program as a C
compiler? If not, why not?


I would call it a "translator" -- a translator that translates C code
to C++ code. It would definitely be an "implementation" of C in the
sense the word is used in the Standard (which is a broad term
encompassing compilers, interpreters, translators, and whatever else
an implementor can think up).

[Oh yeah, and BTW, that's a pretty neat idea. An open-source C
munger would cut down on at least *one* of the perennial flamewar
topics in this newsgroup.]
Ultimately it's just a question of definition - how far can we extend
the term "compiler" until we're talking about two totally different
things?
How far can a dog run into the woods? Just call a thing a "compiler"
unless you think it's not -- and then don't. At root, my opinion would
simply be that all compilers are translators, and not necessarily vice
versa, and that the word "compiler" has taken on a special connotation
in the minds of those who market software, to the point where they think
it's something magic involving hardware and stuff. :)
The AS/400 almost certainly contains a substantial amount of C code
which is compiled to native machine code and runs the OS. How are we
to distinguish this compiler from the "compiler" shipped to customers?


Why do you think we need to distinguish between them? I think
we should call a spade a spade, and if someone's not sure it *is* a
spade, then we should just call it a digging implement that looks
very much *like* a spade, and move on to more C-related topics. ;-)

-Arthur
Nov 14 '05 #43

P: n/a
Chris Torek wrote:
In article <c0**********@newsg3.svr.pol.co.uk>
Malcolm <ma*****@55bank.freeserve.co.uk> writes:
Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds
explicit casts of void * etc. Would you describe such a program as a C
compiler? If not, why not?


Generally, I *would* call it a compiler (provided it produced an
executable image in the process, perhaps by later invoking the
"assembler" that translates the C++ to machine code).


Well, it's obviously your prerogative to use words as you choose, but your
proviso here flies in the face of Aho, Sethi and Ullman's definition: "a
compiler is a program that reads a program written in one language - the
source language - and translates it to an equivalent program in another
language - the target language" - no mention there of executable images.
Source: Dragon Book (Chapter 1, page 1!)

<snip>

--
Richard Heathfield : bi****@eton.powernet.co.uk
"Usenet is a strange place." - Dennis M Ritchie, 29 July 1999.
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
K&R answers, C books, etc: http://users.powernet.co.uk/eton
Nov 14 '05 #44

P: n/a
"Malcolm" <ma*****@55bank.freeserve.co.uk> writes:
[...]
Ultimately it's just a question of definition - how far can we extend the
term "compiler" until we're talking about two totally different things? The
AS/400 almost certainly contains a substantial amount of C code which is
compiled to native machine code and runs the OS. How are we to distinguish
this compiler from the "compiler" shipped to customers?


You're almost certain that the AS/400 OS is written in C? You may be
right, but my guess is that it's written in some other language(s).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
Nov 14 '05 #45

P: n/a

"Michael Wojcik" <mw*****@newsguy.com> wrote in message
This really is the exception that proves the point.
That's not what that idiom means. "The exception proves the rule"
is a partial vernacular translation of a Latin legal principle which
means that when an exception is explicit in the law ("No parking
between 9AM and 5PM"), it implies a general rule where the
exception does not apply ("You may park between 5PM and 9AM").

Etymology isn't meaning. The proverb is not used in that way. (By the way
the etymology itself is dodgy)
http://www.icsi.berkeley.edu/~nchang...exception.html
In what logical system does the existence of an exception prove that
the general thesis is true? In fact, what we have here is an
exception which disproves the thesis. See [1].
It's not something in formal logic, but a rule of thumb. To see if a rule
applies, look at cases that appear to be exceptions. For instance if I say
"All mammals are viviparous" then looking at chickens, or bats, which are
not exceptions to the rule, isn't helpful. However if we look at the
duck-billed platypus and echidna, which lay eggs, we find that they are
formally mammals, but they split off from the rest of the mammals a long
time ago. The rule is still useful - we won't find an oviparous antelope.
Let's take another example, "No mammals are eusocial." Well the naked mole
rat is eusocial, and wolves are a borderline case. There is nothing much
else in common between these animals, and they are not otherwise special. We
conclude that the rule isn't too useful - there's nothing special about
being a mammal that precludes eusociality.

End of logic 101, back to C.
A platform that disallows native machine language programs cannot
really be said to have a compiler.
Oh yes it can. Observe: There are compiled languages on the
AS/400.

It depends how you want to use the word. If anyone with a little bit of
computer knowledge asked "What's a compiler?" I would say "Something that
translates a high-level language to machine code."
In any case, the C standard says nothing about compilation.
You can build a C interpreter. What's your point, that mentioning "the
compiler" makes a post off-topic?
Nor is C the ideal language for such an environment - you need
something which does memory management for you.


Really. Care to expand upon this rather bizarre thesis? In what
way do the characteristics of the AS/400 1) make C any less "ideal"
there than on any other platform, or 2) require automatic memory
management?

Because C sacrifices safety in memory access for efficiency. Since the
platform won't allow this, the safety has to be put in at an inappropriate
level. So I would guess that when writing a function to iterate over a
string, the pointer is checked for out-of-bounds at every increment.
Certainly passing a pointer, if it contains validity information, will be
very slow.
If you do memory management at a higher level then you can have similar
safety, but raw pointers can be used internally (where the user code can't
mess with them).
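
To make the trade-off concrete, here is a minimal sketch in portable C. It assumes a made-up "fat" pointer (struct checked_ptr) and made-up helper names; it is not how the AS/400 actually represents or checks pointers, only an illustration of checking on every access versus checking once at a higher level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical "fat" pointer: an address plus the bounds it may touch.
   Illustration only; not the AS/400's real pointer format. */
struct checked_ptr {
    const char *base;  /* start of the object */
    size_t      len;   /* number of bytes in the object */
    size_t      off;   /* current offset from base */
};

/* Every read goes through a validity test. Returns 0 on success, -1 if
   the offset is outside the object. */
int checked_read(const struct checked_ptr *p, char *out)
{
    if (p->off >= p->len)
        return -1;
    *out = p->base[p->off];
    return 0;
}

/* Count occurrences of c, paying for the check at every step. */
size_t count_checked(struct checked_ptr p, char c)
{
    size_t n = 0;
    char ch;
    while (checked_read(&p, &ch) == 0) {
        if (ch == c)
            n++;
        p.off++;
    }
    return n;
}

/* Same job where the range is validated once on entry, and a raw
   pointer walks the buffer internally. */
size_t count_managed(const char *buf, size_t len, char c)
{
    assert(buf != NULL);              /* the one up-front validation */
    const char *end = buf + len;
    size_t n = 0;
    for (const char *p = buf; p != end; p++)
        if (*p == c)
            n++;
    return n;
}
```

Both functions give the same count for the same buffer. Whether the checked version is really much slower on a given machine is a guess on my part, as stated above.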

There ceases to be a point in using C on the AS/400, except that C is a very
popular language, and there is always a point in supporting a standard. A
bit like driving a sports car over a traffic-calmed road - it can't go very
fast and a hatchback would make more sense, but if you own a sports car
already then you might want to do it.
Nov 14 '05 #46

P: n/a
Michael Wojcik <mw*****@newsguy.com> scribbled the following:
A compiler *compiles*. It collects multiple source statements and
processes them as a whole into some form more amenable for execution.
Contrast that with an interpreter, which is incremental - it processes
and executes one "statement" (however defined by the language) at a
time.


Could the distinction between a compiler and an interpreter be that when
they encounter program code, compilers translate it into another
language, while interpreters execute it? In other words, more or less,
compilers store away code for later execution while interpreters execute
it when they see it?
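
A rough sketch of that distinction, assuming a toy postfix language of single digits, '+' and '*'. The function names and the PUSH/ADD/MUL target text are invented for the example: interpret() executes the program when it sees it, while translate() executes nothing and only stores away an equivalent program in another language.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>

/* Interpreter: executes the postfix program as it reads it and returns
   the result. */
int interpret(const char *src)
{
    int stack[32];
    int sp = 0;

    for (; *src; src++) {
        if (isdigit((unsigned char)*src)) {
            assert(sp < 32);
            stack[sp++] = *src - '0';
        } else if (*src == '+' || *src == '*') {
            assert(sp >= 2);
            int b = stack[--sp];
            int a = stack[--sp];
            stack[sp++] = (*src == '+') ? a + b : a * b;
        }
        /* spaces and anything unrecognised are skipped */
    }
    assert(sp == 1);
    return stack[0];
}

/* Compiler: translates the same program into text for a made-up stack
   machine. Returns the number of characters written, or -1 if out is
   too small. */
int translate(const char *src, char *out, size_t outsize)
{
    size_t used = 0;

    if (outsize == 0)
        return -1;
    out[0] = '\0';
    for (; *src; src++) {
        int n;
        if (isdigit((unsigned char)*src))
            n = snprintf(out + used, outsize - used, "PUSH %c\n", *src);
        else if (*src == '+')
            n = snprintf(out + used, outsize - used, "ADD\n");
        else if (*src == '*')
            n = snprintf(out + used, outsize - used, "MUL\n");
        else
            continue;
        if (n < 0 || (size_t)n >= outsize - used)
            return -1;
        used += (size_t)n;
    }
    return (int)used;
}
```

For "3 4 + 2 *", interpret() returns 14 straight away, whereas translate() only fills the buffer with five lines (PUSH 3, PUSH 4, ADD, PUSH 2, MUL) for something else to execute later.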

--
/-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\
\-- http://www.helsinki.fi/~palaste --------------------- rules! --------/
"All that flower power is no match for my glower power!"
- Montgomery Burns
Nov 14 '05 #47

P: n/a
nrk
Richard Heathfield wrote:
Chris Torek wrote:
In article <c0**********@newsg3.svr.pol.co.uk>
Malcolm <ma*****@55bank.freeserve.co.uk> writes:
Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes,
adds explicit casts of void * etc. Would you describe such a program as a
C compiler? If not, why not?


Generally, I *would* call it a compiler (provided it produced an
executable image in the process, perhaps by later invoking the
"assembler" that translates the C++ to machine code).


Well, it's obviously your prerogative to use words as you choose, but your
proviso here flies in the face of Aho, Sethi and Ullman's definition: "a
compiler is a program that reads a program written in one language - the
source language - and translates it to an equivalent program in another
language - the target language" - no mention there of executable images.
Source: Dragon Book (Chapter 1, page 1!)

<snip>


From "Advanced Compiler Design and Implementation" by Steven S. Muchnick:

<quote>
Strictly speaking, compilers are software systems that translate programs
written in higher-level languages into equivalent programs in object code
or machine language for execution on a computer.
....
The definition can be widened to include systems that translate from one
higher-level language to an intermediate-level form, etc.
</quote>

One might argue that an author/book cannot serve as an authoritative
definition of a term, but considering the widespread use and popularity of
the book, I would tend to take this to be an appropriate definition.

-nrk.

--
Remove devnull for email
Nov 14 '05 #48

P: n/a
nrk wrote:

Richard Heathfield wrote:
Chris Torek wrote:
In article <c0**********@newsg3.svr.pol.co.uk>
Malcolm <ma*****@55bank.freeserve.co.uk> writes:
Let's say someone produces a tool that converts C code to compliant C++
code - e.g. alters C++ keywords used as identifiers, adds prototypes,
adds explicit casts of void * etc. Would you describe such a program as a
C compiler? If not, why not?

Generally, I *would* call it a compiler (provided it produced an
executable image in the process, perhaps by later invoking the
"assembler" that translates the C++ to machine code).


Well, it's obviously your prerogative to use words as you choose, but your
proviso here flies in the face of Aho, Sethi and Ullman's definition: "a
compiler is a program that reads a program written in one language - the
source language - and translates it to an equivalent program in another
language - the target language" - no mention there of executable images.
Source: Dragon Book (Chapter 1, page 1!)

<snip>


From "Advanced Compiler Design and Implementation" by Steven S. Muchnick:

<quote>
Strictly speaking, compilers are software systems that translate programs
written in higher-level languages into equivalent programs in object code
or machine language for execution on a computer.
...
The definition can be widened to include systems that translate from one
higher-level language to an intermediate-level form, etc.
</quote>

One might argue that an author/book cannot serve as an authoritative
definition of a term, but considering the widespread use and popularity of
the book, I would tend to take this to be an appropriate definition.

-nrk.

--
Remove devnull for email


Muchnick's book's version is problematic. The Aho, Sethi and Ullman version
is a much more disciplined definition.

Object code itself is a "language".

There may be, and usually are, several stages to producing executables from
source code - compilation, assembly (which is just a specialization of
compilation), linking, (possibly) locating and loading/storage. That some of
these stages are hidden behind one command (or a button on an IDE) is a
matter of packaging, not of much else.

--
Les Cargill
Nov 14 '05 #49

P: n/a
nrk <ra*********@devnull.verizon.net> writes:
[...]
From "Advanced Compiler Design and Implementation" by Steven S. Muchnick:

<quote>
Strictly speaking, compilers are software systems that translate programs
written in higher-level languages into equivalent programs in object code
or machine language for execution on a computer.
...
The definition can be widened to include systems that translate from one
higher-level language to an intermediate-level form, etc.
</quote>

One might argue that an author/book cannot serve as an authoritative
definition of a term, but considering the widespread use and popularity of
the book, I would tend to take this to be an appropriate definition.


Doesn't IEEE have an official dictionary of computer terms? Can
someone who has a copy look up "compiler"?

For what it's worth, the first compiler I used (UCSD Pascal) generated
a pseudo-code (P-code) which was then interpreted; nobody ever called
it a translator rather than a compiler. (Later, one company started
making chips that executed P-code in hardware, or at least in
microcode.)
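
To illustrate the general idea (this is a toy, not real UCSD P-code; the opcodes and names below are invented), the compiler's output is a sequence of instructions for an imaginary stack machine, and a small loop like this one interprets it:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical opcodes for a toy stack machine; not real UCSD P-code. */
enum opcode { OP_PUSH, OP_ADD, OP_MUL, OP_HALT };

struct instr {
    enum opcode op;
    int         arg;   /* used by OP_PUSH only */
};

/* Interpreter loop for the toy P-code: fetch, decode, execute.
   Returns the value on top of the stack when OP_HALT is reached. */
int run_pcode(const struct instr *code)
{
    int stack[32];
    int sp = 0;
    size_t pc = 0;

    for (;;) {
        struct instr in = code[pc++];
        switch (in.op) {
        case OP_PUSH:
            assert(sp < 32);
            stack[sp++] = in.arg;
            break;
        case OP_ADD:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] += stack[sp];
            break;
        case OP_MUL:
            assert(sp >= 2);
            sp--;
            stack[sp - 1] *= stack[sp];
            break;
        case OP_HALT:
            assert(sp >= 1);
            return stack[sp - 1];
        default:
            assert(!"unknown opcode");
            return 0;
        }
    }
}
```

A program for (3 + 4) * 2 would be the array PUSH 3, PUSH 4, ADD, PUSH 2, MUL, HALT, and run_pcode() returns 14 for it. The same instruction stream could in principle be executed by hardware or microcode instead of this loop, which is the point of the chip anecdote.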

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://www.sdsc.edu/~kst>
Schroedinger does Shakespeare: "To be *and* not to be"
Nov 14 '05 #50
