
# size of a sizeof(pointer)

 P: n/a What is the size of a pointer? Suppose I write: datatype *ptr; sizeof(ptr); What will this sizeof(ptr) give? Will it give the size of the data the pointer is pointing to? If not, can you give a counterexample? Basically, I want to know what the size of a pointer means. As you know, sizeof(int)=4; sizeof(char)= 2; but what does sizeof(ptr) mean?? Can anybody explain? Nov 14 '05 #1
79 Replies

 P: n/a On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote: what is the size of a pointer? suppose i am writing, datatype *ptr; sizeof(ptr); now what does this sizeof(ptr) will give? will it give the size of the data the pointer is pointing to? if no, can you give an counter example? basically , i want to know what is the meaning of size of a ponter. as you know sizeof(int)=4; Maybe. It must be >= 2. sizeof(char)= 2; sizeof(char) is, by definition, 1. but what does sizeof(ptr) means?? It's the amount of space the pointer itself takes up. Not the data pointed to, but the pointer itself. Often, it's == sizeof(int). Josh Nov 14 '05 #2

 P: n/a "syntax" wrote in message what is the size of a pointer? A pointer is a variable that holds an address. The size of a pointer is the size of this address. For instance, most computers have an address space of 4GB. 32 bits allows you 4GB, so the size of a pointer will be 32 bits, or 4 chars (a char is usually 8 bits). On some microcomputers the address space is only 64K, so 16-bit pointers are used. datatype *ptr; sizeof(ptr); now what will this sizeof(ptr) give? will it give the size of the data the pointer is pointing to? No, it gives the size of the pointer, probably 4. if not, can you give a counterexample? One confusing thing about C is that arrays and pointers have array/pointer equivalence. char string[32]; printf("sizeof string %d\n", (int) sizeof(string)); will give you 32. char *string = malloc(32); printf("sizeof string %d\n", (int) sizeof(string)); will give you the size of a pointer on your system, probably 4. basically, i want to know what is the meaning of the size of a pointer. as you know sizeof(int)=4; sizeof(char)= 2; sizeof(char) is always 1, one of the little quirks of the C language. sizeof(int) is very commonly 4, but it can be any size. It is meant to be the natural size for the machine to use, which means the width of a register. For technical reasons pointers are usually the same size as ints, but again they can be any size. but what does sizeof(ptr) mean?? Nov 14 '05 #3

 P: n/a Josh Sebastian wrote: On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote: as you know sizeof(int)=4; Maybe. It must be >= 2. Wrong. It must, however, be an exact multiple of 1. sizeof(char)= 2; sizeof(char) is, by definition, 1. Right. but what does sizeof(ptr) means?? It's the amount of space the pointer itself takes up. Not the data pointed to, but the pointer itself. Often, it's == sizeof(int). But, of course, it doesn't have to be (as you know). -- Richard Heathfield : bi****@eton.powernet.co.uk "Usenet is a strange place." - Dennis M Ritchie, 29 July 1999. C FAQ: http://www.eskimo.com/~scs/C-faq/top.html K&R answers, C books, etc: http://users.powernet.co.uk/eton Nov 14 '05 #4

 P: n/a On Sun, 08 Feb 2004 19:58:20 +0000, Richard Heathfield wrote: Josh Sebastian wrote: On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote: as you know sizeof(int)=4; Maybe. It must be >= 2. Wrong. It must, however, be an exact multiple of 1. Jeez... yeah, thanks. Nov 14 '05 #5

 P: n/a Josh Sebastian writes: On Sun, 08 Feb 2004 11:37:15 -0800, syntax wrote: [...] but what does sizeof(ptr) means?? It's the amount of space the pointer itself takes up. Not the data pointed to, but the pointer itself. Often, it's == sizeof(int). It's true that the size of a pointer is often equal to sizeof(int), but it's dangerous (and unnecessary) to assume that it always is. -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> Schroedinger does Shakespeare: "To be *and* not to be" Nov 14 '05 #6

 P: n/a Richard Heathfield wrote: Josh Sebastian wrote: syntax wrote: sizeof(int)=4; Maybe. It must be >= 2. Wrong. It must, however, be an exact multiple of 1. An implementation cannot have 16-bit chars and 24-bit ints? How about 16-bit chars and 24-bit pointers? Nov 14 '05 #7

 P: n/a Grumble wrote: Richard Heathfield wrote: Josh Sebastian wrote: syntax wrote: sizeof(int)=4; Maybe. It must be >= 2. Wrong. It must, however, be an exact multiple of 1. It must be greater than 1, on hosted implementations. An implementation cannot have 16-bit chars and 24-bit ints? The sum of the numbers of padding bits, value bits and the sign bit, is a multiple of CHAR_BIT. How about 16-bit chars and 24-bit pointers? The bit representation of pointers is not specified. -- pete Nov 14 '05 #8

 P: n/a On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete wrote: Grumble wrote: Richard Heathfield wrote: > Josh Sebastian wrote: > >> syntax wrote: >> >>> sizeof(int)=4; >> >> Maybe. It must be >= 2. > > Wrong. It must, however, be an exact multiple of 1.It must be greater than 1, on hosted implementations. Not if a char were 16 bits wide. -- Mark McIntyre CLC FAQ CLC readme: ----== Posted via Newsfeed.Com - Unlimited-Uncensored-Secure Usenet News==---- http://www.newsfeed.com The #1 Newsgroup Service in the World! >100,000 Newsgroups ---= 19 East/West-Coast Specialized Servers - Total Privacy via Encryption =--- Nov 14 '05 #9

 P: n/a pete wrote: Grumble wrote: Richard Heathfield wrote: Josh Sebastian wrote:> syntax wrote:>>> sizeof(int)=4;>> Maybe. It must be >= 2. Wrong. It must, however, be an exact multiple of 1. It must be greater than 1, on hosted implementations. Chapter and verse, please. Of course, it's exceedingly awkward for a hosted implementation to have sizeof(int)==1, but it isn't illegal. An implementation cannot have 16-bit chars and 24-bit ints? The sum of the numbers of padding bits, value bits and the sign bit, is a multiple of CHAR_BIT. How about 16-bit chars and 24-bit pointers? The bit representation of pointers is not specified. Even so, all types have sizes measurable in whole chars; look up the definition of sizeof. Richard Nov 14 '05 #10

 P: n/a "Mike Wahler" wrote in news:qJ*******************@newsread1.news.pas.eart hlink.net: > to, but the pointer itself. Often, it's == sizeof(int). It's true that the size of a pointer is often equal to sizeof(int), but it's dangerous (an unnecessary) to assume that it always is. Or for that matter, to assume that all pointer types have the same size. Indeed. For example, Keil C51 has 1 byte, 2 byte, and 3 byte pointer sizes depending upon which memory space the pointer points to. Nov 14 '05 #11

 P: n/a Richard Bos wrote: Of course, it's exceedingly awkward for a hosted implementation to have sizeof(int)==1 [...] Is it awkward because getc() can return either a char or EOF? Nov 14 '05 #12

 P: n/a Grumble wrote: Richard Bos wrote: Of course, it's exceedingly awkward for a hosted implementation to have sizeof(int)==1 [...] Is it awkward because getc() can return either a char or EOF? That, and related problems, yes. If you need to take these legal-but-unlikely implementations into account (i.e., if you really want to be as anal-retentive about ISO-conformance as your common huff-throwing newbie (and uncommon troll) makes us out to be), you need to check for feof() and ferror() after every read operation, instead of simply for EOF. Personally, I never do. Richard Nov 14 '05 #13

 P: n/a On Mon, 9 Feb 2004, Richard Bos wrote: pete wrote: Grumble wrote: Richard Heathfield wrote: > Josh Sebastian wrote: >> [sizeof(int)] must be >= 2. > > Wrong. It must, however, be an exact multiple of 1. It must be greater than 1, on hosted implementations. Chapter and verse, please. and subsequent posts. This should be a FAQ. -Arthur Nov 14 '05 #14

 P: n/a Mark McIntyre wrote: On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete wrote: Grumble wrote: Richard Heathfield wrote: Josh Sebastian wrote: > syntax wrote: >>> sizeof(int)=4; >> Maybe. It must be >= 2. > Wrong. It must, however, be an exact multiple of 1. It must be greater than 1, on hosted implementations. Not if a char were 16 bits wide. Is there any live implementation that uses 16-bit chars?? (I know of the existence of a machine whose byte is 6 bits) -- #include <stdio.h> #define p(s) printf(#s" endian") int main(void){int v=1;*(char*)&v?p(Little):p(Big);return 0;} Giannis Papadopoulos http://dop.users.uth.gr/ University of Thessaly Computer & Communications Engineering dept. Nov 14 '05 #15

 P: n/a On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis wrote: Is there any live implementation that uses 16-bit chars?? (I know of the existence of a machine whose byte is 6 bits) Unicode springs to mind. I suspect that quite a few DSPs do, tho typically they're freestanding implementations. That aside, I'd be unsurprised to see future implementations using 16 bits for chars. -- Mark McIntyre CLC FAQ CLC readme: Nov 14 '05 #16

 P: n/a "Grumble" wrote in message An implementation cannot have 16-bit chars and 24-bit ints? How about 16-bit chars and 24-bit pointers? Not allowed. chars and bytes, or to be pedantic unsigned chars and bytes, are the same thing in C. An unfortunate hangover from the early days. All types have to be a whole multiple of char. Nov 14 '05 #17

 P: n/a "CBFalconer" wrote in message For instance, most computers have an address space of 4GB. 32 bits allows you 4GB, so the size of a pointer will be 32 bits, or 4 (char is usually 8 bits). On some microcomputers the address space is only 64K, so 16-bit pointers are used. Nope. A pointer points. What information it needs to hold to do that is up to the implementation. It could consist of a URL and other information, just as a not too wild example. Another might be "Malcolms house, under the bed beside the dirty socks, last Tuesday". The amount of information needed is usually constrained by limiting the things that the pointer is allowed to point to. Clear now? Don't patronise. You and I both know that perverse implementations are allowed. Since pointers have to be a fixed size then using a URL would be grossly inefficient. Since the OP needs to understand how pointers are represented in memory on a typical system such as the one he will certainly be using, telling him that 32 bit pointers are needed to address 4GB gets across the message clearly. Talk about URL pointers is liable to confuse. You should neither know nor care, unless you are implementing the system. Well you very often need to break the bounds of ANSI C and go to a lower level. An example would be if you have a custom memory scheme. How do you know if a pointer comes from your arena or from elsewhere? Another example would be using a debugger. Invalid pointers are often set to some defined bit pattern. You need to know something about addressing to detect these bad pointers. Programming is practical. It doesn't make sense to hand someone a copy of the standard and expect them to be able to write fully-conforming ANSI C. You need to play with a real implementation on a real machine to have any hope of understanding what is going on. Nov 14 '05 #18

 P: n/a On Sun, 8 Feb 2004 22:04:50 -0000, "Malcolm" wrote: "Keith Thompson" wrote in message No, there is no array/pointer equivalence (or rather, "equivalence" is a misleading term for what's really going on). Array names are implicitly converted to pointer values in many contexts. See the C FAQ at , particularly section 6, particularly question 6.3. Exactly. "Equivalence" is the accepted term for what is going on, which is confusing. I've never heard the term before starting to read this newsgroup. I've always called it "array/pointer duality" -leor Leor Zolman BD Software le**@bdsoft.com www.bdsoft.com -- On-Site Training in C/C++, Java, Perl & Unix C++ users: Download BD Software's free STL Error Message Decryptor at www.bdsoft.com/tools/stlfilt.html Nov 14 '05 #19

 P: n/a Mark McIntyre wrote: On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis wrote: That aside, I'd be unsurprised to see future implementations using 16 bits for chars. If we use 16-bit values as char, then the new C0x spec must define something like "byte" (java's char is unicode and it has an 8-bit byte type).. There is of course wchar_t, so there is definitely no need for 16-bit chars.. Or so I think... Comments? -- #include <stdio.h> #define p(s) printf(#s" endian") int main(void){int v=1;*(char*)&v?p(Little):p(Big);return 0;} Giannis Papadopoulos http://dop.users.uth.gr/ University of Thessaly Computer & Communications Engineering dept. Nov 14 '05 #20

 P: n/a Papadopoulos Giannis writes: Mark McIntyre wrote: On Mon, 09 Feb 2004 19:28:21 +0200, in comp.lang.c , Papadopoulos Giannis wrote: That aside, I'd be unsurprised to see future implementations using 16 bits for chars. If we use 16-bit values as char, then the new C0x spec must define something like "byte" (java's char is unicode and it has an 8-bit byte type).. There is of course wchar_t so there is definately no need for 16bit chars.. Or so I think... Comments? I think C will always define a char as being one byte (sizeof(char)==1). There's too much code that would break if that were changed. The process that led to the 1989 ANSI standard was probably the last real opportunity to change this. I'd greatly prefer the concepts of "character" and "uniquely addressable storage unit" to be separate, but it's too late to fix it. It just might be possible to deprecate the use of the word "byte" (which is part of the description of the language, not part of the language itself) while continuing to guarantee that sizeof(char)==1, but I doubt that even that will be done. -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> Schroedinger does Shakespeare: "To be *and* not to be" Nov 14 '05 #21

 P: n/a Mark McIntyre wrote: On Mon, 09 Feb 2004 12:40:21 GMT, in comp.lang.c , pete wrote:Grumble wrote: Richard Heathfield wrote: > Josh Sebastian wrote: > >> syntax wrote: >> >>> sizeof(int)=4; >> >> Maybe. It must be >= 2. > > Wrong. It must, however, be an exact multiple of 1.It must be greater than 1, on hosted implementations. Not if a char were 16 bits wide. You can't implement the whole standard library, if sizeof(int) is one. putchar(EOF) has to be able to return EOF converted to an unsigned char value, converted back to a nonnegative int. http://groups.google.com/groups?selm...andrew.cmu.edu -- pete Nov 14 '05 #22

 P: n/a [snips] On Tue, 10 Feb 2004 06:23:04 +0000, Mike Wahler wrote: Since pointers have to be a fixed size C & V please. then using a URL would be grossly inefficient. Since the OP needs to understand how pointers are represented in memory That's platform/implementation dependent. I've always favored SQL queries. Store all the values in a database and the pointers are all just queries to retrieve them. telling him that 32 bit pointers are needed to address 4GB gets across the message clearly. That's one of many possible ways to represent such an address space. Anyone who ever used older DOS compilers will appreciate the clarity of not assuming pointers make any sort of inherent sense. :) Nov 14 '05 #24

 P: n/a "Mike Wahler" wrote: "Malcolm" wrote in message news:c0**********@newsg1.svr.pol.co.uk... Programming is practical. The subject of clc is not programming. Well, yes, it is. Where Malcolm goes wrong is in believing that locking yourself into the Wintel platform is part of that practicality. Richard Nov 14 '05 #25

 P: n/a "Richard Bos" wrote in message Well, yes, it is. Where Malcolm goes wrong is in believing that locking yourself into the Wintel platform is part of that practicality. So you think that Wintel is the only platform that uses 32-bit pointers to address a 4GB memory space? Nov 14 '05 #27

 P: n/a On Tue, 10 Feb 2004 21:08:17 -0000, in comp.lang.c , "Malcolm" wrote: "Mike Wahler" wrote in message > Since pointers have to be a fixed size C & V please. Uggle *ptr = 0; Uggle **uptr = malloc(sizeof(Uggle *)); *uptr = ptr; *uptr now must be set to NULL. How is this achieved if an Uggle * is of variable width? Mike meant that different types' pointers might be different widths. Thus an Uggle** might be wider (or narrower) than an Uggle*, which might in turn be wider (or narrower) than an int*. -- Mark McIntyre CLC FAQ CLC readme: Nov 14 '05 #28

 P: n/a In article , "Malcolm" writes: In fact a non-perverse use of pointers would be to store the bounds of the data item pointed to in every pointer. Then an attempt to address memeory illegally could be caught. To my knowledge not a single implemetation actually uses safe pointers. Your knowledge is incomplete. At least three C implementations for the AS/400 - EPM C, System C, and ILE C - use 16-byte / 128-bit pointers (CHAR_BIT is 8) which are not simple addresses but descriptors, and which include a reference to a memory space, an offset in that memory space, and a validity flag which can only be set by a privileged-mode instruction. Mucking about with a pointer's internals resets the flag, rendering the pointer invalid. All three implementations will immediately trap on invalid pointer access. I believe ILE C (the current one) is a fully conforming C94 hosted implementation, and System C was a fully conforming C90 hosted implementation. I suspect EPM C wasn't a conforming hosted implementation, though it probably came fairly close, and may have been a conforming freestanding implementation. The reason of course is that C programmers expect pointer dereferences to compile to single machine instructions - something again not mentioned in the standard but highly relevant to anyone who programs in C. C programmers working on the AS/400 will find that expectation is incorrect. In C on the AS/400, *nothing* compiles to machine instructions, single or otherwise. It compiles to a pseudoassembly language called "MI". And that's a good thing, for AS/400 software, since it's one of the qualities that allowed IBM to completely change the machine's architecture without breaking working programs. (That's *binaries*, with no recompilation required, in many cases.) On the AS/400, robustness trumps performance. That was the design decision for the whole architecture, and C needed to fall in line. One of the nice things about the C standard was that it could accomodate that. 
More C programmers should do some work on the AS/400. (For one thing, it'd make them appreciate their other development environments all the more, if they use IBM's awful Program Development Manager and Source Entry Utility.) You can learn a lot about what a conforming hosted implementation can do. And if you're using a real 5250 terminal, you can also learn those swell trigraph sequences (or the EBCDIC code points for various C punctuation characters). -- Michael Wojcik mi************@microfocus.com Pseudoscientific Nonsense Quote o' the Day: From the scientific standpoint, until these energies are directly sensed by the evolving perceptions of the individual, via the right brain, inner-conscious, intuitive faculties, scientists will never grasp the true workings of the universe's ubiquitous computer system. -- Noel Huntley Nov 14 '05 #31

 P: n/a "Michael Wojcik" wrote in message C programmers working on the AS/400 will find that expectation [that pointer dereferences compile to single machine instructions] is incorrect. In C on the AS/400, *nothing* compiles to machine instructions, single or otherwise. It compiles to a pseudoassembly language called "MI". This really is the exception that proves the point. A platform that disallows native machine language programs cannot really be said to have a compiler. Nor is C the ideal language for such an environment - you need something which does memory management for you. Nov 14 '05 #32

 P: n/a >"Michael Wojcik" wrote in message C programmers working on the AS/400 will find that expectation [that pointer dereferences compile to single machine instructions] is incorrect. In C on the AS/400, *nothing* compiles to machine instructions, single or otherwise. It compiles to a pseudoassembly language called "MI". In article Malcolm writes: This really is the exception that proves the point. A platform that disallows native machine language programs cannot really be said to have a compiler. Nor is C the ideal language for such an environment - you need something which does memory management for you. But if you believe that C on this machine is not "compiled", then you must believe that *nothing* on the AS/400 is *ever* compiled -- not COBOL, not RPG, not Modula-2. Yet IBM will sell you "compilers" for all of these, as well as for C and C++. There are even AS/400 assemblers that read "MI" source and produce "machine code": . Would you also claim that any machine on which the machine's "opcodes" are interpreted by microcode has no compilers? If not, why do you distinguish between OMI opcodes and microcoded-machine opcodes? -- In-Real-Life: Chris Torek, Wind River Systems Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603 email: forget about it http://web.torek.net/torek/index.html Reading email is like searching for food in the garbage, thanks to spammers. Nov 14 '05 #33

 P: n/a "Malcolm" writes: "Michael Wojcik" wrote in message C programmers working on the AS/400 will find that expectation[that pointer dereferences compile to single machine instructions ] is incorrect. In C on the AS/400, *nothing* compiles to machine instructions, single or otherwise. It compiles to a pseudoassembly language called "MI". This really is the exception that proves the point. A platform that disallows native machine langauge programs cannot really be said to have a compiler. Nor is C the ideal language for such an environment - you need something which does memory management for you. Exceptions don't prove points, as least not in the sense you mean. There are plenty of compilers that generate something other than machine code. I'm not familiar with the AS/400, but I haven't seen anything to suggest that C is a poor language for it. -- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> Schroedinger does Shakespeare: "To be *and* not to be" Nov 14 '05 #34

 P: n/a Keith Thompson wrote: "Malcolm" writes: This really is the exception that proves the point. If you ever see me using sophistry like that, here, it will be the first of April. Exceptions don't prove points, as least not in the sense you mean. -- pete Nov 14 '05 #35

 P: n/a In Keith Thompson writes: machine code. I'm not familiar with the AS/400, but I haven't seenanything to suggest that C is a poor language for it. It depends on how you define the notion of poor language. It is a fact that C is not the language of choice for the primary application domain of this machine (small business server) and that very little (if any) of the open source C code available on the Internet has been ported to that platform (or written with portability to this platform in mind). It is possible to program in C on this machine, but apparently few of those who did it actually enjoyed the experience. And this has precious little to do with the unusual pointer size/representation. Dan -- Dan Pop DESY Zeuthen, RZ group Email: Da*****@ifh.de Nov 14 '05 #36

 P: n/a "Chris Torek" wrote in message Would you also claim that any machine on which the machine's "opcodes" are interpreted by microcode has no compilers? If not, why do you distinguish between OMI opcodes and microcoded- machine opcodes? Let's say someone produces a tool that converts C code to compliant C++ code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds explicit casts of void * etc. Would you describe such a program as a C compiler? If not, why not? Microcode creates a grey area. I would say that the difference is between an intermediate bytecode that is designed for arbitrary hardware, and a program which is hardware-specific, although it relies on some microcode to support that hardware. Of course a really good optimising C compiler should build and load microcode itself :-). Ultimately it's just a question of definition - how far can we extend the term "compiler" until we're talking about two totally different things? The AS/400 almost certainly contains a substantial amount of C code which is compiled to native machine code and runs the OS. How are we to distinguish this compiler from the "compiler" shipped to customers? Nov 14 '05 #37

 P: n/a "pete" wrote in message This really is the exception that proves the point. If you ever see me using sophistry like that, here, it will be the first of April. Exceptions don't prove points, as least not in the sense you mean. "The exception proves the rule" is a famous proverb. "Prove" means "tests", not "demonstrates the point". Now I claimed that not a single compiler, to my knowledge, implemented safe pointers. An exception was raised. However on examination we see that the "compiler" isn't really a compiler at all, if we define "compiler" as "something that translates source code to machine code". So the exception actually demonstrates that the point is valid. Nov 14 '05 #38

 P: n/a In article , Chris Torek writes: But if you believe that C on this machine is not "compiled", then you must believe that *nothing* on the AS/400 is *ever* compiled -- not COBOL, not RPG, not Modula-2. Yet IBM will sell you "compilers" for all of these, as well as for C and C++. Indeed, though I suppose we shouldn't in general allow IBM to define "compiler" for us. Still, I think the consensus among AS/400 programmers is that we are, indeed, compiling our programs, and I defy Malcolm to prove otherwise. There are even AS/400 assemblers that read "MI" source and produces "machine code": . In fact, there used to be (and probably still is) a C API supplied by IBM for this purpose; IIRC, it was just a function that took a FILE* referring to a file open for writing and a string containing MI source, assembled the latter, and wrote it into the former. Which made the AS/400 the easiest machine I knew of to write an assembler for... (MI is a nicely CISCy pseudo-assembly, with opcodes like "translate byte using table". Not as CISCy as VAX assembly, as I recall, but pretty rich.) -- Michael Wojcik mi************@microfocus.com This record comes with a coupon that wins you a trip around the world. -- Pizzicato Five Nov 14 '05 #39

 P: n/a In article , "Malcolm" writes: "Michael Wojcik" wrote in message C programmers working on the AS/400 will find that expectation[that pointer dereferences compile to single machine instructions ] is incorrect. In C on the AS/400, *nothing* compiles to machine instructions, single or otherwise. It compiles to a pseudoassembly language called "MI". This really is the exception that proves the point. That's not what that idiom means. "The exception proves the rule" is a partial vernacular translation of a Latin legal principle which means that when an exception is explicit in the law ("No parking between 9AM and 5PM"), it implies a general rule where the exception does not apply ("You may park between 5PM and 9AM"). In what logical system does the existence of an exception prove that the general thesis is true? In fact, what we have here is an exception which disproves the thesis. See [1]. A platform that disallows native machine langauge programs cannot really be said to have a compiler. Oh yes it can. Observe: There are compiled languages on the AS/400. Perhaps you need to review what a "compiler" is. Hint: it's not a system for translating some source language into "native machine language". That's why Java, for example, is still a compiled language. A compiler *compiles*. It collects multiple source statements and processes them as a whole into some form more amenable for execution. Contrast that with an interpreter, which is incremental - it processes and executes one "statement" (however defined by the language) at a time. In any case, the C standard says nothing about compilation. There is an implementation, which acts upon translation units. A program is composed of one or more translation units, which undergo the various translation stages specified by the standard. Nor is C the ideal language for such an environment - you need something which does memory management for you. Really. Care to expand upon this rather bizarre thesis? 
In what way do the characteristics of the AS/400 1) make C any less "ideal" there than on any other platform, or 2) require automatic memory management? 1. http://alt-usage-english.org/excerpts/fxtheexc.html -- Michael Wojcik mi************@microfocus.com Is it any wonder the world's gone insane, with information come to be the only real medium of exchange? -- Thomas Pynchon Nov 14 '05 #40

 P: n/a In article Malcolm writes:Let's say someone produces a tool that converts C code to compliant C++code - e.g. alters C++ keywords used as identifiers, adds prototypes, addsexplicit casts of void * etc. Would you describe such a program as a Ccompiler? If not, why not? Generally, I *would* call it a compiler (provided it produced an executable image in the process, perhaps by later invoking the "assembler" that translates the C++ to machine code). But if this particular translator depended on the C++-to-machine-code step to find certain fundamental errors, that is a -- perhaps even the only -- condition under which I would not call it a compiler. I am not sure I can define it very well, so consider the following as an example, before I go on to an attempt at a definition: % cat bug.c int main(void] { return *42; } % ctocxx -C bug.c (Here, please assume the -C option means "leave the C++ `assembly' visible for inspection, and that no diagnostics occur.) % cat bug.c++ int main(] { return *42; } % This fails the "compiler" criterion by missing the obvious syntax error ("]" should be "}") and semantic error (unary "*" cannot be applied to an integer constant). (And of course, if main() were to call itself recursively in the C version, the C++ code would have to use some other function, or depend on that particular C++ implementation to allow recursive calls to main() -- either would be acceptable, provided the "C compiler" comes *with* the C++ compiler portion. If the C compiler is meant to work with *any* C++ compiler, depending on implementation-defined characteristics would be at best a bug.) 
The difference is basically one of responsibility: to be called a "compiler", the program must make a complete syntactic and semantic analysis of the source code, determine its "intended meaning" (or one of several meanings, in cases where the source language has various freedoms), and generate as its output code that is intended to pass cleanly through any (required and/or supplied) intermediate stages before it produces the final "executable". If something fails to "assemble" without the "compiler" stage first pointing out an error, this indicates a bug in the compiler. A preprocessor, macro-processor, or textual-substitution system, on the other hand, does not need to make complete analyses -- if the input is erroneous, its output can be arbitrarily malformed without this necessarily being a bug. Diagnostics from later passes are acceptable and expected. Of course, escape hatches (as commonly found in C compilers with __asm__ keywords and the like) can muddy things up a bit. If you use __asm__ to insert invalid assembly code, while the compiler assumes that you know what you are doing, this is probably "your fault". Likewise, a C-via-C++-to-executable compiler might provide an escape hatch to "raw C++", and if you muck that up, it would be your fault, rather than a compiler bug or disqualifier. (Note that a clever implementor might even use the C++ stage to find [some of the] required-diagnostic bugs in incorrect C code. I consider this "OK" and "not a disqualifier" *if* the C compiler actually reads and digests the C++ stage's diagnostics, and re-forms them back to refer to the original C code, so that the process is invisible to the C programmer.) -- In-Real-Life: Chris Torek, Wind River Systems Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603 email: forget about it http://web.torek.net/torek/index.html Reading email is like searching for food in the garbage, thanks to spammers. Nov 14 '05 #41

 P: n/a On 13 Feb 2004 20:49:18 GMT, in comp.lang.c, mw*****@newsguy.com (Michael Wojcik) wrote:
> In article, "Malcolm" writes:
> > "Michael Wojcik" wrote in message
> > > C programmers working on the AS/400 will find that expectation [that pointer dereferences compile to single machine instructions] is incorrect. In C on the AS/400, *nothing* compiles to machine instructions, single or otherwise. It compiles to a pseudo-assembly language called "MI".
> > This really is the exception that proves the point.
> That's not what that idiom means. "The exception proves the rule" is a partial vernacular translation of a Latin legal principle

Possibly. It's probably more likely that the saying uses the alternate meaning of "prove", which is "test" -- as in "the proof of the pudding is in the eating".

-- Mark McIntyre CLC FAQ CLC readme: Nov 14 '05 #42

 P: n/a On Fri, 13 Feb 2004, Malcolm wrote:
> "Chris Torek" wrote in message
> > Would you also claim that any machine on which the machine's "opcodes" are interpreted by microcode has no compilers? If not, why do you distinguish between OMI opcodes and microcoded-machine opcodes?
> Let's say someone produces a tool that converts C code to compliant C++ code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds explicit casts of void * etc. Would you describe such a program as a C compiler? If not, why not?

I would call it a "translator" -- a translator that translates C code to C++ code. It would definitely be an "implementation" of C in the sense the word is used in the Standard (which is a broad term encompassing compilers, interpreters, translators, and whatever else an implementor can think up).

[Oh yeah, and BTW, that's a pretty neat idea. An open-source C munger would cut down on at least *one* of the perennial flamewar topics in this newsgroup.]

> Ultimately it's just a question of definition - how far can we extend the term "compiler" until we're talking about two totally different things?

How far can a dog run into the woods? Just call a thing a "compiler" unless you think it's not -- and then don't. At root, my opinion would simply be that all compilers are translators, and not necessarily vice versa, and that the word "compiler" has taken on a special connotation in the minds of those who market software, to the point where they think it's something magic involving hardware and stuff. :)

> The AS/400 almost certainly contains a substantial amount of C code which is compiled to native machine code and runs the OS. How are we to distinguish this compiler from the "compiler" shipped to customers?

Why do you think we need to distinguish between them? I think we should call a spade a spade, and if someone's not sure it *is* a spade, then we should just call it a digging implement that looks very much *like* a spade, and move on to more C-related topics.

;-) -Arthur Nov 14 '05 #43

 P: n/a Chris Torek wrote:
> In article, Malcolm writes:
> > Let's say someone produces a tool that converts C code to compliant C++ code - e.g. alters C++ keywords used as identifiers, adds prototypes, adds explicit casts of void * etc. Would you describe such a program as a C compiler? If not, why not?
> Generally, I *would* call it a compiler (provided it produced an executable image in the process, perhaps by later invoking the "assembler" that translates the C++ to machine code).

Well, it's obviously your prerogative to use words as you choose, but your proviso here flies in the face of Aho, Sethi and Ullman's definition: "a compiler is a program that reads a program written in one language - the source language - and translates it to an equivalent program in another language - the target language" - no mention there of executable images.

Source: Dragon Book (Chapter 1, page 1!)

-- Richard Heathfield : bi****@eton.powernet.co.uk "Usenet is a strange place." - Dennis M Ritchie, 29 July 1999. C FAQ: http://www.eskimo.com/~scs/C-faq/top.html K&R answers, C books, etc: http://users.powernet.co.uk/eton Nov 14 '05 #44

 P: n/a "Malcolm" writes: [...]
> Ultimately it's just a question of definition - how far can we extend the term "compiler" until we're talking about two totally different things? The AS/400 almost certainly contains a substantial amount of C code which is compiled to native machine code and runs the OS. How are we to distinguish this compiler from the "compiler" shipped to customers?

You're almost certain that the AS/400 OS is written in C? You may be right, but my guess is that it's written in some other language(s).

-- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> Schroedinger does Shakespeare: "To be *and* not to be" Nov 14 '05 #45

 P: n/a "Michael Wojcik" wrote in message
> > This really is the exception that proves the point.
> That's not what that idiom means. "The exception proves the rule" is a partial vernacular translation of a Latin legal principle which means that when an exception is explicit in the law ("No parking between 9AM and 5PM"), it implies a general rule where the exception does not apply ("You may park between 5PM and 9AM").

Etymology isn't meaning. The proverb is not used in that way. (By the way, the etymology itself is dodgy: http://www.icsi.berkeley.edu/~nchang...exception.html)

> In what logical system does the existence of an exception prove that the general thesis is true? In fact, what we have here is an exception which disproves the thesis. See [1].

It's not something in formal logic, but a rule of thumb. To see if a rule applies, look at cases that appear to be exceptions. For instance, if I say "All mammals are viviparous", then looking at chickens, or bats, which are not exceptions to the rule, isn't helpful. However, if we look at the duck-billed platypus and echidna, which lay eggs, we find that they are formally mammals, but they split off from the rest of the mammals a long time ago. The rule is still useful - we won't find an oviparous antelope.

Let's take another example: "No mammals are eusocial." Well, the naked mole rat is eusocial, and wolves are a borderline case. There is nothing much else in common between these animals, and they are not otherwise special. We conclude that the rule isn't too useful - there's nothing special about being a mammal that precludes eusociality. End of logic 101, back to C.

> > A platform that disallows native machine language programs cannot really be said to have a compiler.
> Oh yes it can. Observe: There are compiled languages on the AS/400.

It depends how you want to use the word. If anyone with a little bit of computer knowledge asked "What's a compiler?" I would say "Something that translates a high-level language to machine code."

> In any case, the C standard says nothing about compilation. You can build a C interpreter.

What's your point, that mentioning "the compiler" makes a post off-topic?

> > Nor is C the ideal language for such an environment - you need something which does memory management for you.
> Really. Care to expand upon this rather bizarre thesis? In what way do the characteristics of the AS/400 1) make C any less "ideal" there than on any other platform, or 2) require automatic memory management?

Because C sacrifices safety in memory access for efficiency. Since the platform won't allow this, the safety has to be put in at an inappropriate level. So I would guess that when writing a function to iterate over a string, the pointer is checked for out-of-bounds at every increment. Certainly passing a pointer, if it contains validity information, will be very slow. If you do memory management at a higher level then you can have similar safety, but raw pointers can be used internally (where the user code can't mess with them).

There ceases to be a point in using C on the AS/400, except that C is a very popular language, and there is always a point in supporting a standard. A bit like driving a sports car over a traffic-calmed road - it can't go very fast and a hatchback would make more sense, but if you own a sports car already you might want to do it. Nov 14 '05 #46

 P: n/a Michael Wojcik scribbled the following:
> A compiler *compiles*. It collects multiple source statements and processes them as a whole into some form more amenable for execution. Contrast that with an interpreter, which is incremental - it processes and executes one "statement" (however defined by the language) at a time.

Could the distinction between a compiler and an interpreter be that when they encounter program code, compilers translate it into another language, while interpreters execute it? In other words, more or less, compilers store away code for later execution while interpreters execute it when they see it?

-- /-- Joona Palaste (pa*****@cc.helsinki.fi) ------------- Finland --------\ \-- http://www.helsinki.fi/~palaste --------------------- rules! --------/ "All that flower power is no match for my glower power!" - Montgomery Burns Nov 14 '05 #47

 P: n/a Richard Heathfield wrote:
> Chris Torek wrote:
> > Generally, I *would* call it a compiler (provided it produced an executable image in the process, perhaps by later invoking the "assembler" that translates the C++ to machine code).
> Well, it's obviously your prerogative to use words as you choose, but your proviso here flies in the face of Aho, Sethi and Ullman's definition: "a compiler is a program that reads a program written in one language - the source language - and translates it to an equivalent program in another language - the target language" - no mention there of executable images. Source: Dragon Book (Chapter 1, page 1!)

From "Advanced Compiler Design and Implementation" by Steven S. Muchnick:

    Strictly speaking, compilers are software systems that translate programs written in higher-level languages into equivalent programs in object code or machine language for execution on a computer. ... The definition can be widened to include systems that translate from one higher-level language to an intermediate-level form, etc.

One might argue that an author/book cannot serve as an authoritative definition of a term, but considering the widespread use and popularity of the book, I would tend to take this to be an appropriate definition.

-nrk.

-- Remove devnull for email Nov 14 '05 #48

 P: n/a nrk wrote:
> From "Advanced Compiler Design and Implementation" by Steven S. Muchnick:
>
>     Strictly speaking, compilers are software systems that translate programs written in higher-level languages into equivalent programs in object code or machine language for execution on a computer. ... The definition can be widened to include systems that translate from one higher-level language to an intermediate-level form, etc.
>
> One might argue that an author/book cannot serve as an authoritative definition of a term, but considering the widespread use and popularity of the book, I would tend to take this to be an appropriate definition.

Muchnick's book's version is problematic. The Aho, Sethi and Ullman version is a much more disciplined definition. Object code itself is a "language". There may be, and usually are, several stages to producing executables from source code - compilation, assembly (which is just a specialization of compilation), linking, (possibly) locating, and loading/storage. That some of these stages are hidden behind one command (or a button on an IDE) is a matter of packaging, not of much else.

-- Les Cargill Nov 14 '05 #49

 P: n/a nrk writes: [...]
> From "Advanced Compiler Design and Implementation" by Steven S. Muchnick:
>     Strictly speaking, compilers are software systems that translate programs written in higher-level languages into equivalent programs in object code or machine language for execution on a computer. ... The definition can be widened to include systems that translate from one higher-level language to an intermediate-level form, etc.
> One might argue that an author/book cannot serve as an authoritative definition of a term, but considering the widespread use and popularity of the book, I would tend to take this to be an appropriate definition.

Doesn't IEEE have an official dictionary of computer terms? Can someone who has a copy look up "compiler"?

For what it's worth, the first compiler I used (UCSD Pascal) generated a pseudo-code (P-code) which was then interpreted; nobody ever called it a translator rather than a compiler. (Later, one company started making chips that executed P-code in hardware, or at least in microcode.)

-- Keith Thompson (The_Other_Keith) ks***@mib.org San Diego Supercomputer Center <*> Schroedinger does Shakespeare: "To be *and* not to be" Nov 14 '05 #50
