In article <news:cg******* ***@news-int.gatech.edu> <ur**@ur8x.com> wrote:
> Ok, here is what I want to know: What exactly happens when
> p[i] is called, as far as accessing and dereferencing, that makes
> the code wrong (yes, I know it should not work, I just want
> to know why).

In some cases, a picture is worth a thousand words. (Be sure to
view this in a fixed-width font.)

    void f(void) {
        char a[6] = { '1', '2', '3' };
        char *p;
        ...
    }

    +-----------------------------------+
    | '1' | '2' | '3' |  0  |  0  |  0  |
    +-----------------------------------+

    +-------------------+          /------------->
    | <garbage address> |---------/
    +-------------------+

The larger box represents "a", which is made up of six bytes (each
char in C is a "C byte", always). The six bytes have known values
because we initialized "a".
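
To make the first picture concrete, here is a small sketch that checks
what the initializer leaves in "a". The function name is made up for
illustration; the point is only that the three elements with no
initializer of their own come out as zero.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the original post: counts the
 * zero bytes in an array set up exactly like "a" in f(). */
size_t count_zero_bytes(void)
{
    char a[6] = { '1', '2', '3' };  /* three initializers, six elements */
    size_t i, zeros = 0;

    assert(sizeof a == 6);          /* six chars, so six C bytes */
    for (i = 0; i < sizeof a; i++)
        if (a[i] == 0)
            zeros++;
    return zeros;                   /* a[3], a[4] and a[5] */
}
```

Called from main(), count_zero_bytes() should report the three
trailing zeros drawn in the box.
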
The smaller box represents p, the pointer. We did not initialize
it, so (assuming these are inside a function, as in the example
code) it is full of trash. If viewed as a pointer, the result is
unpredictable -- in this case I have drawn it as a "wild pointer"
pointing off into the weeds somewhere.
Now, if we set p to point to the first element of "a":

    p = &a[0];

we get a new picture:

    +-----------------------------------+
    | '1' | '2' | '3' |  0  |  0  |  0  |
    +-----------------------------------+
       ^
       |
       +--------------------+
                            |
    +-------------------+   |
    | <valid address>  -|---+
    +-------------------+

Now p contains an arrow pointing to &a[0].
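
One way to convince yourself that the arrow really lands on a[0] is
to store something through p and then look at the array directly.
This is only a sketch, and the function name is invented for the
purpose:

```c
#include <assert.h>

/* Hypothetical helper, not from the original post: stores through
 * p after pointing it at a[0], then reads the array directly. */
char store_through_pointer(char c)
{
    char a[6] = { '1', '2', '3' };
    char *p;            /* indeterminate: do not use it yet */

    p = &a[0];          /* p now holds a valid address */
    assert(*p == '1');  /* following the arrow reaches a[0] */
    *p = c;             /* writing through p changes a[0] itself */
    return a[0];
}
```

Whatever char you pass in comes back out of a[0], because p and
a[0] name the same byte.
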
When you write a[i], the compiler says to itself: "aha, `a', that
is declared as an array, and you want to do something with the
`value' of `a' -- index it like an array, in this case -- so I will
construct a pointer pointing to &a[0] and use that."
This special rule about arrays is a quirk of C. Many other languages
are very different in their treatment of arrays. There is no
fundamental reason the C language *has* to work this way; it just
does. That means that you simply have to memorize this rule. It
is a thing you have to know about C that has no reason other than
"the guy who wrote the language decided to do it that way" -- rather
like the syntax for declarations.
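
The rule can be written down as a handful of checks. In this sketch
"constructed" is my own name for the pointer the compiler makes on
your behalf, and the function name is likewise invented. The last
check shows one of the places where "a" is *not* being used for its
value (as the operand of sizeof), so no pointer is constructed there.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not from the original post: returns 1 once
 * every check on the array-to-pointer rule has passed. */
int array_rule_holds(void)
{
    char a[6] = { '1', '2', '3' };
    char *constructed = a;      /* the "value" of a: a pointer to a[0] */
    size_t i;

    assert(constructed == &a[0]);
    for (i = 0; i < sizeof a; i++) {
        assert(&a[i] == constructed + i);   /* same element address */
        assert(a[i] == *(a + i));           /* a[i] means *(a + i) */
    }
    assert(sizeof a == 6);      /* sizeof sees the whole array */
    return 1;
}
```
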
On the other hand, when you write p[i], the compiler takes the
pointer value p already has -- here, pointing to &a[0] -- and
follows the arrow and then "moves right" according to the number
in "i". Moreover, if you have the variable "p", you can set it
to point to some place other than &a[0]:

    p = &a[2];

makes p point to the '3', and p[1] is the first 0 (or '\0' -- same
thing) byte, while p[-2] and p[-1] now exist, naming the '1' and
'2' in a[0] and a[1] respectively. This is because the compiler
generates code that follows the arrow and then "moves right" as
requested, and you have already moved right -- which lets you move
left again, if you want to.
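
Here is the same thing as code, as a sketch with an invented function
name. The range check is my addition: moving left or right is only
valid while you stay inside the array.

```c
#include <assert.h>

/* Hypothetical helper, not from the original post: reads p[i]
 * with p pointing at a[2], the '3'. */
char read_relative(int i)
{
    char a[6] = { '1', '2', '3' };
    char *p = &a[2];            /* start at the '3' */

    assert(i >= -2 && i <= 3);  /* p[-2] is a[0], p[3] is a[5] */
    return p[i];                /* follow the arrow, then move by i */
}
```

With this, read_relative(0) is the '3', read_relative(1) is the
first 0 byte, and read_relative(-1) and read_relative(-2) are the
'2' and the '1'.
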
The difference between using a pointer ("p") and using the array
name ("a"), then, is that when you use the array name, the compiler
has to take an extra step to *construct* the pointer it needs, just
so that it can then follow the pointer. Curiously, this extra work
*can* (not necessarily "does", just "can") result in faster machine
code. The reason is that the compiler is allowed to know a lot
more about the pointer it constructed here, *because it constructed
it*. It is not some unknown pointer taken in off the street, with
a mysterious and shady background. The constructed pointer has a
solid pedigree. Of course, given a local variable like "p", a
smart compiler can probably look around and figure out whether "p"
has a similar pedigree -- so on *good* compilers, there tends to
be little if any performance difference. On not-so-good compilers,
it is hard to tell which will be faster -- the array, because the
compiler knows about the pointer it makes, or the pointer, because
the compiler does not have to do the extra "make a pointer" step.
Or perhaps neither will be faster there, either.
The moral of the "performance story" above, as it were, is: use
whichever one is clearer to the human programmer. On a good compiler
it will make no real difference, and on a bad one, you cannot predict
what kind of difference it will make.
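
As an illustration (the array and function names here are made up),
these two loops compute the same sum, one by indexing the array and
one by walking a pointer. Nothing here measures speed; pick whichever
reads better to you.

```c
#include <stddef.h>

/* Hypothetical data, laid out like "a" in the example above. */
static const char digits[6] = { '1', '2', '3' };

/* Index the array by name. */
int sum_with_array(void)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < sizeof digits; i++)
        sum += digits[i];
    return sum;
}

/* Walk a pointer from the first element to one past the last. */
int sum_with_pointer(void)
{
    const char *p = digits;
    const char *end = digits + sizeof digits;
    int sum = 0;

    while (p < end)
        sum += *p++;
    return sum;
}
```
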
For more on The Rule about arrays and pointers in C, see also
<http://web.torek.net/torek/c/pa.html>.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it
http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.