[I am going to snip all of the background.]
In article <11*********************@c13g2000cwb.googlegroups.com>
Old Wolf <ol*****@inspire.net.nz> wrote:
>... I'm not even sure what he is trying to increment. ...
Almost invariably, when people try to write these weird things with
casts, what they *really* have is a buffer, with a pointer --
usually of type "char *" or "unsigned char *", but sometimes some
other type -- pointing into the buffer, and they are doing the
equivalent of marshalling or unmarshalling parameters for a remote
procedure call. (This also covers writing data structures to
files.)
For instance:
unsigned char buf[SOME_SIZE];
unsigned char *cp = &buf[0];
...
(code that sets up various values of various types)
At this point, the user wants to copy the bitwise representation of
each value into the buffer. A clear, simple, and correct way to
do this is:
memcpy(cp, &var1, sizeof var1);
cp += sizeof var1;
memcpy(cp, &var2, sizeof var2);
cp += sizeof var2;
memcpy(cp, &var3, sizeof var3);
cp += sizeof var3;
/* and so on */
Note that this works no matter what the types of the various "var"s
are, as long as they all fit into the provided buffer.
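For concreteness, here is a minimal sketch of the memcpy() approach wrapped up as functions. The names put_bytes() and pack3(), the "end" pointer, and the use of assert() for the "does it fit" check are all inventions for this illustration, not anything from the original code:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: append the n bytes at src to the buffer at cp
 * and return the advanced pointer.  "end" points one past the last
 * byte of the buffer, so the "fits in the buffer" condition is checked. */
unsigned char *put_bytes(unsigned char *cp, const unsigned char *end,
                         const void *src, size_t n)
{
    assert(cp <= end && n <= (size_t)(end - cp)); /* value must fit */
    memcpy(cp, src, n);   /* byte copy: no alignment requirement on cp */
    return cp + n;
}

/* Marshal a short, a double, and an int back to back (illustrative
 * types only); returns the updated cp. */
unsigned char *pack3(unsigned char *cp, const unsigned char *end,
                     short var1, double var2, int var3)
{
    cp = put_bytes(cp, end, &var1, sizeof var1);
    cp = put_bytes(cp, end, &var2, sizeof var2);
    cp = put_bytes(cp, end, &var3, sizeof var3);
    return cp;
}
```

A caller would write something like "cp = pack3(buf, buf + sizeof buf, v1, v2, v3);". Note that the bytes are in the host's own representation, so this only round-trips between machines that agree on sizes and byte order.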
What they write instead, thinking this is somehow "better", is:
*((short *)cp)++ = var1; /* assuming var1 is a short */
*((double *)cp)++ = var2; /* assuming var2 is a double */
*((int *)cp)++ = var3; /* assuming var3 is an int */
Of course, this fails to compile cleanly in any correct C compiler,
because the result of a cast is not an lvalue. Having converted
"cp" from "unsigned char *" to "short *", it *might* be OK to do:
*(short *)cp = var1;
but they are also trying to "postfix ++ update" cp by sizeof(short)
in the same C expression, "because it is so much more efficient
that way".
Of course, it might *not* be OK, and it often is no more efficient
after all. If we write it as valid (but maybe-not-OK) C code:
*(short *)cp = var1;
cp += sizeof(short);
*(double *)cp = var2;
cp += sizeof(double);
*(int *)cp = var3;
cp += sizeof(int);
the potential problem is reasonably apparent: there is no guarantee
that "cp" is properly aligned, and it is actually quite likely that
*(double *)cp, at least, is NOT properly aligned. (This code will
usually fail on a SPARC or MIPS machine, for instance. There is a
small chance -- 1 out of 8 if "buf" is aligned randomly; in practice,
much less than 1 out of 8 -- that it will work.)
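To see the misalignment without actually dereferencing the suspect pointer, one can compute how far an address is from a multiple of the type's alignment. This is only a sketch: misalignment() is a made-up name, it assumes a C11 compiler (for _Alignof) and an implementation that provides uintptr_t, and the pointer-to-integer conversion is implementation-defined, although on the usual flat-address machines it yields the address:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 0 if p is suitably aligned for an
 * object whose alignment requirement is "align" bytes, and otherwise
 * the number of bytes by which p overshoots the previous aligned
 * address.  Assumes the uintptr_t value is the machine address. */
size_t misalignment(const void *p, size_t align)
{
    return (size_t)((uintptr_t)p % align);
}
```

On a machine with a 2-byte short and an 8-byte alignment requirement for double (typical, but an assumption), "misalignment(cp, _Alignof(double))" after the first "cp += sizeof(short)" gives 2 whenever "buf" itself starts on an 8-byte boundary, which is exactly the case in which the *(double *)cp store traps on the SPARC.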
Perhaps amusingly, even if you do manage to "trick" the machine
into generating appropriate (for that machine) code to store the
three values given here, and update "cp" by the appropriate size,
all in one line, the resulting machine code (on the SPARC or MIPS
at least, and typically on the i386 as well) can be much worse
than if you just use the separate "cp +=" lines:
unsigned char *f2(unsigned char *cp) {
*(*(short **)&cp)++ = var1;
*(*(double **)&cp)++ = var2;
*(*(int **)&cp)++ = var3;
return cp;
}
Many C compilers will be unable to put "cp" in a register here,
so that you would get SPARC code of the form:
st %i0, [%fp - 4] ! copy "cp" parameter to memory
sethi %hi(var1), %g1 ! copy var1 to %g1 reg
ldh [%g1 + %lo(var1)], %g1
ld [%fp - 4], %l0 ! get *(short **)&cp
sth %g1, [%l0] ! store var1 at *(that)
add %l0, 2, %l0 ! increment by 2
st %l0, [%fp - 4] ! and update *(short **)&cp
sethi %hi(var2), %g1 ! copy var2 to %o2 (and %o3)
ldd [%g1 + %lo(var2)], %o2
ld [%fp - 4], %l0 ! get *(double **)&cp
std %o2, [%l0] ! store var2 at *(that)
add %l0, 8, %l0 ! increment by 8
st %l0, [%fp - 4] ! and update *(double **)&cp
[repeat for var3, using "ld/st" instead of ldh/sth / ldd/std]
ld [%fp - 4], %i0 ! set return value
/* actually we could probably do this whole thing as a leaf */
ret; restore
Using the simpler (but two lines per store) version:
unsigned char *f3(unsigned char *cp) {
*(short *)cp = var1;
cp += sizeof(short);
*(double *)cp = var2;
cp += sizeof(double);
*(int *)cp = var3;
cp += sizeof(int);
return cp;
}
will definitely give you a leaf routine, and considerably shorter
machine code, even if the compiler is not all that clever:
sethi %hi(var1), %g1 ! copy var1 to %g1
ldh [%g1 + %lo(var1)], %g1
sth %g1, [%o0] ! store at *(short *)cp
add %o0, 2, %o0 ! cp += 2
sethi %hi(var2), %g1 ! copy var2 to %o2 (and %o3)
ldd [%g1 + %lo(var2)], %o2
std %o2, [%o0] ! store at *(double *)cp
add %o0, 8, %o0 ! cp += 8
[repeat for var3]
/* delay slot will probably be filled by last "add", actually */
retl; nop
Of course, even this shorter version still generally fails at runtime,
because "*(type *)cp" is not aligned properly for the given type.
The memcpy() version always works. Therefore, people do not write
that one. :-)
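The unmarshalling direction can be handled the same way. Again this is a sketch under assumptions: get_bytes() and unpack3() are hypothetical names, the three types are just the ones used in the examples above, and the caller is assumed to have checked that enough bytes remain in the buffer:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy n bytes from the buffer at cp into the
 * object at dst and return the advanced pointer.  The caller must
 * make sure at least n bytes remain at cp. */
const unsigned char *get_bytes(const unsigned char *cp, void *dst, size_t n)
{
    memcpy(dst, cp, n);   /* works whatever the alignment of cp */
    return cp + n;
}

/* Unmarshal a short, a double, and an int, in the same order in which
 * they were written; returns the updated cp. */
const unsigned char *unpack3(const unsigned char *cp,
                             short *var1, double *var2, int *var3)
{
    cp = get_bytes(cp, var1, sizeof *var1);
    cp = get_bytes(cp, var2, sizeof *var2);
    cp = get_bytes(cp, var3, sizeof *var3);
    return cp;
}
```

As with the writing side, the destination objects are properly aligned because they are ordinary variables; only the buffer position is arbitrary, and memcpy() does not care.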
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it
http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.