
Making Fatal Hidden Assumptions

We often find hidden, and totally unnecessary, assumptions being
made in code. The following leans heavily on one particular
example, which happens to be in C. However similar things can (and
do) occur in any language.

These assumptions are generally made because of familiarity with
the language. As a non-code example, consider the idea that the
faulty code is written by blackguards bent on fouling the
language. The term blackguards is not in favor these days, and for
good reason. However, the older you are, the more likely you are
to have used it since childhood, and to use it again, barring
specific thought on the subject. The same type of thing applies to
writing code.

I hope, with this little monograph, to encourage people to examine
some hidden assumptions they are making in their code. As ever, in
dealing with C, the reference standard is the ISO C standard.
Versions can be found in text and PDF format by searching for N869
and N1124. [1] The latter does not have a text version, but is
more up to date.

We will always have innocent-appearing code with these kinds of
assumptions built in. However, it would be wise to annotate such
code to make the assumptions explicit, which can avoid a great deal
of agony when the code is reused under other systems.
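
For instance, many such assumptions can be made explicit with
compile-time checks. A sketch of mine, not from the original post,
using C89-compatible devices since static assertions came later:

#include <limits.h>

#if CHAR_BIT != 8
#error "this code assumes CHAR_BIT == 8"
#endif

/* A C89-style compile-time assertion: the array size becomes
   negative, and the translation unit fails to compile, whenever
   the assumption does not hold. */
typedef char assert_int_is_4_bytes[(sizeof (int) == 4) ? 1 : -1];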

In the following example, the code is as downloaded from the
referenced URL, and the comments are entirely mine, including the
'every 5' line-number references.

/* Making fatal hidden assumptions */
/* Paul Hsieh's version of strlen.
http://www.azillionmonkeys.com/qed/asmexample.html

Some sneaky hidden assumptions here:
1. p = s - 1 is valid. Not guaranteed. Careless coding.
2. cast (int) p is meaningful. Not guaranteed.
3. Use of 2's complement arithmetic.
4. ints have no trap representations or hidden bits.
5. 4 == sizeof(int) && 8 == CHAR_BIT.
6. size_t is actually int.
7. sizeof(int) is a power of 2.
8. int alignment depends on a zeroed bit field.

Since strlen is normally supplied by the system, the system
designer can guarantee all but item 1. Otherwise this is
not portable. Item 1 can probably be beaten by suitable
code reorganization to avoid the initial p = s - 1. This
is a serious bug which, for example, can cause segfaults
on many systems. It is most likely to foul up when (int)s
has the value 0 and the cast is meaningful.

He fails to make the valid assumption: 1 == sizeof(char).
*/

#define hasNulByte(x) ((x - 0x01010101) & ~x & 0x80808080)
#define SW (sizeof (int) / sizeof (char))

int xstrlen (const char *s) {
   const char *p;                      /* 5 */
   int d;

   p = s - 1;
   do {
      p++;                             /* 10 */
      if ((((int) p) & (SW - 1)) == 0) {
         do {
            d = *((int *) p);
            p += SW;
         } while (!hasNulByte (d));    /* 15 */
         p -= SW;
      }
   } while (*p != 0);
   return p - s;
}                                      /* 20 */

Let us start with line 1! The constants appear to require that
sizeof(int) be 4, and that CHAR_BIT be precisely 8. I haven't
really looked too closely, and it is possible that the ~x term
allows for larger sizeof(int), but nothing allows for larger
CHAR_BIT. A further hidden assumption is that there are no trap
values in the representation of an int. Its functioning is
doubtful when sizeof(int) is less than 4. At the least it will
force promotion to long, which will seriously affect the speed.

This is an ingenious and speedy way of detecting a zero byte within
an int, provided the preconditions are met. There is nothing wrong
with it, PROVIDED we know when it is valid.
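
To make the trick concrete, here is a small demonstration of my
own, valid only under the stated preconditions (32-bit int,
CHAR_BIT == 8, 2's complement, no trap representations):

#include <stdio.h>

#define hasNulByte(x) ((x - 0x01010101) & ~x & 0x80808080)

int main (void) {
   /* For each byte b, (b - 0x01) wraps to 0xFF only when b == 0;
      '& ~x' discards bytes whose high bit was already set;
      '& 0x80808080' keeps just the per-byte flag bits. */
   printf ("%#x\n", (unsigned) hasNulByte (0x41424344)); /* "ABCD": 0 */
   printf ("%#x\n", (unsigned) hasNulByte (0x41420044)); /* prints 0x8000 */
   return 0;
}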

In line 2 we have the confusing use of sizeof(char), which is 1 by
definition. This just serves to obscure the fact that SW is
actually sizeof(int) later. No hidden assumptions have been made
here, but the usage helps to conceal later assumptions.

Line 4. Since this is intended to replace the system's strlen()
function, it would seem advantageous to use the appropriate
signature for the function. In particular strlen returns a size_t,
not an int. size_t is always unsigned.
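
For comparison, a fully portable byte-at-a-time version with the
correct signature might look like this (my sketch, not part of the
original code):

#include <stddef.h> /* size_t */

/* No hidden assumptions: correct signature, no alignment tricks.
   Valid on any conforming implementation, merely slower. */
size_t ystrlen (const char *s) {
   const char *p = s;

   while (*p != '\0')
      p++;
   return (size_t) (p - s);
}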

In line 8 we come to a biggie. The standard specifically does not
guarantee the action of a pointer below an object. The only real
purpose of this statement is to compensate for the initial
increment in line 10. This can be avoided by rearrangement of the
code, which will then let the routine function where the
assumptions are valid. This is the only real error in the code
that I see.
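
One possible rearrangement is sketched below. It is mine, not
Hsieh's; it removes only the s - 1 bug, and all the other hidden
assumptions remain. It uses the hasNulByte and SW macros from
above, and never forms a pointer below s:

int xstrlen2 (const char *s) {
   const char *p = s;
   int d;

   for (;;) {
      if ((((int) p) & (SW - 1)) == 0) {
         do {
            d = *((int *) p);
            p += SW;
         } while (!hasNulByte (d));
         p -= SW;
      }
      if (*p == 0)
         break;
      p++;
   }
   return p - s;
}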

In line 11 we have several hidden assumptions. The first is that
the cast of a pointer to an int is valid. This is never
guaranteed. A pointer can be much larger than an int, and may have
all sorts of non-integer like information embedded, such as segment
id. If sizeof(int) is less than 4 the validity of this is even
less likely.

Then we come to the purpose of the statement, which is to discover
if the pointer is suitably aligned for an int. It does this by
bit-anding with SW-1, which is the concealed sizeof(int)-1. This
won't be very useful if sizeof(int) is, say, 3 or any other
non-power-of-two. In addition, it assumes that an aligned pointer
will have those bits zero. While this last is very likely in
today's systems, it is still an assumption. The system designer is
entitled to assume this, but user code is not.
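
Where C99's <stdint.h> is available, uintptr_t (itself an optional
type in the standard) at least makes the pointer-to-integer
conversion well defined. A sketch of my own; the low-bits-zero
assumption remains:

#include <stdint.h> /* uintptr_t: optional, even in C99 */

/* Still assumes sizeof(int) is a power of 2 and that int-aligned
   addresses have their low bits clear; only the conversion itself
   is now well defined. */
int is_int_aligned (const char *p) {
   return (((uintptr_t) p) & (sizeof (int) - 1)) == 0;
}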

Line 13 again uses the unwarranted cast of a pointer to an int.
This enables the use of the already suspicious macro hasNulByte in
line 15.

If all these assumptions are correct, line 19 finally calculates a
pointer difference (which is valid, and of the signed type
ptrdiff_t; here it is non-negative and will always fit into a
size_t). It then does a concealed conversion of this into an int,
which could cause undefined or implementation-defined behaviour if
the value exceeds what will fit into an int. This one is also
unnecessary, since it is trivial to define the return type as
size_t and guarantee success.

I haven't even mentioned the assumption of 2's complement
arithmetic, which I believe to be embedded in the hasNulByte
macro. I haven't bothered to think this out.

Would you believe that so many hidden assumptions can be embedded
in such innocent looking code? The sneaky thing is that the code
appears trivially correct at first glance. This is the stuff that
Heisenbugs are made of. Yet use of such code is fairly safe if we
are aware of those hidden assumptions.

I have cross-posted this without setting follow-ups, because I
believe that discussion will be valid in all the newsgroups posted.

[1] The draft C standards can be found at:
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/>

--
"If you want to post a followup via groups.google.c om, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell. org/google/>
Also see <http://www.safalra.com/special/googlegroupsrep ly/>

Mar 6 '06
"Stephen Sprunk" <st*****@sprunk .org> writes:
[...]
Just about the only thing that all modern machines agree on is
CHAR_BIT == 8 (and I bet someone will dispute that). Ints, pointers,
address space semantics, etc. are all up for grabs, and that's a good
thing -- it allows systems to evolve in useful ways instead of locking
us into something that, while appearing optimal today, is not likely
to be tomorrow.


I don't know of any modern hosted implementations with CHAR_BIT > 8
(though there have certainly been such systems in the past (though I
don't know whether any of them had C compilers)), but CHAR_BIT values
of 16 and 32 are common on DSPs (Digital Signal Processors).

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Mar 24 '06 #301
In article <pa*****************************@areilly.bpc-users.org> Andrew Reilly <an*************@areilly.bpc-users.org> writes:
....
This restriction (undefined semantics IS a restriction) makes
pointer-walking versions of algorithms second-class citizens to otherwise
equivalent indexed versions of algorithms.

void
foo(int *p, int n)
{
    for (; --n >= 0;)
        p[n] = n;
}

is "legal" and "defined" on all architectures, but the equivalent with a
pointer cursor isn't:

void
foo(int *p, int n)
{
    p += n-1;
    for (; --n >= 0;)
        *p-- = n;
}
I have no idea on what you base your assertion. When the first is valid,
the second is valid, and the reverse. In your first example your first
assignment is to p[n-1] (using the initial value of n), the same for the
second version. But it is worse:
void
foo(int *p, int n)
{
    p += n;
    for (; --n >= 0;)
        *--p = n;
}
is just as valid.
I suppose that this is the root of my surprise and annoyance on
discovering what the standard says. These versions *should* be
equivalent, and equivalently well-defined.


They are.
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Mar 24 '06 #302
On Fri, 24 Mar 2006 04:53:24 +0000, Dik T. Winter wrote:
In article <pa*****************************@areilly.bpc-users.org> Andrew Reilly <an*************@areilly.bpc-users.org> writes:
...
> This restriction (undefined semantics IS a restriction) makes
> pointer-walking versions of algorithms second-class citizens to otherwise
> equivalent indexed versions of algorithms.
>
> void
> foo(int *p, int n)
> {
>     for (; --n >= 0;)
>         p[n] = n;
> }
>
> is "legal" and "defined" on all architectures, but the equivalent with a
> pointer cursor isn't:
>
> void
> foo(int *p, int n)
> {
>     p += n-1;
>     for (; --n >= 0;)
>         *p-- = n;
> }
I have no idea on what you base your assertion. When the first is valid,
the second is valid, and the reverse. In your first example your first
assignment is to p[n-1] (using the initial value of n), the same for the
second version.


But, the second version *finishes* with p pointing to the -1st element of
the array, which (we now know) is undefined, and guaranteed to break an
AS/400. The first version only finishes with the integer n == -1, and the
pointer p is still "valid". This is the discrepancy that irks me.
But it is worse:
void
foo(int *p, int n)
{
    p += n;
    for (; --n >= 0;)
        *--p = n;
}
is just as valid.


Yes, that one is certainly going to fly, even on the AS/400, as p doesn't
ever point to p(initial) - 1. But it is (IMO) less idiomatic than the
other construction. Certainly, different people's experience will differ,
there, and certainly, different processor architectures often have better
or worse support for one form or the other. In my experience,
post-modification is more common (or, rather, more often fast, where both
are available), but quite a few processors have no specific support for
address register increment or decrement addressing modes.
> I suppose that this is the root of my surprise and annoyance on
> discovering what the standard says. These versions *should* be
> equivalent, and equivalently well-defined.


They are.


Come again? This is the whole point that people (well, me, anyway) have
been arguing about! If they were truly equivalent (and the
non-unit-stride cousins), I'd go home happy.

--
Andrew

Mar 24 '06 #303
On Thu, 23 Mar 2006 21:41:59 -0600, Stephen Sprunk wrote:
The problem is that there's really no such thing as a "normal-looking"
architecture. Every implementation differs in at least a few fundamental
things you'd find it useful to nail down, so to provide enough detail to be
meaningful your sub-standard would basically be defining the behavior of a
particular implementation.
Sure there is. All the world's a VAX (but with IEEE float), with
plausible exceptions for pointers different length than int. I'd also
wear alignment restrictions pretty happily, as long as they're reasonable.
Either-endian word significance is fine, too. Show me a "normal-looking"
modern architecture that doesn't fit that description, in a material
sense. Even most of the DSPs developed in the last ten years fit that
mould. Mostly, so that they can run existing C code well. [The few that
have been developed in that time frame, which *don't* fit that mould, are
not intended to be programmed in C, and there's no reason to expect that
they will be.]
Just about the only thing that all modern machines agree on is CHAR_BIT
== 8 (and I bet someone will dispute that). Ints, pointers, address
space semantics, etc. are all up for grabs, and that's a good thing --
it allows systems to evolve in useful ways instead of locking us into
something that, while appearing optimal today, is not likely to be
tomorrow.

If you disagree, please list all of the various currently-undefined
behaviors you want to define and what implementations conform to your
spec. Who knows, ISO might adopt it...


I'd like the pointer memory model to be "flat" in the sense that for p, a
pointer to some object, (p += i, p -= i) == p for any int i. (In a fixed
word-length, non-saturating arithmetic, the "flat" geometry is really
circular, or modulo. That's as it should be.)

[I'm not interested in arguing multi-processor or multi-thread consistency
semantics here. That's traditionally been outside the realm of C, and
that's probably an OK thing too, IMO.]

Cheers,

--
Andrew

Mar 24 '06 #304
On 2006-03-24, Andrew Reilly <an*************@areilly.bpc-users.org> wrote:
On Thu, 23 Mar 2006 21:41:59 -0600, Stephen Sprunk wrote:
The problem is that there's really no such thing as a "normal-looking"
architecture. Every implementation differs in at least a few fundamental
things you'd find it useful to nail down, so to provide enough detail to be
meaningful your sub-standard would basically be defining the behavior of a
particular implementation.


Sure there is. All the world's a VAX (but with IEEE float),


Ironic, considering VAXen don't have IEEE float. Why not just say all
the world's a 386? Oh, wait, 386 has that segmented-addressing
silliness.
Mar 24 '06 #305
On 2006-03-23, Andrew Reilly <an*************@areilly.bpc-users.org> wrote:
[segmented architectures and C]
How about:

int a[10];
foo(a + 1);

where

foo(int *p)
{
    p -= 1;
    /* do something with p[0]..p[9] */
}
[snip]


Does p -= 1 still trap, in the first line of foo, given the way that it's
called in the main routine?

If not, how could foo() be compiled in a separate unit, in the AS/400
scenario that you described earlier?


I think you're overlooking something about how segmented architectures
actually work. I'm not sure exactly what, so I'll go through this in
excruciating detail and hopefully it'll help. (I don't know the
AS/400, so this is likely wrong in detail, but the general principle
is the same.)

Going back to the code:
int a[10];
This compiles to "get me a new segment, of size 10*sizeof(int), to
hold a."

At runtime, this results in a new segment identifier, which I'll
suppose has the value 1234. This value isn't an address; it doesn't
"mean" anything except that it's an index into some OS-level or
machine-level table somewhere. That table holds the size of the
segment; supposing sizeof(int) is 4, that size is 40.
foo(a + 1);
This compiles to "create a pointer to a, add 1 to it, and call foo."

At runtime, the pointer to a has two parts: a segment identifier,
namely 1234, and an offset, namely 0, which I'll write as
1234:0. Adding 1 (4 at the machine level) gives 1234:4. This is within
bounds, so no traps occur.

Meanwhile,
foo(int *p)
{
p -= 1;
This compiles to "subtract 4 from the pointer p".

At runtime, when called as above, this subtraction converts 1234:4 to
1234:0. This is within bounds, and no traps occur.
/* do something with p[0]..p[9] */
and this compiles to "add values between 0 and 36 to p and
dereference".

At runtime, these additions yield pointers between 1234:0 and 1234:36;
these are all within bounds, so no traps occur.
}
You'll note that the compilation of foo does not require knowing what
or how many values p points to.
If it does trap, why? It's not forming an "illegal" pointer, even for the
AS/400 world.
It doesn't.
If it doesn't trap, why should p -= 1 succeed, but p -= 2 fail?
Because p -= 2, when performed on the pointer 1234:4, tries to deduct
8 from the offset field. This underflows and traps.
What if my algorithm's natural expression is to refer to p[0]..p[-9], and
expects to be handed a pointer to the last element of a[]?
That's fine too, because 1234:36 - 36 yields 1234:0, which is still
within bounds.

You may have noticed that the compilation I did above is not actually
standards-conforming; because of the one-past-the-end rule, the size
of the segment for the array "a" has to be one larger than the array.
Otherwise, forming the pointer to the one-past-the-end element would
trap.
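
For concreteness, here is a small fragment of my own marking which
pointer formations the standard defines, matching the walkthrough
above:

void demo (void) {
   int a[10];
   int *p;

   p = a + 10; /* one past the end: valid to form, not to dereference */
   p = a + 1;
   p = p - 1;  /* back to &a[0]: valid, stays within the object */
   p = a - 1;  /* before the start: undefined behaviour -- exactly the
                  pointer that a bounds-checking machine may trap */
   (void) p;
}
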
The significant difference of C, to other languages (besides the
assembly language of most architectures) is that you can form, store, and
use as arguments pointers into the middle of "objects". Given that
difference, the memory model is obvious,


That doesn't follow. What I've described above allows pointing into
the middle of objects, but it doesn't yield the memory model you're
envisioning.

--
- David A. Holland
(the above address works if unscrambled but isn't checked often)
Mar 24 '06 #306
On Fri, 24 Mar 2006 08:20:12 +0000, David Holland wrote:
I think you're overlooking something about how segmented architectures
actually work.
I think you're overlooking my assertions that I don't care how
(some) segmented architectures actually work. They are by *far* the least
significant of the processor architectures in use at the moment. Most of
the segmented architectures (by installed number: x86) are not used as
such, and even those that are do *not* have the particular behaviour with
respect to the range of the "offset" component of the non-flat pointer.

Sure, it's possible to write C code that operates within the restrictions
of such architectures. It's even easy. It is, however, not historically
representative of quite large bodies of C code. That's OK. The
architecture wasn't designed to run C code, and is not primarily coded in
C.
> If it doesn't trap, why should p -= 1 succeed, but p -= 2 fail?


Because p -= 2, when performed on the pointer 1234:4, tries to deduct
8 from the offset field. This underflows and traps.


And this is the behaviour that is at odds with idiomatic C. The standard
has effectively forbidden such a construction (which is well defined and
works perfectly on every other architecture, and is convenient in many
obvious algorithms) just because this one architecture traps before any
attempt is made to access memory. The onus should instead have been on
the C implementation on this particular machine to paper over this machine
defect.
You may have noticed that the compilation I did above is not actually
standards-conforming; because of the one-past-the-end rule, the size
of the segment for the array "a" has to be one larger than the array.
Otherwise, forming the pointer to the one-past-the-end element would
trap.


Only because the machine architecture is antipathetic to C. The syntax
and behaviour of C operators offer no suggestion that symmetrical
behaviour, or non-unit steps past the end of the "object", would fail, when
that particular idiom is catered to (by hacking the underlying object
model: over-allocating the memory segment).
> The significant difference of C, to other languages (besides the
> assembly language of most architectures) is that you can form, store,
> and use as arguments pointers into the middle of "objects". Given
> that difference, the memory model is obvious,


That doesn't follow. What I've described above allows pointing into the
middle of objects, but it doesn't yield the memory model you're
envisioning.


Just because you can use a restricted subset of the behaviour of C
"naturally" on such trap-on-point segment machines is a pretty poor
argument for restricting the semantically defined "correct" behaviour on
all other architectures to that particular subset.

Look: lots of machine architectures have restrictions such that the full
semantics of C require more effort on the part of the compiler. For
example, lots of modern processors have quite restricted range for
"fast" immediate address offsets. Does this mean that the standard
should restrict the size of stack frames and structures so that all
elements can be accessed with "fast" instructions? No, of course not. The
compiler must issue more instructions to load and use large offsets in
those situations so that larger structures can be accessed.

--
Andrew

Mar 24 '06 #307
On 23 Mar 2006 05:39:18 GMT, Jordan Abel <ra*******@gmail.com> wrote:

You can't "catch it and do nothing"? What are you expected to _do_ about
an invalid or protected address being loaded [not dereferenced], anyway?
What _can_ you do, having caught the machine check? What responses are
typical?


Clean up, release resources, and get out.

--
Al Balmer
Sun City, AZ
Mar 24 '06 #308
On Fri, 24 Mar 2006 10:05:23 +1100, Andrew Reilly
<an*************@areilly.bpc-users.org> wrote:
Why do you want to create an address that you're not allowed to
dereference?


Because the ability to do so is implied by the syntax of pointer
arithmetic.


Heh. The presence of an open balcony on the 15th floor implies the
ability to jump off.

--
Al Balmer
Sun City, AZ
Mar 24 '06 #309
Andrew Reilly wrote:
On Fri, 24 Mar 2006 08:20:12 +0000, David Holland wrote:
.... snip about segmented address error catching ...
You may have noticed that the compilation I did above is not
actually standards-conforming; because of the one-past-the-end
rule, the size of the segment for the array "a" has to be one
larger than the array. Otherwise, forming the pointer to the
one-past-the-end element would trap.


Only because the machine architecture is antipathetic to C. The
syntax and behaviour of C operators offers no suggestion that
symmetrical behaviour, or non-unit steps past the end of the
"object" would fail, when that particular idiom is catered-to
(by hacking the underlying object model: over-allocating the
memory segment).


No, because the machine architecture is designed to catch
out-of-bounds addressing in hardware, without serious effects on
run-time. In fact, the architecture is very usable with C; it just
exposes gross misprogramming.

You can always use assembly code if you want no rules. However the
demonstrated architecture will even catch out-of-range addressing
there.

--
Read about the Sony stealthware that is a security leak, phones
home, and is generally illegal in most parts of the world. Also
the apparent connivance of the various security software firms.
http://www.schneier.com/blog/archive...drm_rootk.html
Mar 24 '06 #310
