473,842 Members | 1,872 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Making Fatal Hidden Assumptions

We often find hidden, and totally unnecessary, assumptions being
made in code. The following leans heavily on one particular
example, which happens to be in C. However similar things can (and
do) occur in any language.

These assumptions are generally made because of familiarity with
the language. As a non-code example, consider the idea that the
faulty code is written by blackguards bent on foulling the
language. The term blackguards is not in favor these days, and for
good reason. However, the older you are, the more likely you are
to have used it since childhood, and to use it again, barring
specific thought on the subject. The same type of thing applies to
writing code.

I hope, with this little monograph, to encourage people to examine
some hidden assumptions they are making in their code. As ever, in
dealing with C, the reference standard is the ISO C standard.
Versions can be found in text and pdf format, by searching for N869
and N1124. [1] The latter does not have a text version, but is
more up-to-date.

We will always have innocent appearing code with these kinds of
assumptions built-in. However it would be wise to annotate such
code to make the assumptions explicit, which can avoid a great deal
of agony when the code is reused under other systems.

In the following example, the code is as downloaded from the
referenced URL, and the comments are entirely mine, including the
'every 5' linenumber references.

/* Making fatal hidden assumptions */
/* Paul Hsiehs version of strlen.
http://www.azillionmonkeys.com/qed/asmexample.html

Some sneaky hidden assumptions here:
1. p = s - 1 is valid. Not guaranteed. Careless coding.
2. cast (int) p is meaningful. Not guaranteed.
3. Use of 2's complement arithmetic.
4. ints have no trap representations or hidden bits.
5. 4 == sizeof(int) && 8 == CHAR_BIT.
6. size_t is actually int.
7. sizeof(int) is a power of 2.
8. int alignment depends on a zeroed bit field.

Since strlen is normally supplied by the system, the system
designer can guarantee all but item 1. Otherwise this is
not portable. Item 1 can probably be beaten by suitable
code reorganization to avoid the initial p = s - 1. This
is a serious bug which, for example, can cause segfaults
on many systems. It is most likely to foul when (int)s
has the value 0, and is meaningful.

He fails to make the valid assumption: 1 == sizeof(char).
*/

#define hasNulByte(x) ((x - 0x01010101) & ~x & 0x80808080)
#define SW (sizeof (int) / sizeof (char))

int xstrlen (const char *s) {
const char *p; /* 5 */
int d;

p = s - 1;
do {
p++; /* 10 */
if ((((int) p) & (SW - 1)) == 0) {
do {
d = *((int *) p);
p += SW;
} while (!hasNulByte (d)); /* 15 */
p -= SW;
}
} while (*p != 0);
return p - s;
} /* 20 */

Let us start with line 1! The constants appear to require that
sizeof(int) be 4, and that CHAR_BIT be precisely 8. I haven't
really looked too closely, and it is possible that the ~x term
allows for larger sizeof(int), but nothing allows for larger
CHAR_BIT. A further hidden assumption is that there are no trap
values in the representation of an int. Its functioning is
doubtful when sizeof(int) is less that 4. At the least it will
force promotion to long, which will seriously affect the speed.

This is an ingenious and speedy way of detecting a zero byte within
an int, provided the preconditions are met. There is nothing wrong
with it, PROVIDED we know when it is valid.

In line 2 we have the confusing use of sizeof(char), which is 1 by
definition. This just serves to obscure the fact that SW is
actually sizeof(int) later. No hidden assumptions have been made
here, but the usage helps to conceal later assumptions.

Line 4. Since this is intended to replace the systems strlen()
function, it would seem advantageous to use the appropriate
signature for the function. In particular strlen returns a size_t,
not an int. size_t is always unsigned.

In line 8 we come to a biggie. The standard specifically does not
guarantee the action of a pointer below an object. The only real
purpose of this statement is to compensate for the initial
increment in line 10. This can be avoided by rearrangement of the
code, which will then let the routine function where the
assumptions are valid. This is the only real error in the code
that I see.

In line 11 we have several hidden assumptions. The first is that
the cast of a pointer to an int is valid. This is never
guaranteed. A pointer can be much larger than an int, and may have
all sorts of non-integer like information embedded, such as segment
id. If sizeof(int) is less than 4 the validity of this is even
less likely.

Then we come to the purpose of the statement, which is to discover
if the pointer is suitably aligned for an int. It does this by
bit-anding with SW-1, which is the concealed sizeof(int)-1. This
won't be very useful if sizeof(int) is, say, 3 or any other
non-poweroftwo. In addition, it assumes that an aligned pointer
will have those bits zero. While this last is very likely in
todays systems, it is still an assumption. The system designer is
entitled to assume this, but user code is not.

Line 13 again uses the unwarranted cast of a pointer to an int.
This enables the use of the already suspicious macro hasNulByte in
line 15.

If all these assumptions are correct, line 19 finally calculates a
pointer difference (which is valid, and of type size_t or ssize_t,
but will always fit into a size_t). It then does a concealed cast
of this into an int, which could cause undefined or implementation
defined behaviour if the value exceeds what will fit into an int.
This one is also unnecessary, since it is trivial to define the
return type as size_t and guarantee success.

I haven't even mentioned the assumption of 2's complement
arithmetic, which I believe to be embedded in the hasNulByte
macro. I haven't bothered to think this out.

Would you believe that so many hidden assumptions can be embedded
in such innocent looking code? The sneaky thing is that the code
appears trivially correct at first glance. This is the stuff that
Heisenbugs are made of. Yet use of such code is fairly safe if we
are aware of those hidden assumptions.

I have cross-posted this without setting follow-ups, because I
believe that discussion will be valid in all the newsgroups posted.

[1] The draft C standards can be found at:
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/>

--
"If you want to post a followup via groups.google.c om, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell. org/google/>
Also see <http://www.safalra.com/special/googlegroupsrep ly/>

Mar 6 '06
351 13191
>Keith Thompson said:
Um, I always thought that "within" and "outside" were two different
things.

In article <du**********@n wrdmz03.dmz.ncs .ea.ibs-infra.bt.com>
Richard Heathfield <in*****@invali d.invalid> wrote:Ask Jack to lend you his bottle. You'll soon change your mind.


To clarify a bit ...

A mathematician named Klein
Thought the Moebius band was divine
Said he, "If you glue
The edges of two
You'll get a weird bottle like mine!"

:-)

(A Moebius band has only one side. It is a two-dimensional object
that exists only in a 3-dimensional [or higher] space. A Klein
bottle can only be made in a 4-dimensional [or higher] space, and
is a 3-D object with only one side. The concept can be carried on
indefinitely, but a Klein bottle is hard enough to contemplate
already.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (4039.22'N, 11150.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Mar 8 '06 #81
On 2006-03-08, Vladimir S. Oka <no****@btopenw orld.com> wrote:
Al Balmer wrote:
On 8 Mar 2006 08:52:13 -0800, "Vladimir S. Oka"
<no****@btopenw orld.com> wrote:

Al Balmer wrote:
On 8 Mar 2006 08:08:48 GMT, Jordan Abel <ra*******@gmai l.com> wrote:

>On 2006-03-08, Randy Howard <ra*********@FO OverizonBAR.net > wrote:
>> blackguard:
>> A thoroughly unprincipled person; a scoundrel.
>> A foul-mouthed person.
>
>Sure, that's what it _means_, but...
>
>> Does everything have to become a racism experiment?
>
>the question is one of etymology.

?
What do you imagine the etymology to be?

FWIW, from <http://www.wordorigins .org/wordorb.htm>:


Actually, I wasn't asking that. I wondered what Jordan was imagining
it to be.


Ah, sorry. I didn't read the lot carefully enough.


I don't imagine it to be anything. I suspect others do, and that's why
there is a potential for accusations of racism
Mar 8 '06 #82
On Wed, 08 Mar 2006 11:52:52 -0800, Andrey Tarasevich
<an************ **@hotmail.com> wrote:
James Dow Allen wrote:
Mr. Hsieh immediately does p++ and his code will be correct if then
p == s. I don't question Chuck's argument, or whether the C standard
allows the C compiler to trash the hard disk when it sees p=s-1,
but I'm sincerely curious whether anyone knows of an *actual*
environment
where p == s will ever be false after (p = s-1; p++).
...


There are actual environments where 's - 1' alone is enough to cause a
crash. In fact, any non-flat memory model environment (i.e. environment
with 'segment:offset ' pointers) would be a good candidate. The modern
x86 will normally crash, unless the implementation takes specific steps
to avoid it.


Exactly which x86 mode are you referring to ?

16 bit real mode, virtual86 mode or some 32 mode (which are after all
segmented modes with all segmented registers with the same value) ?

If s is stored in 16 bit mode in ES:DX with DX=0, then p=s-1 would
need to decrement ES by one and store 000F in DX. Why would reloading
ES cause any traps, since no actual memory reference is attempted ?
Doing p++ would most likely just increment DX by one to 0010, thus
ES:DX would point to s again, which is a legal address, but with a
different internal representation.

IIRC some 32 bit addressing mode would trap if one tried to load the
segment register, but again, how could the caller generate such
constructs as s = ES:0 at least from user mode. In practice s = ES:0
could only be set by a kernel mode routine calling a user mode
routine, so this is really an issue only with main() parameters.

Paul

Mar 8 '06 #83
Andrey Tarasevich wrote:
James Dow Allen wrote:
Mr. Hsieh immediately does p++ and his code will be correct if then
p == s. I don't question Chuck's argument, or whether the C standard
allows the C compiler to trash the hard disk when it sees p=s-1,
but I'm sincerely curious whether anyone knows of an *actual*
environment where p == s will ever be false after (p = s-1; p++).
...


There are actual environments where 's - 1' alone is enough to cause a
crash. In fact, any non-flat memory model environment (i.e. environment
with 'segment:offset ' pointers) would be a good candidate. The modern
x86 will normally crash, unless the implementation takes specific steps
to avoid it.


This illustrates the fact that usenet threads are uncontrollable.
I wrote the original to draw attention to hidden assumptions, and
it has immediately degenerated into thrashing about the one real
error in the sample code. I could have corrected and eliminated
that error by a slight code rework, but then I would have modified
Mr Hsiehs code. There were at least seven further assumptions,
most of which were necessary for the purposes of the code, but
strictly limited its applicability.

My aim was to get people to recognize and document such hidden
assumptions, rather than leaving them lying there to create sneaky
bugs in apparently portable code.

--
"If you want to post a followup via groups.google.c om, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell. org/google/>
Also see <http://www.safalra.com/special/googlegroupsrep ly/>
Mar 8 '06 #84
On Wed, 08 Mar 2006 13:14:39 -0800, Andrey Tarasevich wrote:
Andrew Reilly wrote:
...
Bah, humbug. Think I'll go back to assembly language, where pointers do
what you tell them to, and don't complain about it to their lawyers.
...
Incorrect. It is not about "lawyers", it is about actual _crashes_. The
reason why 's - 1' itself can (an will) crash on certain platforms is
the same as the one that will make it crash in exactly the same way in
"assembly language" on such platforms.


No, my point was that the language lawyers have taken a perfectly
appealing and generally applicable abstraction, and outlawed certain
obvious constructions on the flimsy grounds that it was easier to
pervert the abstraction than to support it on some uncommon (or indeed
hypothetical) hardware.
Trying to implement the same code in assembly language on such a
platform would specifically force you to work around the potential
crash, sacrificing efficiency for safety. In other words, you'd be
forced to use different techniques for doing 's - 1' in contexts where
it might underflow and in contexts where it definitely will not
underflow.
The assembly language version of the algorithm would *not* crash, because
the assembly language of the perverted platform on which that was a
possibility would require a construction (probably using an explicit
integer array index, rather than pointer manipulation) that would cause
exactly zero inefficiency or impairment of safety. (Because the index is
only *used* in-bounds.)
C language, on the other hand, doesn't offer two different '-' operators
to for these two specific situations. Instead C language outlaws (in
essence) pointer underflows.
No it doesn't. The C language allows "inner" pointers to be passed to
functions, with no other way for the function to tell whether s - 1 is
legal or illegal in any particular call context. It is therefore clear
what the abstraction of pointer arithmetic implies. That some platforms
(may) have a problem with this is not the language's fault. It's just a
bit harder to support C on them. That's OK. There are plenty of other
languages that don't allow that construct at all (or even have pointers as
such), and they were clearly the targets in mind for the people who
developed such hardware. The standard authors erred. It should have been
incumbant on implementers on odd platforms to support the full power of
the language or not at all, rather than for all other (C-like) platforms
to carry the oddness around with them, in their code. However, it's clear
that's a very old mistake, and no-one's going to back away from it now.
This is a perfectly reasonable approach for a higher level language.


C is not a higher-level language. It's a universal assembler. Pick
another one.

--
Andrew

Mar 8 '06 #85
In article <pa************ *************** *@areilly.bpc-users.org>,
Andrew Reilly <an************ *@areilly.bpc-users.org> wrote:
On Tue, 07 Mar 2006 13:28:37 -0500, Arthur J. O'Dwyer wrote:
K&R answers your question. If pa points to some element of an array,
then pa-1 points to the /previous element/. But what's the "previous
element" relative to the first element in the array? It doesn't exist. So
we have undefined behavior.


Only because the standard says so. Didn't have to be that way. There are
plenty of logically correct algorithms that could exist that involve
pointers that point somewhere outside of a[0..N]. As long as there's no
de-referencing, no harm, no foul. (Consider the simple case of iterating
through an array at a non-unit stride, using the normal p < s + N
termination condition. The loop finishes with p > s + N and the standard
says "pow, you're dead", when the semantically identical code written with
integer indexes has no legal problems.


Consider a typical implementation with 32 bit pointers and objects that
can be close to 2 GByte in size.

typedef struct { char a [2000000000]; } giantobject;

giantobject anobject;

giantobject* p = &anobject;
giantobject* q = &anobject - 1;
giantobject* r = &anobject + 1;
giantobject* s = &anobject + 2;

It would be very hard to implement this in a way that both q and s would
be valid; for example, it would be very hard to achieve that q < p, p <
r and r < s are all true. If q and s cannot be both valid, and there
isn't much reason why one should be valid and the other shouldn't, then
neither can be used in a program with any useful guarantees by the
standard.
Mar 8 '06 #86
"Clark S. Cox III" wrote:
"Peter Harrison" <pe***@cannock. ac.uk> said:
Paul Burke <pa**@scazon.co m> writes:

My simple mind must be missing something big here. If for
pointer p, (p-1) is deprecated because it's not guaranteed that
it points to anything sensible, why is p++ OK? There's no
boundary checking in C (unless you put it in).

It's not deprecated, it's illegal. Once you have involved UB all
bets are off. Without the p-1 the p++ statements are fine, as long
as they don't advance the pointer more than one past the end of the
object.

It seems I too have a simple mind. I read the recent replies to this
and found myself not sure I am better off.

This is what I think I understand:

int x;
int *p;
int *q;

p = &x; /* is OK */


Correct, p now points to x


and a statement --p or p-- would be illegal. However p++ would be
legal. But *(++p) would be illegal, because it dereferences past
the confines of the object x.

--
"If you want to post a followup via groups.google.c om, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell. org/google/>
Also see <http://www.safalra.com/special/googlegroupsrep ly/>
Mar 8 '06 #87
In article <pa************ *************** *@areilly.bpc-users.org>,
Andrew Reilly <an************ *@areilly.bpc-users.org> wrote:
On Tue, 07 Mar 2006 22:26:57 +0000, Keith Thompson wrote:
Andrew Reilly <an************ *@areilly.bpc-users.org> writes:
On Tue, 07 Mar 2006 13:28:37 -0500, Arthur J. O'Dwyer wrote:
K&R answers your question. If pa points to some element of an array,
then pa-1 points to the /previous element/. But what's the "previous
element" relative to the first element in the array? It doesn't exist.
So we have undefined behavior.

Only because the standard says so. Didn't have to be that way. There
are plenty of logically correct algorithms that could exist that involve
pointers that point somewhere outside of a[0..N]. As long as there's no
de-referencing, no harm, no foul. (Consider the simple case of
iterating through an array at a non-unit stride, using the normal p < s
+ N termination condition. The loop finishes with p > s + N and the
standard says "pow, you're dead", when the semantically identical code
written with integer indexes has no legal problems.


The standard is specifically designed to allow for architectures where
constructing an invalid pointer value can cause a trap even if the pointer
is not dereferenced.


And are there any? Any in common use? Any where the equivalent (well
defined) pointer+offset code would be slower?


Question: If the C Standard guarantees that for any array a, &a [-1]
should be valid, should it also guarantee that &a [-1] != NULL and that
&a [-1] < &a [0] and &a [-1] < &a [0]?

In that case, what happens when I create an array with a single element
that is an enormously large struct?
Mar 8 '06 #88
In article <pa************ *************** *@areilly.bpc-users.org>,
Andrew Reilly <an************ *@areilly.bpc-users.org> wrote:
On Wed, 08 Mar 2006 00:56:03 +0000, Robin Haigh wrote:
It's not always equivalent. The trouble starts with

char a[8];
char *p;

for ( p = a+1 ; p < a+8 ; p += 2 ) {}

intending that the loop terminates on p == a+9 (since it skips a+8). But
how do we know that a+9 > a+8 ? If the array is right at the top of some
kind of segment, the arithmetic might have wrapped round.


a+9 > a+8 because a + 9 - (a + 8) == 1, which is > 0. Doesn't matter if
the signed or unsigned pointer value wrapped around in an intermediate
term. On many machines that's how the comparison is done anyway. You're
suggesting that having the compiler ensure that a+8 doesn't wrap around
wrt a is OK, but a+9 is too hard. I don't buy it.


I just tried the following program (CodeWarrior 10 on MacOS X):

#include <stdio.h>

#define SIZE (50*1000000L)
typedef struct {
char a [SIZE];
} bigstruct;

static bigstruct bigarray [8];

int main(void)
{
printf("%lx\n", (unsigned long) &bigarray [0]);
printf("%lx\n", (unsigned long) &bigarray [9]);
printf("%lx\n", (unsigned long) &bigarray [-1]);

if (&bigarray [-1] < & bigarray [0])
printf ("Everything is fine\n");
else
printf ("The C Standard is right: &bigarray [-1] is broken\n");

return 0;
}

The output is:

2008ce0
1cd30160
ff059c60
The C Standard is right: &bigarray [-1] is broken
Mar 8 '06 #89
On 8 Mar 2006 20:35:24 GMT, Chris Torek <no****@torek.n et> wrote:
Keith Thompson said:
Um, I always thought that "within" and "outside" were two different
things.


In article <du**********@n wrdmz03.dmz.ncs .ea.ibs-infra.bt.com>
Richard Heathfield <in*****@invali d.invalid> wrote:
Ask Jack to lend you his bottle. You'll soon change your mind.


To clarify a bit ...

A mathematician named Klein
Thought the Moebius band was divine
Said he, "If you glue
The edges of two
You'll get a weird bottle like mine!"

:-)

(A Moebius band has only one side. It is a two-dimensional object
that exists only in a 3-dimensional [or higher] space. A Klein
bottle can only be made in a 4-dimensional [or higher] space, and
is a 3-D object with only one side. The concept can be carried on
indefinitely , but a Klein bottle is hard enough to contemplate
already.)


But that was Felix. Who's Jack?

--
Al Balmer
Sun City, AZ
Mar 9 '06 #90

This thread has been closed and replies have been disabled. Please start a new discussion.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.