
Making Fatal Hidden Assumptions

We often find hidden, and totally unnecessary, assumptions being
made in code. The following leans heavily on one particular
example, which happens to be in C. However similar things can (and
do) occur in any language.

These assumptions are generally made because of familiarity with
the language. As a non-code example, consider the idea that the
faulty code is written by blackguards bent on fouling the
language. The term blackguard is not in favor these days, and for
good reason. However, the older you are, the more likely you are
to have used it since childhood, and to use it again, barring
specific thought on the subject. The same type of thing applies to
writing code.

I hope, with this little monograph, to encourage people to examine
some hidden assumptions they are making in their code. As ever, in
dealing with C, the reference standard is the ISO C standard.
Versions can be found in text and PDF format by searching for N869
and N1124. [1] The latter does not have a text version, but is
more up to date.

We will always have innocent-appearing code with these kinds of
assumptions built in. However, it would be wise to annotate such
code to make the assumptions explicit, which can avoid a great deal
of agony when the code is reused under other systems.

In the following example, the code is as downloaded from the
referenced URL, and the comments are entirely mine, including the
'every 5' line-number references.

/* Making fatal hidden assumptions */
/* Paul Hsieh's version of strlen.
http://www.azillionmonkeys.com/qed/asmexample.html

Some sneaky hidden assumptions here:
1. p = s - 1 is valid. Not guaranteed. Careless coding.
2. cast (int) p is meaningful. Not guaranteed.
3. Use of 2's complement arithmetic.
4. ints have no trap representations or hidden bits.
5. 4 == sizeof(int) && 8 == CHAR_BIT.
6. size_t is actually int.
7. sizeof(int) is a power of 2.
8. int alignment depends on a zeroed bit field.

Since strlen is normally supplied by the system, the system
designer can guarantee all but item 1. Otherwise this is
not portable. Item 1 can probably be beaten by suitable
code reorganization to avoid the initial p = s - 1. This
is a serious bug which, for example, can cause segfaults
on many systems. It is most likely to foul when (int)s
has the value 0, and is meaningful.

He fails to make the valid assumption: 1 == sizeof(char).
*/

#define hasNulByte(x) ((x - 0x01010101) & ~x & 0x80808080)
#define SW (sizeof (int) / sizeof (char))

int xstrlen (const char *s) {
   const char *p;                     /* 5 */
   int d;

   p = s - 1;
   do {
      p++;                            /* 10 */
      if ((((int) p) & (SW - 1)) == 0) {
         do {
            d = *((int *) p);
            p += SW;
         } while (!hasNulByte (d));   /* 15 */
         p -= SW;
      }
   } while (*p != 0);
   return p - s;
}                                     /* 20 */

Let us start with line 1! The constants appear to require that
sizeof(int) be 4, and that CHAR_BIT be precisely 8. I haven't
really looked too closely, and it is possible that the ~x term
allows for larger sizeof(int), but nothing allows for larger
CHAR_BIT. A further hidden assumption is that there are no trap
values in the representation of an int. Its functioning is
doubtful when sizeof(int) is less than 4. At the least it will
force promotion to long, which will seriously affect the speed.

This is an ingenious and speedy way of detecting a zero byte within
an int, provided the preconditions are met. There is nothing wrong
with it, PROVIDED we know when it is valid.
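
To see why the preconditions matter, here is a small stand-alone
sketch (mine, not Hsieh's) that exercises the macro under the
assumed conditions: 32-bit int, 8-bit char, 2's complement. It uses
unsigned arithmetic to sidestep the signed-overflow question while
showing the bit pattern at work.

   #include <stdio.h>

   /* Repeated from above.  For a byte with its top bit clear,
      (byte - 1) has its top bit set only if the byte was 0x00; the
      & ~x factor discards bytes whose top bit was already set.  So
      the result is non-zero exactly when some byte of x is zero. */
   #define hasNulByte(x) ((x - 0x01010101) & ~x & 0x80808080)

   int main(void)
   {
      unsigned int with_nul = 0x41420043u;  /* bytes 'A' 'B' 0x00 'C' */
      unsigned int without  = 0x41424344u;  /* bytes 'A' 'B' 'C' 'D'  */

      printf("with_nul -> %#x\n", hasNulByte(with_nul)); /* 0x8000 */
      printf("without  -> %#x\n", hasNulByte(without));  /* 0      */
      return 0;
   }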

In line 2 we have the confusing use of sizeof(char), which is 1 by
definition. This just serves to obscure the fact that SW is
actually sizeof(int) later. No hidden assumptions have been made
here, but the usage helps to conceal later assumptions.

Line 4. Since this is intended to replace the system's strlen()
function, it would seem advantageous to use the appropriate
signature for the function. In particular strlen returns a size_t,
not an int. size_t is always unsigned.

In line 8 we come to a biggie. The standard specifically does not
guarantee the action of a pointer below an object. The only real
purpose of this statement is to compensate for the initial
increment in line 10. This can be avoided by rearrangement of the
code, which will then let the routine function where the
assumptions are valid. This is the only real error in the code
that I see.
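
One possible rearrangement (a sketch of mine, not code from the
referenced URL) removes the s - 1 step while leaving the other
assumptions, and the speed trick, untouched: the current byte is
tested before the pointer is advanced, so p never moves below s.
It reuses the hasNulByte and SW macros above.

   int xstrlen2 (const char *s) {
      const char *p = s;
      int d;

      for (;;) {
         if ((((int) p) & (SW - 1)) == 0) {   /* aligned: scan by ints */
            do {
               d = *((int *) p);
               p += SW;
            } while (!hasNulByte (d));
            p -= SW;    /* back to the start of the word holding the 0 */
         }
         if (*p == 0)
            return p - s;
         p++;
      }
   }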

In line 11 we have several hidden assumptions. The first is that
the cast of a pointer to an int is valid. This is never
guaranteed. A pointer can be much larger than an int, and may have
all sorts of non-integer-like information embedded, such as a segment
id. If sizeof(int) is less than 4, the validity of this is even
less likely.

Then we come to the purpose of the statement, which is to discover
if the pointer is suitably aligned for an int. It does this by
bit-anding with SW-1, which is the concealed sizeof(int)-1. This
won't be very useful if sizeof(int) is, say, 3 or any other
non-power-of-two. In addition, it assumes that an aligned pointer
will have those bits zero. While this last is very likely in
today's systems, it is still an assumption. The system designer is
entitled to assume this, but user code is not.
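
For what it is worth, on a C99 implementation that provides it,
uintptr_t is a less fragile way to peek at the low bits of a pointer
than int (a sketch of mine; the power-of-two and low-bits-zero
assumptions still remain):

   #include <stdint.h>   /* uintptr_t is optional, even in C99 */

   #define IS_INT_ALIGNED(p) \
      ((((uintptr_t)(const void *)(p)) & (sizeof (int) - 1)) == 0)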

Line 13 again uses the unwarranted cast of a pointer to an int.
This enables the use of the already suspicious macro hasNulByte in
line 15.

If all these assumptions are correct, line 19 finally calculates a
pointer difference (which is valid, and of type ptrdiff_t; the value
here is non-negative and will always fit into a size_t). It then does
a concealed conversion of this into an int, which could cause undefined
or implementation-defined behaviour if the value exceeds what will fit
into an int. This one is also unnecessary, since it is trivial to
define the return type as size_t and guarantee success.
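
For comparison, a version with none of these hidden assumptions looks
something like the sketch below: correct signature, size_t result, no
pointer below the object, no casts of pointers to integers. It gives
up the word-at-a-time speed trick, but it relies only on guarantees
the standard actually makes.

   #include <stddef.h>

   size_t ystrlen (const char *s) {
      const char *p = s;

      while (*p != '\0')
         p++;
      return (size_t) (p - s);   /* the difference is non-negative here */
   }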

I haven't even mentioned the assumption of 2's complement
arithmetic, which I believe to be embedded in the hasNulByte
macro. I haven't bothered to think this out.

Would you believe that so many hidden assumptions can be embedded
in such innocent-looking code? The sneaky thing is that the code
appears trivially correct at first glance. This is the stuff that
Heisenbugs are made of. Yet use of such code is fairly safe if we
are aware of those hidden assumptions.

I have cross-posted this without setting follow-ups, because I
believe that discussion will be valid in all the newsgroups posted.

[1] The draft C standards can be found at:
<http://www.open-std.org/jtc1/sc22/wg14/www/docs/>

--
"If you want to post a followup via groups.google.c om, don't use
the broken "Reply" link at the bottom of the article. Click on
"show options" at the top of the article, then click on the
"Reply" at the bottom of the article headers." - Keith Thompson
More details at: <http://cfaj.freeshell. org/google/>
Also see <http://www.safalra.com/special/googlegroupsrep ly/>

Mar 6 '06
Chris Torek <no****@torek.net> writes:
And now the x86-64 is coming, and everything old will be new again.

Keith Thompson <ks***@mib.org> writes:
As far as I can tell, the x86-64 uses (or at least is capable of
using) a flat 64-bit address space.


In article <ln************@nuthaus.mib.org>
Keith Thompson <ks***@mib.org> wrote:
The piece I missed is that an x86-64 system can run 32-bit code. If I
compile and run a program on an x86-64 system, it uses 64-bit
pointers. If I compile a program on an x86-32 system and copy the
executable to an x86-64 system, it runs properly and uses 32-bit
pointers. (At least on the systems I have access to.)


Yes. I am not saying that x86-64 has re-created the old 80x86
segmentation model. No, this is merely the thin end of the wedge.
Segmentation will come back, sooner or later. :-)


Why does it need to?

If we restrict the discussion to hosted environments, the trend seems
to be toward 64-bit systems. That provides an address space that
should be big enough for at least several decades, even assuming
exponential growth in memory sizes. A flat 64-bit virtual address
space should be the simplest way to manage this, and the need to run
32-bit code should diminish over time.

Segmentation done right could be useful for bounds checking; assigning
a segment to each malloc()ed chunk of memory, and to each declared
object, could nearly eliminate buffer overruns. But it hasn't really
been done yet, and I'm not convinced it will be in the future.

Why do you think segmentation will come back?

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Mar 26 '06 #331
In article <ln************@nuthaus.mib.org>
Keith Thompson <ks***@mib.org> wrote:
Segmentation done right could be useful for bounds checking; assigning
a segment to each malloc()ed chunk of memory, and to each declared
object, could nearly eliminate buffer overruns. But it hasn't really
been done yet, and I'm not convinced it will be in the future.
Something like this *is* done on the AS/400.
Why do you think segmentation will come back?


When done right, it works quite well (see the AS/400) and allows
single-level store (with "capability" protections). This is a very
functional and fast model (and it is "multiprocessor-friendly" and
has other good properties).

(Right now, one big penalty for context switches in general is that
you lose cached data: TLBs, and RAM-cache in virtual cache systems.
This is partly patched-up, in some architectures at least, by
tagging TLB entries with "address space identifiers" and doing
flushes only when running out of ASIDs, but this is a kludge.)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Mar 26 '06 #332
On 2006-03-25, Andrew Reilly <an*************@areilly.bpc-users.org> wrote:
This rule essentially means that *p-- is an invalid access mechanism,
unless peculiar care is taken to exit loops early, while *p++ is valid,
*only* because they made a particular exception for that particular case,
because they figured that C compilers on AS/400 systems could afford to
over-allocate all arrays by one byte, so that that last p++ would not
leave the pointer pointing to an "invalid" location. That's a hack, plain
and simple.


Having written a lot of low-level stuff in years gone by in assembler,
C and C++, I have to agree with you. For *p-- to be invalid when we are
looking at possible home-brew memory allocations and cleverly aligned
objects, while allowing an out-of-range *p++, is a tad
inconsistent. Having said that, I don't think I ever had any such
trap/breakdown, so maybe I was lucky or too careful.
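
For what it's worth, the usual way around it is to keep the pointer one
past the element still to be examined, so nothing below the first
element is ever formed. A quick sketch (names mine):

   #include <stddef.h>

   void clear_backwards(char *a, size_t n)
   {
      char *p = a + n;     /* one past the end: explicitly allowed */

      while (p != a) {
         --p;              /* now points at a valid element */
         *p = 0;
      }
   }

Same downward walk, but it stays inside what the standard guarantees.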
Mar 27 '06 #333
In article <sl**********************@random.yi.org> Jordan Abel <ra*******@gmail.com> writes:
On 2006-03-26, Stephen Sprunk <st*****@sprunk.org> wrote:
It simply doesn't make sense to do things that way since the only
purpose is to allow violations of the processor's memory protection
model. Work with the model, not against it.


Because it's a stupid memory protection model.

Why can't the trap be caught and ignored?


It can be ignored. But the result is that the operation is a no-op. Again
consider:
char a[10];
char *p;
p = a - 1;
p = p + 1;
what is the value of p after the fourth statement if the trap in the third
statement is ignored?
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Mar 27 '06 #334
On Mon, 27 Mar 2006 03:07:28 +0000, Dik T. Winter wrote:
In article <sl**********************@random.yi.org> Jordan Abel <ra*******@gmail.com> writes:
> On 2006-03-26, Stephen Sprunk <st*****@sprunk.org> wrote:
> > It simply doesn't make sense to do things that way since the only
> > purpose is to allow violations of the processor's memory protection
> > model. Work with the model, not against it.

>
> Because it's a stupid memory protection model.
>
> Why can't the trap be caught and ignored?


It can be ignored. But the result is that the operation is a no-op. Again
consider:
char a[10];
char *p;
p = a - 1;
p = p + 1;
what is the value of p after the fourth statement if the trap in the third
statement is ignored?


The trap isn't ignored. There is no trap: the platform's "sane C memory
model" compiler and run-time system updated p.array_index to -1 and
p.array_base to a.array_base at the third line, as expected. The trap
would be left enabled, so that it would actually hit if/when a real
pointer was formed from &p.array_base[p.C_pointer_index] if/when *p was
ever referenced in the subsequent code.

Consequently, the above code leaves p == a, as expected, and no trap is
encountered. Neat, huh?
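
In case the scheme isn't clear, here is a rough C rendering of what
such a compiler/run-time pair would be doing under the hood (all names
invented for the sketch): arithmetic only touches the index, and the
check fires only on an actual access.

   #include <assert.h>
   #include <stddef.h>

   struct fat_ptr {
      char *base;        /* start of the underlying object    */
      size_t size;       /* size of that object in bytes      */
      ptrdiff_t index;   /* may wander negative or past size  */
   };

   struct fat_ptr fp_add(struct fat_ptr p, ptrdiff_t d)
   {
      p.index += d;      /* never traps: just index arithmetic */
      return p;
   }

   char fp_deref(struct fat_ptr p)
   {
      /* the "trap" happens here, only when *p is evaluated */
      assert(p.index >= 0 && (size_t) p.index < p.size);
      return p.base[p.index];
   }

With that model, p = a - 1 followed by p = p + 1 leaves the index back
at 0, and a later dereference succeeds, exactly as described above.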

:-)

--
Andrew

Mar 27 '06 #335
On Wed, 15 Mar 2006 11:48:52 GMT, "Dik T. Winter" <Di********@cwi.nl>
wrote:
In article <11**********************@j52g2000cwj.googlegroups.com> "Ed Prochak" <ed*******@gmail.com> writes:
> Dik T. Winter wrote:

...
> > Indeed. But even when we look at the published instructions C falls
> > short of providing a construct for every one. Where is the C construct
> > to do a multiply step available in quite a few early RISC machines?
> > Note also that in assembler you can access the special bits indicating
> > overflow and whatever (if they are available on the machine). How to
> > do that in C?

>
> Well you cannot, but those processors did not even exist when C was
> created. So those features didn't make it. To some degree, C is more of
> a PDP assembler.


How do you get access to the condition bits?


On some PDP-11 models (only), the PSW is also addressable as memory,
somewhere in the vicinity of 0177660; I don't recall exactly.

Admittedly, even among the CPU design(er)s that do use condition codes
I know of no others that provided this option for accessing them.

(I'm not counting cases where an interrupt or trap, and sometimes at
least some calls, saves state including the CC on the stack or in
memory. That's much more common.)

- David.Thompson1 at worldnet.att.ne t
Mar 27 '06 #336
Andrew Reilly <an*************@areilly.bpc-users.org> wrote:
On Fri, 24 Mar 2006 08:20:12 +0000, David Holland wrote:
Because p -= 2, when performed on the pointer 1234:4, tries to deduct
8 from the offset field. This underflows and traps.


And this is the behaviour that is at odds with idiomatic C.


_Whose_ idiom? No programmer I'd respect writes such code intentionally.

Richard
Mar 27 '06 #337
On Mon, 27 Mar 2006 04:07:44 GMT, Dave Thompson
<da*************@worldnet.att.net> wrote:
On Wed, 15 Mar 2006 11:48:52 GMT, "Dik T. Winter" <Di********@cwi.nl>
wrote:
In article <11**********************@j52g2000cwj.googlegroups.com> "Ed Prochak" <ed*******@gmail.com> writes:
> Dik T. Winter wrote: ...
> > Indeed. But even when we look at the published instructions C falls
> > short of providing a construct for every one. Where is the C construct
> > to do a multiply step available in quite a few early RISC machines?
> > Note also that in assembler you can access the special bits indicating
> > overflow and whatever (if they are available on the machine). How to
> > do that in C?
>
> Well you cannot, but those processors did not even exist when C was
> created. So those features didn't make it. To some degree, C is more of
> a PDP assembler.


How do you get access to the condition bits?


If you are just interested in zero, negative, and signed or unsigned
overflow conditions, you do not need to read these bits directly. Using
conditional branches, you can determine which bits are set. In any
sensible architecture the conditional branch instructions do not alter
these bits, so by combining conditional branches, multiple bits (such
as C and V) can be obtained. The state of the negative and zero bits can
easily be determined in the C language; however, getting carry, signed
overflow, half-carry, etc. is very problematic.
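
For example, the carry and signed-overflow conditions can be recomputed
in portable C, though not read as flags (a sketch; names are mine):

   #include <limits.h>

   /* carry out of an unsigned add: wraparound means the sum is smaller */
   int add_carries(unsigned a, unsigned b)
   {
      return a + b < a;
   }

   /* signed overflow (the V flag), tested before doing the add */
   int add_overflows(int a, int b)
   {
      return (b > 0) ? (a > INT_MAX - b) : (a < INT_MIN - b);
   }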
On some PDP-11 models (only), the PSW is also addressable as memory,
somewhere in the vicinity of 0177660; I don't recall exactly.


The PSW and all the general purpose registers (and hence you could get
the address of a register :-) are available in the 8 KiB I/O page
which is in the top of the physical memory starting at different
physical addresses in systems with 16, 18 or 22 physical address bits.

For 18 and 22 address bit systems, the I/O address had to be mapped to
the 64 KiB program address space, which on most operating systems
required special privileges and consumed 8 KiB of your precious 64 KiB
program address space. If you only needed the PSW, direct mapping of
the I/O page would usually be avoided by using the trapping mechanism.
I used it mostly to alter the trace bit, but of course, you could also
get the N, Z, C and V bits at once.

Paul

Mar 27 '06 #338
In article <pa*******************************@areilly.bpc-users.org> Andrew Reilly <an*************@areilly.bpc-users.org> writes:
On Mon, 27 Mar 2006 03:07:28 +0000, Dik T. Winter wrote:
In article <sl**********************@random.yi.org> Jordan Abel <ra*******@gmail.com> writes: ...
> Why can't the trap be caught and ignored?
It can be ignored. But the result is that the operation is a no-op. Again
consider:
char a[10];
char *p;
p = a - 1;
p = p + 1;
what is the value of p after the fourth statement if the trap in the third
statement is ignored?


The trap isn't ignored.


Eh? Jordan Abel asked why the trap can not be ignored.
The trap isn't ignored. There is no trap: the platform's "sane C memory
model" compiler and run-time system updated p.array_index to -1 and
p.array_base to a.array_base at the third line, as expected. The trap
would be left enabled, so that it would actually hit if/when a real
pointer was formed from &p.array_base[p.C_pointer_index] if/when *p was
ever referenced in the subsequent code.

Consequently, the above code leaves p == a, as expected, and no trap is
encountered. Neat, huh?


How many instructions will it take in that case to dereference p?
--
dik t. winter, cwi, kruislaan 413, 1098 sj amsterdam, nederland, +31205924131
home: bovenover 215, 1025 jn amsterdam, nederland; http://www.cwi.nl/~dik/
Mar 27 '06 #339
Andrew Reilly wrote:
On Fri, 24 Mar 2006 12:04:43 -0700, Al Balmer wrote:
On Fri, 24 Mar 2006 10:05:23 +1100, Andrew Reilly
<an*************@areilly.bpc-users.org> wrote:
Why do you want to create an address that you're not allowed to
dereference?

Because the ability to do so is implied by the syntax of pointer
arithmetic.


Heh. The presence of an open balcony on the 15th floor implies the
ability to jump off.


Sure. And it's OK to talk about it, too. No harm, no foul.

Forming a pointer to non-object space is "talking about it". Outlawing
talking about it goes against the grain of C, IMO.


The C standard doesn't /outlaw/ forming illegal pointer values; it
just says that if you do that, it says nothing more about the
behaviour of your code, so if you want defined behaviour, you have
to look elsewhere for the definition.

If you're writing code that has, for whatever reason, to rely on
non-C-standard definitions, well then, rely on them. I've written
code that relies on non-C-standard behaviour, too - but I didn't
expect it to port everywhere, and I didn't expect such use to be
a requirement on future standardisation to support it, much as I
might like to; the leaves-it-undefined /allows/ the code to work
where it works.

--
Chris "x.f(y) == f(x, y) == (x, y).f" Dollin
The shortcuts are all full of people using them.
Mar 27 '06 #340
