What does the standard say about array access wraparound?

If this:

int i,sum;
int *array;
for(sum=0, i=0; i<len; i++){
sum += array[i];
}

is converted to this (never mind why for the moment):

int i,sum;
int *array;
int *arrl;
arrl=&array[-len];
for(sum=0,i=len ; i<2*len; i++){
sum += arrl[i];
}

it should give the same result. But there are some funny
things that can happen. For instance, suppose array is 1000 and
len is 100000. In that case arrl will hold an address
(1000-100000) which presumably wraps around since the
pointer should be an unsigned int (whatever size int is).
The address it points to will be MAX_POINTER - 100000 + 1000.
When the second loop begins, i=len (100000), so
arrl[100000] will wrap back around and point to the same
place as array[0].

Or will it?

It seems possible that this sort of array access "off the top of
memory" could trigger a fault.

What does the C standard say about this (if anything)?

Thanks,

David Mathog
ma****@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
Nov 14 '05
David Mathog <ma****@caltech.edu> writes:
[...]
in ANSI C address 0 (NULL) is special, is address -1 (top of memory)
also special?


A null pointer in C is not necessarily "address 0". It can be
represented as an integer constant 0 in C source, but the actual
address could be anything. See section 5 of the C FAQ.
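
A minimal sketch of that distinction, for illustration only. The helper names `is_null` and `null_from_zero` are made up for this example and are not from the thread or the FAQ. The point is that the test is a comparison of pointer values, so it holds whatever bit pattern the implementation uses for a null pointer.

```c
#include <stddef.h>

/* Hypothetical helper: returns 1 when p is a null pointer.
   The constant 0 is converted to a null pointer before the comparison,
   so nothing here depends on the null pointer being "address 0". */
int is_null(const void *p)
{
    return p == 0;
}

/* Hypothetical helper: builds a null pointer from the integer
   constant 0 in the source text. */
int *null_from_zero(void)
{
    int *p = 0;
    return p;
}
```

Calling `is_null(null_from_zero())` and `is_null(NULL)` both yield 1 on a conforming implementation. What the sketch deliberately does not do is fill a pointer with zero bytes via `memset` and expect a null pointer, since the standard does not promise that.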

--
Keith Thompson (The_Other_Keith) ks***@mib.org <http://www.ghoti.net/~kst>
San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
We must do something. This is something. Therefore, we must do this.
Nov 14 '05 #21
In <20************ *************** *@caltech.edu> David Mathog <ma****@caltech.edu> writes:
in ANSI C address 0 (NULL) is special,
Read the FAQ! NO address is special in C. The null pointer constant
need not correspond to any address.
is address -1 (top of memory) also special?
NO address is special in C. The result of converting -1 to a pointer
value is implementation-defined.
This must come up on microcontrollers and other similar small
computing devices. (Yes, those are usually programmed in assembler
but there are C compilers for them too.)


So what? Those C compilers provide all the extensions needed to access
all the underlying hardware features. And C code using them is inherently
non-portable. Furthermore, microcontrollers are notorious for having
multiple address spaces (e.g. internal ROM, internal RAM, external ROM,
external RAM).

Dan
--
Dan Pop
DESY Zeuthen, RZ group
Email: Da*****@ifh.de
Nov 14 '05 #22
On 3 Jun 2004 13:24:35 GMT
Da*****@cern.ch (Dan Pop) wrote:
In <20************ *************** *@caltech.edu> David Mathog <ma****@caltech.edu> writes:
in ANSI C address 0 (NULL) is special,


Read the FAQ! NO address is special in C. The null pointer constant
need not correspond to any address.
is address -1 (top of memory) also special?


NO address is special in C. The result of converting -1 to a pointer
value is implementation-defined.


So the standard says the following:

1. Pointer access to a memory block is valid when that pointer lies within that memory block or one address above it (but not one address
below it).

2. There is nothing special about either address 0 (typically used
as the NULL pointer, but not necessarily so) or the top of memory, or
any other memory location.

and real machines have the property:

3. Memory is finite.

So what exactly in the ANSI standard (as opposed to each compiler's implementation of it) guarantees that the following
code will work?

#define DTYPE int
#define ASIZE 100
DTYPE *pa;
DTYPE *pp;
DTYPE *plim;
pa=malloc(sizeof(DTYPE)*ASIZE);
if(pa){
plim = &pa[ASIZE];
for(pp=pa; pp<plim; pp++){ /* some operation on *pp */}
}
else {
(void) fprintf(stderr, "Oops, malloc failed, exiting now...\n");
exit(EXIT_FAILURE);
}

If malloc returns pa such that pa[ASIZE-1] is the last int at the
top of memory then the expression &pa[ASIZE] is going to resolve
to something peculiar (probably 0 in most implementations) and
the test

pp < plim

will fail on every iteration.

In other words, I don't see how the C standard reconciles statement 1
(a pointer value to a memory location one unit above the allocated
block is ok) and statement 2 (there are no special memory locations) with statement 3 (memory is finite).

In a particular implementation I can see that this problem can be avoided by, for instance, not letting malloc or the compiler
allocate a block of memory which ends exactly at the top of memory, or by using memory pointers with more range than exists in physical memory.

The example above uses DTYPE just to indicate that this isn't
a problem for a particular data type, it could also occur
for huge structures or single characters. Unless something
else prevents it, ASIZE could always be adjusted upwards until
pa[ASIZE-1] fell at the top of memory and triggered the problem.

And yes, I do see that recoding to this:

/* check the allocated memory location, not one above it*/
plim = &(pa[ASIZE-1]);
for(pp=pa; pp<=plim; pp++){ /* some operation on *pp */}

avoids the test on a possibly whacky pointer value
no matter where pa falls in memory.

Statement 2 seems to not be entirely accurate in any case.
If in some implementation malloc were to return a memory block
which began with an address corresponding to the bit
representation of NULL the program would exit when it
checked the address returned, even though
memory was allocated at that location. Presumably no
extant malloc will return such a block. And that does make
the memory location corresponding to NULL "special"
at least to the extent that it cannot be returned by malloc,
nor released by free().
Regards,

David Mathog
ma****@caltech.edu
Manager, Sequence Analysis Facility, Biology Division, Caltech
Nov 14 '05 #23
In article <20************ *************** *@caltech.edu>
David Mathog <ma****@caltech.edu> writes:
So the standard says the following:

1. Pointer access to a memory block is valid when that pointer lies
within that memory block or one address above it (but not one address
below it).
Correct (although not in these words).
2. There is nothing special about either address 0 (typically used
as the NULL pointer, but not necessarily so) or the top of memory, or
any other memory location.
The C standards (C89/C90, "C95", C99) do not say this, but they do
not say there *is* something special about them, either. They
leave the details up to the implementor.
and real machines have the property:

3. Memory is finite.
Yes. The Standards' concerns with real machines are somewhat
tangential, though.
So what exactly in the ANSI standard (as opposed to each compiler's
implementation of it) guarantees that the following
code will work?

#define DTYPE int
#define ASIZE 100
DTYPE *pa;
DTYPE *pp;
DTYPE *plim;
pa=malloc(sizeof(DTYPE)*ASIZE);
if(pa){
plim = &pa[ASIZE];
for(pp=pa; pp<plim; pp++){ /* some operation on *pp */}
}
else {
(void) fprintf(stderr, "Oops, malloc failed, exiting now...\n");
exit(EXIT_FAILURE);
}
The wording in the standard.

Which wording? Well, you have to put a number of pieces together,
such as this key section on relational operators:

[#5] When two pointers are compared, the result depends on
the relative locations in the address space of the objects
pointed to. ... If the
expression P points to an element of an array object and the
expression Q points to the last element of the same array
object, the pointer expression Q+1 compares greater than P.
If malloc returns pa such that pa[ASIZE-1] is the last int at the
top of memory then the expression &pa[ASIZE] is going to resolve
to something peculiar (probably 0 in most implementations) and
the test

pp < plim

will fail on every iteration.
Yes, if malloc() returned such a "pa" and the machine worked in the
way you describe here, then "pp < plim" would fail. This would
contradict paragraph 5, rendering the implementation non-conforming.
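
As a hedged illustration of what paragraph 5 licenses, here is the same loop shape written as a small function. The name `sum_until_end` and the choice of summing as the "operation" are assumptions made for this sketch. The one-past-the-end pointer is computed and compared, but never dereferenced.

```c
#include <stddef.h>

/* Sums the n ints starting at pa.
   plim is the one-past-the-end pointer: a conforming implementation
   must make it compare greater than every pointer into the array. */
long sum_until_end(const int *pa, size_t n)
{
    const int *plim = pa + n;   /* same value as &pa[n] */
    const int *pp;
    long sum = 0;

    for (pp = pa; pp < plim; pp++) {
        sum += *pp;             /* pp is always strictly below plim here */
    }
    return sum;
}
```

For a four-element array holding 1, 2, 3, 4 the function returns 10, and for n equal to 0 the loop body never runs because `pa < pa` is false.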
In a particular implementation I can see that this problem can be
avoided by, for instance, not letting malloc or the compiler
allocate a block of memory which ends exactly at the top of memory, or
by using memory pointers with more range than exists in physical memory.
Those are two methods by which the implementation can correct the
problem and become conforming.
The example above uses DTYPE just to indicate that this isn't
a problem for a particular data type, it could also occur
for huge structures or single characters. Unless something
else prevents it, ASIZE could always be adjusted upwards until
pa[ASIZE-1] fell at the top of memory and triggered the problem.
The implementor can make use of implementation-specific tricks.
For instance, suppose that the absolute maximum alignment required
for any C code is 8 bytes (and the machine is a conventional 8-bit
byte-addressed one). Then malloc() need only avoid handing out
"last 8" bytes of the total address space.
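
A rough sketch of that kind of check, modelled with plain integers rather than real pointers. Everything here is hypothetical: the function name `leaves_room_at_top`, the idea that addresses are flat numbers, and the parameters are assumptions for illustration, not how any particular malloc() is written.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: addresses are numbers in the range 0..last.
   Returns 1 if a block of `size` bytes starting at `start` leaves at
   least `reserve` unused bytes below the top of the address space,
   so that the one-past-the-end address of the block does not wrap. */
int leaves_room_at_top(uintptr_t start, size_t size,
                       uintptr_t last, size_t reserve)
{
    uintptr_t span;

    if (reserve == 0 || start > last) {
        return 0;               /* at least one spare byte is the whole point */
    }
    span = last - start;        /* distance from start to the last address */
    if (size > span) {
        return 0;               /* block would run up to or past `last` */
    }
    /* bytes left after the block: span - size + 1, compared without overflow */
    return span - size >= reserve - 1;
}
```

With a 16-bit style space (`last` equal to 0xFFFF), a block of 8 bytes at 0xFFF0 passes with a reserve of 8, while a block of 16 bytes at the same start is rejected because it would end exactly at the top.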
If in some implementation malloc were to return a memory block
which began with an address corresponding to the bit
representation of NULL ...


In this case, the implementation might fail to conform -- although
actually *deciding* this is another matter entirely, since the
observable behavior is the same as "malloc() was unable to get
memory". In other words, if malloc returns a value that compares
equal to NULL, malloc() has failed to obtain memory, even if the
implementor incorrectly thinks it has succeeded -- but malloc() is
*always* allowed to fail, so the implementor has simply produced
a poor implementation, rather than a non-conforming one.

In other words, if malloc() returns a pointer that compares equal
to NULL even though there is memory available and the memory has
been allocated, then the implementor has goofed. The malloc()
function has a bug. The bug does not make the implementation
non-conforming; it just reflects badly on the implementor. :-)

This is typical of a standard, or indeed any work that attempts to
describe "desired outcome" instead of "mechanism". One does not
prescribe how malloc() is supposed to work, or the bit patterns
for various null-pointers; instead, one says "when malloc() succeeds,
the returned value compares unequal to NULL" or "if P points into
an array, and P+1 is `one-past-the-end', then computing P+1 is OK
and (P+1)>P produces the value 1" -- without saying *how* these
are to be achieved, so that implementors are free to come up with
new, wonderful ways of achieving them.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.
Nov 14 '05 #24
David Mathog wrote: [long lines wrapped for legibility]
On 3 Jun 2004 13:24:35 GMT
Da*****@cern.ch (Dan Pop) wrote:

In <20************ *************** *@caltech.edu> David Mathog <ma****@caltech.edu> writes:

in ANSI C address 0 (NULL) is special,
Read the FAQ! NO address is special in C. The null pointer constant
need not correspond to any address.

is address -1 (top of memory) also special?


NO address is special in C. The result of converting -1 to a pointer
value is implementation-defined.

So the standard says the following:

1. Pointer access to a memory block is valid when that pointer lies
within that memory block or one address above it (but not one address
below it).


Depends what you mean by "pointer access to a memory block." It's
legal to compute a pointer value designating any element of an array
(considering a free-standing object to be an array of one element),
and it's legal to use such a value to access the array element. It's
also legal to compute `&array[N]' in an N-element array, and it's legal
to use this value in comparisons and for further arithmetic, but it's
*not* legal to use this value to access that non-existent array element.
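
Applied to the shifted-base loop from the start of the thread: `&array[-len]` is neither an element of the array nor the one-past-the-end position, so computing it falls outside the rules above. One possible alternative, sketched here under the assumption that the goal is only to keep the loop counter running from len to 2*len-1, is to leave `array` as the base and shift the index instead. The function name `sum_shifted_index` is invented for this example.

```c
#include <stddef.h>

/* Keeps the loop bounds of the shifted version (i runs from len up to
   2*len - 1) but never forms a pointer before the start of the array.
   Assumes 2 * len does not overflow size_t. */
long sum_shifted_index(const int *array, size_t len)
{
    size_t i;
    long sum = 0;

    for (i = len; i < 2 * len; i++) {
        sum += array[i - len];  /* index stays within 0..len-1 */
    }
    return sum;
}
```

The subtraction happens on the integer index, where it is well defined, rather than on the pointer, where it is not.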
2. There is nothing special about either address 0 (typically used
as the NULL pointer, but not necessarily so) or the top of memory, or
any other memory location.
True. There is not even a requirement that memory addresses be
numbers.
and real machines have the property:

3. Memory is finite.
Also, a pointer value has a finite number of bits: Even if you
had infinite memory, a C program could use only a finite amount of it.
So what exactly in the ANSI standard (as opposed to each compiler's
implementation of it) guarantees that the following
code will work?

#define DTYPE int
#define ASIZE 100
DTYPE *pa;
DTYPE *pp;
DTYPE *plim;
pa=malloc(sizeof(DTYPE)*ASIZE);
if(pa){
plim = &pa[ASIZE];
for(pp=pa; pp<plim; pp++){ /* some operation on *pp */}
}
else {
(void) fprintf(stderr, "Oops, malloc failed, exiting now...\n");
exit(EXIT_FAILURE);
}

If malloc returns pa such that pa[ASIZE-1] is the last int at the
top of memory then the expression &pa[ASIZE] is going to resolve
to something peculiar (probably 0 in most implementations) and
the test

pp < plim

will fail on every iteration.
The implementation must make this work "somehow." The Standard
doesn't specify the "how," but it requires the "what."
In other words, I don't see how the C standard reconciles statement 1
(a pointer value to a memory location one unit above the allocated
block is ok) and statement 2 (there are no special memory locations)
with statement 3 (memory is finite).

In a particular implementation I can see that this problem can be
avoided by, for instance, not letting malloc or the compiler
allocate a block of memory which ends exactly at the top of memory,
or by using memory pointers with more range than exists in physical
memory.
The first of these stratagems is commonly used. In the second
I think you probably mean "virtual" instead of "physical;" I haven't
encountered an implementation that works this way, but such a thing
could certainly be done.
The example above uses DTYPE just to indicate that this isn't
a problem for a particular data type, it could also occur
for huge structures or single characters. Unless something
else prevents it, ASIZE could always be adjusted upwards until
pa[ASIZE-1] fell at the top of memory and triggered the problem.
Doesn't matter. One unallocated byte suffices for the first
stratagem, and one unused pointer-value bit is enough for the
second. Remember, the "one past the end" pointer does not point
to an actual DTYPE object; there need not be sizeof(DTYPE) bytes
at that spot. All that's required is that the first byte of the
non-existent element be "addressable;" there's no need for any
additional bytes' addresses to make any sense.
And yes, I do see that recoding to this:

/* check the allocated memory location, not one above it*/
plim = &(pa[ASIZE-1]);
for(pp=pa; pp<=plim; pp++){ /* some operation on *pp */}

avoids the test on a possibly whacky pointer value
no matter where pa falls in memory.
`plim' would not be "whacky," but `pp' becomes so on the
final iteration.
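
To spell that out: after the last pass the `pp++` still moves `pp` to the one-past-the-end position, so the inclusive bound gains nothing, and with a zero-length block `&pa[ASIZE-1]` would point before the array, which is not allowed. If someone really wants a loop that forms no end pointer at all (the Standard does not ask for this), an index-based loop is one option. A minimal sketch, where `count_nonzero` is a made-up stand-in for "some operation on *pp":

```c
#include <stddef.h>

/* Counts the non-zero elements among the n ints at pa.
   Only pa[0] .. pa[n-1] are ever addressed, no end pointer is
   computed, and n == 0 simply skips the loop. */
size_t count_nonzero(const int *pa, size_t n)
{
    size_t i;
    size_t count = 0;

    for (i = 0; i < n; i++) {
        if (pa[i] != 0) {
            count++;
        }
    }
    return count;
}
```

For the values 0, 3, 0, 5 this returns 2. The original `pp < plim` loop is equally valid; this form is only a matter of taste.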
Statement 2 seems to not be entirely accurate in any case.
If in some implementation malloc were to return a memory block
which began with an address corresponding to the bit
representation of NULL the program would exit when it
checked the address returned, even though
memory was allocated at that location. Presumably no
extant malloc will return such a block. And that does make
the memory location corresponding to NULL "special"
at least to the extent that it cannot be returned by malloc,
nor released by free().


The Standard does not require the existence of a "memory
location corresponding to NULL." It's true that on many machines
the representation of a null pointer would "work" as an address
if somehow fed into a load or store or other machine instruction.
But C does not require this, and (sez the FAQ) there have been
machines that implemented NULL values differently.

On "practical" machines, where NULL is "address zero" and an
"end of memory" exists, it usually turns out that keeping these
locations off-limits to C programs is no hardship. For instance,
some systems put a stack at the top of memory and let it grow
downward; if they can guarantee that the first thing pushed on
the stack is not a data object -- a return address, say -- then
there's no way a program can get a data object to butt against
the end of memory. The addresses starting at zero and working
upwards might be used for environment variables, or for data
exchange with the host system -- or simply made inaccessible
altogether, as a debugging aid. The upshot is that the program
and its data fit "between" the extremes of the hypothetical range
of addresses without coming "too close" to either end.

... and, of course, the Standard permits any other shenanigans
the implementation chooses to indulge in, provided the pointer
calculations produce the results they're supposed to.

--
Er*********@sun.com

Nov 14 '05 #25
