ANSI C question about 'volatile'

Tim Rentsch

I have a question about what ANSI C allows/requires in a particular
context related to 'volatile'. Consider the following:

volatile int x;

int
x_remainder_arg( int y ){
    return x % y;
}

Suppose we had a machine that doesn't have a remainder
instruction and the compiler implements remainder using
the re-writing

a % b :=: a - a/b*b

Let's assume for the sake of discussion that the identity
is ok in the value sense - so if 'a' and 'b' are values,
the expression on the right hand side yields the correct
value. My question is how does 'x' being 'volatile' affect
things? For example, if 'x_remainder_arg' were compiled
as though it were written like this (and no optimization):

int
x_remainder_arg_1( int y ){
    int t = x;
    return t - t/y*y;
}

that seems alright. But what if 'x_remainder_arg' were
compiled as though it were written like this:

int
x_remainder_arg_2( int y ){
    return x - x/y*y;
}

Note that 'x', which is 'volatile', is referenced only
once in the original function, but has two references
in the compiled code.

Now for my questions:

1) Does the ANSI C standard _permit_ an interpretation
like 'x_remainder_arg_2' where two references to 'x'
are made in the compiled code when only one reference
to 'x' is made in the source? If so, why?

2) Does the ANSI C standard _forbid_ an interpretation
like 'x_remainder_arg_2' where two references to 'x'
are made in the compiled code when only one reference
to 'x' is made in the source? If so, why?
Please note that I am not asking what the compiler
ought to do, or what the standard ought to permit
or to forbid. I'm asking only what the standard
does permit or does forbid.

If different versions of the ANSI standard say different
things on these questions I'd like to know that too.
Please identify which version is being referenced if
that's relevant.

thanks!

Nov 14 '05 #1

Jack Klein

On 12 Aug 2004 19:55:43 -0700, Tim Rentsch <tx*@alumnus.caltech.edu>
wrote in comp.lang.c:

I have a question about what ANSI C allows/requires in a particular
context related to 'volatile'. Consider the following:

volatile int x;

int
x_remainder_arg( int y ){
return x % y;
}

Suppose we had a machine that doesn't have a remainder
instruction and the compiler implements remainder using
the re-writing

a % b :=: a - a/b*b

Let's assume for the sake of discussion that the identity
is ok in the value sense - so if 'a' and 'b' are values,
the expression on the right hand side yields the correct
value. My question is how does 'x' being 'volatile' affect
things? For example, if 'x_remainder_arg' were compiled
as though it were written like this (and no optimization):

int
x_remainder_arg_1( int y ){
int t = x;
return t - t/y*y;
}

that seems alright. But what if 'x_remainder_arg' were
compiled as though it were written like this:

int
x_remainder_arg_2( int y ){
return x - x/y*y;
}

Note that 'x', which is 'volatile', is referenced only
once in the original function, but has two references
in the compiled code.

Now for my questions:

1) Does the ANSI C standard _permit_ an interpretation
like 'x_remainder_arg_2' where two references to 'x'
are made in the compiled code when only one reference
to 'x' is made in the source? If so, why?

No.

2) Does the ANSI C standard _forbid_ an interpretation
like 'x_remainder_arg_2' where two references to 'x'
are made in the compiled code when only one reference
to 'x' is made in the source? If so, why?

Yes.

Please note that I am not asking what the compiler
ought to do, or what the standard ought to permit
or to forbid. I'm asking only what the standard
does permit or does forbid.

If different versions of the ANSI standard say different
things on these questions I'd like to know that too.
Please identify which version is being referenced if
that's relevant.

thanks!

Your "If so, why?" questions have to do with the rationale, which is
not part of the standard. Unless you are happy with the answer
"because the standard says so". All versions of the standard.

On the other hand, if you want references to portions of the standard,
here are some from the current (1999) version:

Definition of the volatile type qualifier, 6.7.3 paragraph 6:

"An object that has volatile-qualified type may be modified in ways
unknown to the implementation or have other unknown side effects.
Therefore any expression referring to such an object shall be
evaluated strictly according to the rules of the abstract machine,
as described in 5.1.2.3."

And 5.1.2.3 paragraph 3:

"In the abstract machine, all expressions are evaluated as specified
by the semantics."

Since the semantics of the source code specify that the function
accesses the object exactly once, a conforming implementation may not
access a volatile object more or less than once.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Nov 14 '05 #2

Tim Rentsch

Jack Klein <ja*******@spamcop.net> writes:

Your "If so, why?" questions have to do with the rationale, which is
not part of the standard. Unless you are happy with the answer
"because the standard says so". All versions of the standard.

Sorry I was unclear on this. What I meant was "what is your
reasoning?" (which you addressed following...)

On the other hand, if you want references to portions of the standard,
here are some from the current (1999) version:

Definition of the volatile type qualifier, 6.7.3 paragraph 6:

"An object that has volatile-qualified type may be modified in ways
unknown to the implementation or have other unknown side effects.
Therefore any expression referring to such an object shall be
evaluated strictly according to the rules of the abstract machine,
as described in 5.1.2.3."

And 5.1.2.3 paragraph 3:

"In the abstract machine, all expressions are evaluated as specified
by the semantics."

I saw these paragraphs (or something very much like them). What
I did not see was a clear statement about what the semantics
implied in the case of volatile. I looked! Maybe I overlooked
something, but I didn't see any clear statement about what the
semantics specify (do you have a section reference?). Hence
my questions.

Since the semantics of the source code specify that the function
accesses the object exactly once, a conforming implementation may not
access a volatile object more or less than once.

This piece is what I'm missing - where is the statement of "what
the semantics specify"?

thanks!

Nov 14 '05 #3

Chris Torek

In article <ga********************************@4ax.com>
Jack Klein <ja*******@spamcop.net> writes:

Since the semantics of the source code specify that the function
accesses the object exactly once, a conforming implementation may not
access a volatile object more or less than once.

On the other hand, the same C Standard says:

... What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

which leaves the implementor a truck-sized loophole: he can simply
define away all but one of the actual memory references, leaving
only one of them as an "access".

I would encourage people not to buy such an implementation, though. :-)
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #4

CBFalconer

Tim Rentsch wrote:

Jack Klein <ja*******@spamcop.net> writes:
.... snip ...

Definition of the volatile type qualifier, 6.7.3 paragraph 6:
.... snip ...
as described in 5.1.2.3."

And 5.1.2.3 paragraph 3:

.... snip ...
I saw these paragraphs (or something very much like them). What
I did not see was a clear statement about what the semantics
implied in the case of volatile. I looked! Maybe I overlooked
something, but I didn't see any clear statement about what the
semantics specify (do you have a section reference?). Hence
my questions.

I think you need to take Richard Heathfield's course on "Reading
for Comprehension".

--
Chuck F (cb********@yahoo.com) (cb********@worldnet.att.net)
Available for consulting/temporary embedded and systems.
<http://cbfalconer.home.att.net> USE worldnet address!

Nov 14 '05 #5

Eric Sosman

Chris Torek wrote:

In article <ga********************************@4ax.com>
Jack Klein <ja*******@spamcop.net> writes:
Since the semantics of the source code specify that the function
accesses the object exactly once, a conforming implementation may not
access a volatile object more or less than once.

On the other hand, the same C Standard says:

... What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

which leaves the implementor a truck-sized loophole: he can simply
define away all but one of the actual memory references, leaving
only one of them as an "access".

I would encourage people not to buy such an implementation, though. :-)

Hard to avoid such gaps, I think.

struct s {
    double trouble;
    char coal[2048];
    long long way_to_tipperary;
};
volatile struct s s1;
struct s s2;
...
s2 = s1;

Most implementations, I think, would be forced to define the
single C-level "access" to `s1' in terms of multiple hardware-
level "accesses."

--
Er*********@sun.com

Nov 14 '05 #6

j0mbolar

Eric Sosman <Er*********@sun.com> wrote in message news:<41**************@sun.com>...

Chris Torek wrote:
In article <ga********************************@4ax.com>
Jack Klein <ja*******@spamcop.net> writes:
Since the semantics of the source code specify that the function
accesses the object exactly once, a conforming implementation may not
access a volatile object more or less than once.

On the other hand, the same C Standard says:

... What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

which leaves the implementor a truck-sized loophole: he can simply
define away all but one of the actual memory references, leaving
only one of them as an "access".

I would encourage people not to buy such an implementation, though. :-)

Hard to avoid such gaps, I think.

struct s {
double trouble;
char coal[2048];
long long way_to_tipperary;
};
volatile struct s s1;
struct s s2;
...
s2 = s1;

Most implementations, I think, would be forced to define the
single C-level "access" to `s1' in terms of multiple hardware-
level "accesses."

this begs the question, what in the world is multiple hardware-level
accesses? and how does this affect something volatile?

Nov 14 '05 #7

Eric Sosman

j0mbolar wrote:

Eric Sosman <Er*********@sun.com> wrote in message news:<41**************@sun.com>...
Chris Torek wrote:
In article <ga********************************@4ax.com>
Jack Klein <ja*******@spamcop.net> writes:
Since the semantics of the source code specify that the function
accesses the object exactly once, a conforming implementation may not
access a volatile object more or less than once.
On the other hand, the same C Standard says:

... What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

which leaves the implementor a truck-sized loophole: he can simply
define away all but one of the actual memory references, leaving
only one of them as an "access".

I would encourage people not to buy such an implementation, though. :-)

Hard to avoid such gaps, I think.

struct s {
double trouble;
char coal[2048];
long long way_to_tipperary;
};
volatile struct s s1;
struct s s2;
...
s2 = s1;

Most implementations, I think, would be forced to define the
single C-level "access" to `s1' in terms of multiple hardware-
level "accesses."

this begs the question, what in the world is multiple hardware-level
accesses? and how does this affect something volatile?

Implementation-defined, of course. ;-)

Suppose some particular platform implements the above as
2064 byte-to-byte copies. Another might implement it as 516
word-to-word copies, and still another as 258 longword-to-
longword copies. The 2064 or 516 or 258 fetches are the
"hardware-level accesses" to the volatile object. No machine
I'm aware of can do the whole job with just one fetch, so
`volatile' cannot require it to do so.

What are the consequences? Well, the intent of `volatile'
is to inform the compiler that an object's value might change
in ways the compiler cannot discover[*]. So let's suppose that
there's some agency "out there" that occasionally changes the
contents of `s1'. If the change occurs when the copy has
performed only the first 100 of its fetches, what ends up in
`s2' is anyone's guess. `volatile' cannot imply atomicity.

[*] Or to tell it that apparently useless assignments to
the variable are important. For example, in code
like `s1.trouble = 0.0; s1.trouble = 42.0;' the
first assignment must not be optimized away.

--
Er*********@sun.com

Nov 14 '05 #8

Chris Torek

>> Chris Torek wrote:

> ... the same C Standard says:
> ... What constitutes an access to an object that has
> volatile-qualified type is implementation-defined.
> which leaves the implementor a truck-sized loophole: he can simply
> define away all but one of the actual memory references, leaving
> only one of them as an "access".

Eric Sosman <Er*********@sun.com> wrote in message
news:<41**************@sun.com>...
Hard to avoid such gaps, I think.
[example snipped, but it includes a large structure assignment]

Most implementations, I think, would be forced to define the
single C-level "access" to `s1' in terms of multiple hardware-
level "accesses."

Indeed.

In article <news:2d************************@posting.google.com>
j0mbolar <j0******@engineer.com> asked:

this begs the question, what in the world is multiple hardware-level
accesses? and how does this affect something volatile?

The *intent* of the C Standard is clear: the hardware has some
set(s) of instruction(s) that perform hardware-level access, and
there is some mapping from "hardware access" to "C code". That
mapping is allowed to be optimized as much as possible *except*
in the presence of "volatile" qualifiers, where the mapping should
be as direct as possible.

Suppose we have a conventional load/store architecture, for instance,
in which there are only "two kinds" of "hardware access": the "load"
and the "store". In assembly these are achieved via "ld" and "st"
instructions. Only one "bus width" is supported (the 32-bit-word),
so that:

ld r1,(r2)
st r1,(r3)

"means": "do a 32-bit bus access to the address given by r2, putting
the value retrieved into r1; then do a 32-bit bus access to the
address given by r3, storing the value now in r1".

Hardware *devices* may then respond in particular (and peculiar)
ways to these two hardware-level bus transactions.

Since the C compiler for this particular machine has 32-bit "int"s,
we can do the same in C with:

int r1;
volatile int *r2, *r3;
r1 = *r2;
*r3 = r1;

and "expect" the C compiler to generate the "obvious" code (although
the register numbers might change in the process). The C Standard
gives us (C programmers) "volatile" to do it, but does not promise
us that the compiler will accede to our wishes; it is up to us to
obtain a C compiler that actually does so.

What happens, though, if we have a 16-bit or 8-bit hardware device
and have to connect it to this machine? The *machine* is PHYSICALLY
INCAPABLE of doing anything other than a 32-bit-wide access. How
can we take an AMD "Lance" Ethernet device, with its two 16-bit
registers, and make it work with this (MIPS-R2000-like) CPU?

The answer in this case was to put the 16-bit registers on 32-bit
boundaries:

struct lance_registers {
    uint16_t pad1;
    uint16_t rap; /* Register Address Port */
    uint16_t pad2;
    uint16_t rdp; /* Register Data Port */
}; /* (I might have the address and data ports backwards) */

This, however, is *not* how it is done on a conventional 80x86-like
CPU, which *does* have multiple different bus-size-transactions.
Here the compiler should use 16-bit bus accesses for 16-bit integers,
and 8-bit bus accesses for 8-bit integers, and the two "pad"s go
away in the structure.

Moreover, the 80x86 has what are called "read-modify-write" bus
cycles, as did the PDP-11 and VAX. Some PDP-11 Unibus hardware
devices *required* certain operations to use these r/m/w cycles
to obtain predictable results. To get such a bus cycle, an assembler
programmer might use the "bis" or "bic" instructions on the VAX:

bisw2 r1,(r6)

This instruction reads from the (presumably Unibus) location given
by r6, sets the bits given by r1, and writes the result back, all
within a single bus operation using the "r/m/w" cycle. The C programmer
familiar with all this would write the code as:

*r6 |= r1;

and "expect" to get the same bisw2 instruction (provided r6 has
type "volatile unsigned short *" or similar). Writing:

*r6 = *r6 | r1;

would instead produce an assembler sequence like:

movzwl (r6),r0 # or perhaps just movw
bisl2 r1,r0 # in which case this would be a bisw2
movw r0,(r6)

Again, while "volatile" is *necessary* to tell the compiler "please
do not attempt to optimize this", it is not *sufficient* -- the
compiler must actually generate different code for the "|=" operation.
A similar compiler on a load/store architecture *cannot* generate
a single instruction for this, though, because there IS NO SUCH
SINGLE INSTRUCTION (and there are no r/m/w bus cycles).

The answers to j0mbolar's questions, then, are: "access" is really
defined by the hardware, and as C programmers, we have to know not
only what the hardware does, but also whether we can convince our
C compilers to generate the necessary code. When C's types and
operations "map nicely" onto the hardware, we can expect, and should
really demand, that our C compilers do the "obvious thing".

What about the cases where C's types and operations do not fit well
with the hardware-level operations? Consider the V8 SPARC's "ldstub"
(load/store unsigned byte) instruction, or V9's compare and swap;
the 80x86 compare-and-exchange instructions; and the MIPS and PowerPC
style "load linked / store conditional" pairs. The ldstub
instruction is defined as an atomic bus cycle that:

- reads a byte from memory
- stores 0xff into memory

and gives you the original byte in the register. If two devices
or processors attempt this at the "same time", and the byte is
originally not 0xff, one of them will "see" the original byte and
the other will see the 0xff. The compare and swap (aka compare
and exchange) instructions, which are more powerful, take two
registers and a memory location and atomically:

- compare the first register with the memory value
- if they are equal, change the memory value to the second register,
  but if they are not equal, leave the memory value alone
- leave the result of the comparison or the original memory value
  (or both) in one of the registers and/or in some condition codes

The ll/sc sequence, which is perhaps the most powerful of all,
loads a value from memory into a register, and then later stores
a new value (as given by a register) into that memory location but
only if no one else has changed it yet. (This is done through the
cache protocols -- the CPU cache uses MESI or MOESI to cooperate
with other devices, and is alerted if the value gets changed between
the two separate instructions. While CAS can be used to implement
atomic adds and mutexes, LL+SC can be used to implement atomic
queues.)

The closest one can come to writing CAS in C, for instance, is:

    tmp = *mem;
    if (tmp == r1)
        *mem = r2;
    r1 = tmp;

but all this happens in a single bus cycle. There is no C operator
that compresses this down to one operation. The LL/SC sequence
actually takes multiple bus cycles and cannot be expressed at all
in C.

Today, the usual tack for handling the "cannot be written in C at
all" instructions is to use assembly code -- either a C-callable
subroutine, or inline expansion.
--
In-Real-Life: Chris Torek, Wind River Systems
Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
email: forget about it http://web.torek.net/torek/index.html
Reading email is like searching for food in the garbage, thanks to spammers.

Nov 14 '05 #9

Jack Klein

On 13 Aug 2004 03:53:46 GMT, Chris Torek <no****@torek.net> wrote in
comp.lang.c:

In article <ga********************************@4ax.com>
Jack Klein <ja*******@spamcop.net> writes:
Since the semantics of the source code specify that the function
accesses the object exactly once, a conforming implementation may not
access a volatile object more or less than once.

On the other hand, the same C Standard says:

... What constitutes an access to an object that has
volatile-qualified type is implementation-defined.

which leaves the implementor a truck-sized loophole: he can simply
define away all but one of the actual memory references, leaving
only one of them as an "access".

I would encourage people not to buy such an implementation, though. :-)

But surely you know, as you posted later on in this thread, that this
is really only an artifact of a poor choice of wording in the
standard, as has been discussed, without a satisfactory conclusion, in
comp.std.c.

You yourself bring up the issue of a number of relatively small sized
volatile objects sharing a single bus-width address. And there is the
opposite case, for example a 16-bit wide volatile object addressed
over an 8-bit bus. Indeed, a volatile int (minimum 16 bits) will
always require two bus cycles to read on an 8051 family derivative.

But no compiler worth its salt could wiggle out of making multiple
accesses to a free-standing volatile object when the semantics specify a
single access.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html

Nov 14 '05 #10
