Re: C++0x: release sequence

On Jun 16, 3:09 pm, Anthony Williams <anthony....@gmail.com> wrote:
Note that the
use of fences in the C++0x WP has changed this week from
object-specific fences to global fences. See Peter Dimov's paper
N2633:http://www.open-std.org/JTC1/SC22/WG...008/n2633.html

Yes, I've already read this. It's just GREAT! It's far more useful and
intuitive.
And it contains a clear and simple binding to the memory model, i.e. the
relations between acquire/release fences, and between acquire/release
fences and acquire/release operations.

Is it already generally approved by the memory model working group?
For atomic_fence() I'm not worried :) But what about compiler_fence()?

Btw, I see some problems in Peter Dimov's proposal.
First, it's possible to write:

x.store(1, std::memory_order_relaxed);
std::atomic_compiler_fence(std::memory_order_release);
y.store(1, std::memory_order_relaxed);

But it's not possible to write:

x.store(1, std::memory_order_relaxed);
y.store(1, std::memory_order_relaxed_but_compiler_order_release);
// or just y.store(1, std::compiler_order_release);

I.e. it's not possible to use compiler ordering when using acquire/
release operations. It's a bit inconsistent, especially taking into
account that acquire/release operations are primary and standalone
bidirectional fences are supplementary.

Second, a more important point. It's possible to write:

//thread 1:
data = 1;
std::atomic_memory_fence(std::memory_order_release);
x.store(1, std::memory_order_relaxed);

//thread 2:
if (x.load(std::memory_order_acquire))
assert(1 == data);

But it's not possible to write:

//thread 1:
data = 1;
z.store(1, std::memory_order_release);
x.store(1, std::memory_order_relaxed);

//thread 2:
if (x.load(std::memory_order_acquire))
assert(1 == data);

From the point of view of Peter Dimov's proposal, this code contains a
race on 'data'.

I think there must be the following statements:

- a release operation *is a* release fence
- an acquire operation *is an* acquire fence

So this:
z.store(1, std::memory_order_release);
basically transforms to:
std::atomic_memory_fence(std::memory_order_release);
z.store(1, std::memory_order_release);

Then the second example will be legal. What do you think?

Dmitriy V'jukov
Jun 27 '08 #1
"Dmitriy V'jukov" <dv*****@gmail.comwrites:
On Jun 16, 3:09 pm, Anthony Williams <anthony....@gmail.comwrote:
>Note that the
use of fences in the C++0x WP has changed this week from
object-specific fences to global fences. See Peter Dimov's paper
N2633:http://www.open-std.org/JTC1/SC22/WG...008/n2633.html


Yes, I've already read this. It's just GREAT! It's far more useful and
intuitive.
And it contains a clear and simple binding to the memory model, i.e. the
relations between acquire/release fences, and between acquire/release
fences and acquire/release operations.

Is it already generally approved by the memory model working group?
For atomic_fence() I'm not worried :) But what about compiler_fence()?
Yes. It's been approved to be applied to the WP with minor renamings
(atomic_memory_fence -> atomic_thread_fence, atomic_compiler_fence ->
atomic_signal_fence).
Btw, I see some problems in Peter Dimov's proposal.
First, it's possible to write:

x.store(1, std::memory_order_relaxed);
std::atomic_compiler_fence(std::memory_order_release);
y.store(1, std::memory_order_relaxed);

But it's not possible to write:

x.store(1, std::memory_order_relaxed);
y.store(1, std::memory_order_relaxed_but_compiler_order_release);
// or just y.store(1, std::compiler_order_release);

I.e. it's not possible to use compiler ordering when using acquire/
release operations. It's a bit inconsistent, especially taking into
account that acquire/release operations are primary and standalone
bidirectional fences are supplementary.
You're right that you can't do this. I don't think it's a problem as
compiler orderings are not really the same as the inter-thread
orderings.
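To illustrate the difference, here is a rough sketch of the sort of case a
compiler-only fence is meant for (using the renamed atomic_signal_fence;
pairing it with a signal handler is just an illustrative assumption, not
something taken from N2633):

#include <atomic>
#include <csignal>

int payload;                 // plain data, shared only with a handler on this thread
std::atomic<int> ready(0);

extern "C" void handler(int)
{
    if (ready.load(std::memory_order_relaxed)) {
        std::atomic_signal_fence(std::memory_order_acquire); // compiler-only ordering
        int v = payload;     // sees 1: same thread, so no CPU reordering is involved
        (void)v;
    }
}

int main()
{
    std::signal(SIGINT, handler);
    payload = 1;
    std::atomic_signal_fence(std::memory_order_release);     // no barrier instruction needed
    ready.store(1, std::memory_order_relaxed);
}

The fence only constrains the compiler, which is exactly the guarantee you
cannot request through the ordering argument of an individual store or load.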
Second, a more important point. It's possible to write:

//thread 1:
data = 1;
std::atomic_memory_fence(std::memory_order_release);
x.store(1, std::memory_order_relaxed);

//thread 2:
if (x.load(std::memory_order_acquire))
assert(1 == data);

But it's not possible to write:

//thread 1:
data = 1;
z.store(1, std::memory_order_release);
x.store(1, std::memory_order_relaxed);

//thread 2:
if (x.load(std::memory_order_acquire))
assert(1 == data);

From the point of view of Peter Dimov's proposal, this code contains a
race on 'data'.
Yes. Fences are global, whereas ordering on individual objects is
specific. The fence version is equivalent to:

// thread 1
data = 1;
x.store(1, std::memory_order_release);
I think there must be the following statements:

- a release operation *is a* release fence
- an acquire operation *is an* acquire fence

So this:
z.store(1, std::memory_order_release);
basically transforms to:
std::atomic_memory_fence(std::memory_order_release);
z.store(1, std::memory_order_release);

Then the second example will be legal. What do you think?
I think that compromises the model, because it makes release
operations contagious. The fence transformation is precisely the
reverse of this, which I think is correct.
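To spell out the direction that does hold, here's a sketch using the renamed
atomic_thread_fence (the function names are just placeholders matching the
example above):

#include <atomic>

std::atomic<int> x(0);
int data = 0;

void thread1_fence_form()
{
    data = 1;
    std::atomic_thread_fence(std::memory_order_release);
    x.store(1, std::memory_order_relaxed);
}

void thread1_operation_form()
{
    data = 1;
    x.store(1, std::memory_order_release);
}

// For a reader that does
//     if (x.load(std::memory_order_acquire)) assert(data == 1);
// both forms give the same guarantee. In general the stand-alone fence is
// the stronger of the two, since it also covers relaxed stores to other
// atomics that follow it. The reverse rewrite -- a release store into a
// stand-alone fence plus a store -- is the contagious one, and the model
// does not imply it.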

Anthony
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL
Jun 27 '08 #2
On Jun 16, 11:47 pm, Anthony Williams <anthony....@gmail.com> wrote:
Yes, I've already read this. It's just GREAT! It's far more useful and
intuitive.
And it contains a clear and simple binding to the memory model, i.e. the
relations between acquire/release fences, and between acquire/release
fences and acquire/release operations.
Is it already generally approved by the memory model working group?
For atomic_fence() I'm not worried :) But what about compiler_fence()?

Yes. It's been approved to be applied to the WP with minor renamings
(atomic_memory_fence -> atomic_thread_fence, atomic_compiler_fence ->
atomic_signal_fence).
COOL!

Looking forward to the next draft. Btw, what about dependent memory
ordering (memory_order_consume)? Is it going to be accepted?

Btw, I see some problems in Peter Dimov's proposal.
First, it's possible to write:
x.store(1, std::memory_order_relaxed);
std::atomic_compiler_fence(std::memory_order_release);
y.store(1, std::memory_order_relaxed);
But it's not possible to write:
x.store(1, std::memory_order_relaxed);
y.store(1, std::memory_order_relaxed_but_compiler_order_release);
// or just y.store(1, std::compiler_order_release);
I.e. it's not possible to use compiler ordering when using acquire/
release operations. It's a bit inconsistent, especially taking into
account that acquire/release operations are primary and standalone
bidirectional fences are supplementary.

You're right that you can't do this. I don't think it's a problem as
compiler orderings are not really the same as the inter-thread
orderings.
Yes, but why can I do both inter-thread ordering and compiler ordering
with stand-alone fences, but only inter-thread ordering with operations?
Why are stand-alone fences more 'powerful'?

Second, a more important point. It's possible to write:
//thread 1:
data = 1;
std::atomic_memory_fence(std::memory_order_release);
x.store(1, std::memory_order_relaxed);
//thread 2:
if (x.load(std::memory_order_acquire))
assert(1 == data);
But it's not possible to write:
//thread 1:
data = 1;
z.store(1, std::memory_order_release);
x.store(1, std::memory_order_relaxed);
//thread 2:
if (x.load(std::memory_order_acquire))
assert(1 == data);
From the point of view of Peter Dimov's proposal, this code contains a
race on 'data'.

Yes. Fences are global, whereas ordering on individual objects is
specific.
Hmmm... need to think some more on this...

The fence version is equivalent to:

// thread 1
data = 1;
x.store(1, std::memory_order_release);
I think there must be the following statements:
- a release operation *is a* release fence
- an acquire operation *is an* acquire fence
So this:
z.store(1, std::memory_order_release);
basically transforms to:
std::atomic_memory_fence(std::memory_order_release);
z.store(1, std::memory_order_release);
Then the second example will be legal. What do you think?

I think that compromises the model, because it makes release
operations contagious...
... and this will interfere with efficient implementation on some
hardware, right? Or are there some 'logical' reasons for this (why you
don't want to make release operations contagious)?

Dmitriy V'jukov
Jun 27 '08 #3
"Dmitriy V'jukov" <dv*****@gmail.comwrites:
On Jun 16, 11:47 pm, Anthony Williams <anthony....@gmail.comwrote:
Yes, I've already read this. It's just GREAT! It's far more useful and
intuitive.
And it contains a clear and simple binding to the memory model, i.e. the
relations between acquire/release fences, and between acquire/release
fences and acquire/release operations.
Is it already generally approved by the memory model working group?
For atomic_fence() I'm not worried :) But what about compiler_fence()?

Yes. It's been approved to be applied to the WP with minor renamings
(atomic_memory_fence -> atomic_thread_fence, atomic_compiler_fence ->
atomic_signal_fence).

COOL!

Looking forward to the next draft. Btw, what about dependent memory
ordering (memory_order_consume)? Is it going to be accepted?
Yes. That's been voted in too.
Btw, I see some problems in Peter Dimov's proposal.
First, it's possible to write:
x.store(1, std::memory_order_relaxed);
std::atomic_compiler_fence(std::memory_order_release);
y.store(1, std::memory_order_relaxed);
But it's not possible to write:
x.store(1, std::memory_order_relaxed);
y.store(1, std::memory_order_relaxed_but_compiler_order_release);
// or just y.store(1, std::compiler_order_release);
I.e. it's not possible to use compiler ordering when using acquire/
release operations. It's a bit inconsistent, especially taking into
account that acquire/release operations are primary and standalone
bidirectional fences are supplementary.

You're right that you can't do this. I don't think it's a problem as
compiler orderings are not really the same as the inter-thread
orderings.

Yes, but why can I do both inter-thread ordering and compiler ordering
with stand-alone fences, but only inter-thread ordering with operations?
Why are stand-alone fences more 'powerful'?
Stand-alone fences affect all data touched by the executing thread, so
they are inherently more 'powerful'.
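For example (a sketch with the renamed atomic_thread_fence; the two-flag
arrangement is just something made up to illustrate the point, not from the
paper):

#include <atomic>
#include <cassert>

std::atomic<int> x(0), y(0);
int a = 0, b = 0;

void thread1()
{
    a = 1;
    b = 2;
    std::atomic_thread_fence(std::memory_order_release); // covers both a and b
    x.store(1, std::memory_order_relaxed);
    y.store(1, std::memory_order_relaxed);
}

void thread2()
{
    // Synchronizing through either flag is enough, because the fence is not
    // tied to any particular object.
    if (x.load(std::memory_order_acquire) || y.load(std::memory_order_acquire))
        assert(a == 1 && b == 2);
}

With x.store(1, memory_order_release) in place of the fence, only a reader
that synchronizes through x gets any ordering; seeing y == 1 on its own
would prove nothing about a and b.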
>The fence version is equivalent to:

// thread 1
data = 1;
x.store(1, std::memory_order_release);
I think there must be the following statements:
- a release operation *is a* release fence
- an acquire operation *is an* acquire fence
So this:
z.store(1, std::memory_order_release);
basically transforms to:
std::atomic_memory_fence(std::memory_order_release);
z.store(1, std::memory_order_release);
Then the second example will be legal. What do you think?

I think that compromises the model, because it makes release
operations contagious...

... and this will interfere with efficient implementation on some
hardware, right? Or are there some 'logical' reasons for this (why you
don't want to make release operations contagious)?
It affects where you put the memory barrier instruction. The whole
point of relaxed operations is that they don't have memory barriers,
but if you make the release contagious the compiler might have to add
extra memory barriers in some cases. N2633 shows how you can
accidentally end up needing full barriers all over the place.
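As a rough sketch of where the cost would show up (the remarks about a
weakly-ordered target are a generic assumption, not a claim about any
particular CPU):

#include <atomic>

std::atomic<int> x(0), z(0);
int data = 0;

void thread1()
{
    data = 1;
    z.store(1, std::memory_order_release);   // any barrier cost is paid here, tied to z
    x.store(1, std::memory_order_relaxed);   // stays a plain store

    // If the release store also had to act as a stand-alone release fence,
    // the store to x could no longer be allowed to move ahead of it, and on
    // a weakly-ordered target the compiler might need an extra or stronger
    // barrier between the two stores.
}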

Anthony
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL
Jun 27 '08 #4
On 17 Jun, 00:27, Anthony Williams <anthony....@gmail.com> wrote:
Looking forward to the next draft. Btw, what about dependent memory
ordering (memory_order_consume)? Is it going to be accepted?

Yes. That's been voted in too.
Oooo, that's bad news. I've only just started understanding the current
"1.10", and they change it almost completely! :)

The latest proposal about dependent ordering is:
http://open-std.org/jtc1/sc22/wg21/d...008/n2556.html
Right?

And what about the syntax with double square brackets,
[[carries_dependency]]? It's quite an unusual syntax addition for C/C++...

Btw, I see some problems in Peter Dimov's proposal.
First, it's possible to write:
x.store(1, std::memory_order_relaxed);
std::atomic_compiler_fence(std::memory_order_release);
y.store(1, std::memory_order_relaxed);
But it's not possible to write:
x.store(1, std::memory_order_relaxed);
y.store(1, std::memory_order_relaxed_but_compiler_order_release);
// or just y.store(1, std::compiler_order_release);
I.e. it's not possible to use compiler ordering when using acquire/
release operations. It's a bit inconsistent, especially taking into
account that acquire/release operations are primary and standalone
bidirectional fences are supplementary.
You're right that you can't do this. I don't think it's a problem as
compiler orderings are not really the same as the inter-thread
orderings.
Yes, but why can I do both inter-thread ordering and compiler ordering
with stand-alone fences, but only inter-thread ordering with operations?
Why are stand-alone fences more 'powerful'?

Stand-alone fences affect all data touched by the executing thread, so
they are inherently more 'powerful'.
I'm starting to understand. Initially I was thinking that these are just
two forms of saying the same thing (a stand-alone fence and an
acquire/release operation). It turns out not to be true. Ok.

Dmitriy V'jukov
Jun 27 '08 #5
"Dmitriy V'jukov" <dv*****@gmail.comwrites:
On 17 июн, 00:27, Anthony Williams <anthony....@gmail.comwrote:
Looking forward to the next draft. Btw, what about dependent memory
ordering (memory_order_consume)? Is it going to be accepted?

Yes. That's been voted in too.

Oooo, that's bad news. I've only just started understanding the current
"1.10", and they change it almost completely! :)
It's all additions, so it's not too bad. The key thing is that the
paper adds memory_order_consume and dependency ordering, which
provides an additional mechanism for introducing a happens-before
relationship between threads.
The latest proposal about dependent ordering is:
http://open-std.org/jtc1/sc22/wg21/d...008/n2556.html
Right?
That's the latest pre-meeting paper. The latest (which is what was
voted on) is N2664 which is currently only available on the committee
site. It should be in the post-meeting mailing.
And what about the syntax with double square brackets,
[[carries_dependency]]? It's quite an unusual syntax addition for C/C++...
That's the new attribute syntax. This part of the proposal has not
been included for now.
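For reference, the placement being proposed looked roughly like this (a
sketch only; 'node', 'find_node' and 'consume_node' are made-up names):

struct node { int payload; };

[[carries_dependency]] node* find_node(int key);    // dependency carried out via the return value
void consume_node(node* p [[carries_dependency]]);  // dependency carried in via the parameter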

Anthony
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL
Jun 27 '08 #6
On Jun 17, 10:48 am, Anthony Williams <anthony....@gmail.com> wrote:
"Dmitriy V'jukov" <dvyu...@gmail.com> writes:
On 17 Jun, 00:27, Anthony Williams <anthony....@gmail.com> wrote:
Looking forward to next draft. Btw, what about dependent memory
ordering (memory_order_consume)? Is it going to be accepted?
Yes. That's been voted in too.
Oooo, that's bad news. I've only just started understanding the current
"1.10", and they change it almost completely! :)

It's all additions, so it's not too bad. The key thing is that the
paper adds memory_order_consume and dependency ordering, which
provides an additional mechanism for introducing a happens-before
relationship between threads.
The latest proposal about dependent ordering is:
http://open-std.org/jtc1/sc22/wg21/d...008/n2556.html
Right?

That's the latest pre-meeting paper. The latest (which is what was
voted on) is N2664 which is currently only available on the committee
site. It should be in the post-meeting mailing.

I hope that in N2664 the 'happens before' definition is changed, because
right now I can't understand it.
For example, in the following code:

int data;
std::atomic<int> x;

thread 1:
data = 1;
x.store(1, std::memory_order_release); (A)

thread 2:
if (x.load(std::memory_order_consume)) (B)
assert(1 == data); (C)

A is dependency-ordered before B, and B is sequenced before C.
So according to the definition of 'happens before' in N2556, A
happens-before C.
According to my understanding, this is simply wrong. There is no data
dependency between B and C, so A must not happen-before C. (There is a
control dependency, but currently C++0x doesn't respect control
dependencies.)

------------------------------------
Another point:

An evaluation A carries a dependency to an evaluation B if
* the value of A is used as an operand of B, and:
o B is not an invocation of any specialization of
std::kill_dependency, and
o A is not the left operand to the comma (',') operator,

I think that here it must say 'built-in comma operator'. Consider the
following example:

struct X
{
    int data;
};

void operator , (int y, X& x)
{
    x.data = y;
}

std::atomic<int> a;

int main()
{
    int y = a.load(std::memory_order_consume);
    X x;
    y, x;           // here 'carries a dependency' is broken, because 'y' is the
                    // left operand of a comma operator
    int z = x.data; // but I think that 'z' still must be in the 'dependency
                    // tree' rooted at 'y'
}
Where am I wrong this time? :)
Dmitriy V'jukov
Jun 27 '08 #7
"Dmitriy V'jukov" <dv*****@gmail.comwrites:
On Jun 17, 10:48 am, Anthony Williams <anthony....@gmail.comwrote:
>"Dmitriy V'jukov" <dvyu...@gmail.comwrites:
On 17 июн, 00:27, Anthony Williams <anthony....@gmail.comwrote:
Looking forward to the next draft. Btw, what about dependent memory
ordering (memory_order_consume)? Is it going to be accepted?
>Yes. That's been voted in too.
Oooo, that's bad news. I've only just started understanding the current
"1.10", and they change it almost completely! :)

It's all additions, so it's not too bad. The key thing is that the
paper adds memory_order_consume and dependency ordering, which
provides an additional mechanism for introducing a happens-before
relationship between threads.
The latest proposal about dependent ordering is:
http://open-std.org/jtc1/sc22/wg21/d...008/n2556.html
Right?

That's the latest pre-meeting paper. The latest (which is what was
voted on) is N2664 which is currently only available on the committee
site. It should be in the post-meeting mailing.


I hope that in N2664 the 'happens before' definition is changed, because
right now I can't understand it.
N2664 is almost the same as N2556.
For example, in the following code:

int data;
std::atomic<int> x;

thread 1:
data = 1;
x.store(1, std::memory_order_release); (A)

thread 2:
if (x.load(std::memory_order_consume)) (B)
assert(1 == data); (C)

A is dependency-ordered before B, and B is sequenced before C.
Yes.
So according to the definition of 'happens before' in N2556, A
happens-before C.
No. happens-before is no longer transitive if one of the legs is a
dependency ordering.

N2664 says:

"An evaluation A inter-thread happens before an evaluation B if,

* A synchronizes with B, or
* A is dependency-ordered before B, or
* for some evaluation X,
o A synchronizes with X and X is sequenced before B, or
o A is sequenced before X and X inter-thread happens before B, or
o A inter-thread happens before X and X inter-thread happens before B."

"An evaluation A happens before an evaluation B if:

* A is sequenced before B, or
* A inter-thread happens before B."

A is dependency-ordered before B, so A inter-thread happens-before B,
and A happens-before B.

However, A neither synchronizes with B nor with C, and is not sequenced
before either, so the only ways A could inter-thread-happen-before C are if
B inter-thread-happens-before C, or if B carries a dependency to C (which
would extend A's dependency ordering to C). Since C is not atomic, B cannot
synchronize with C or be dependency-ordered before C; and since the value
read by B is not used as an operand of C, B does not carry a dependency to
C. Thus A does not inter-thread-happen-before C, and A does not
happen-before C.
According to my understanding, this is simply wrong. There is no data
dependency between B and C, so A must not happen-before C. (There is a
control dependency, but currently C++0x doesn't respect control
dependencies.)
You're right in your analysis, but N2664 agrees with you.
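For contrast, here is the kind of example dependency ordering is aimed at,
where the dependency is carried and the assert does hold (a sketch adapted
from your example, not taken from N2664):

#include <atomic>
#include <cassert>

struct node { int data; };
std::atomic<node*> p(nullptr);

void thread1()
{
    node* n = new node;
    n->data = 1;
    p.store(n, std::memory_order_release);            // (A)
}

void thread2()
{
    if (node* q = p.load(std::memory_order_consume))  // (B): the value of q carries a dependency
        assert(q->data == 1);                         // (C): uses q, so A happens-before C
}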
------------------------------------
Another point:

An evaluation A carries a dependency to an evaluation B if
* the value of A is used as an operand of B, and:
o B is not an invocation of any specialization of
std::kill_dependency, and
o A is not the left operand to the comma (',') operator,

I think that here it must say 'built-in comma operator'. Consider the
following example:
Yes. That's fixed in N2664:

"An evaluation A carries a dependency to an evaluation B if

* the value of A is used as an operand of B, unless:
o B is an invocation of any specialization of std::kill_dependency (29.1), or
o A is the left operand of a built-in logical AND ('&&', see 5.14) or logical OR ('||', see 5.15) operator, or
o A is the left operand of a conditional ('?:') operator (5.16), or
o A is the left operand of the built-in comma (',') operator (5.18);
or
* A writes a scalar object or bit-field M, B reads the value written by A from M, and A is sequenced before B, or
* for some evaluation X, A carries a dependency to X, and X carries a dependency to B."
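As an illustration of the first bullet and of kill_dependency (a sketch;
'item' and 'head' are just placeholder names):

#include <atomic>

struct item { int value; };
std::atomic<item*> head(nullptr);

void reader()
{
    if (item* p = head.load(std::memory_order_consume)) {
        int a = p->value;                        // p carries a dependency into this read
        int b = std::kill_dependency(p)->value;  // the dependency chain is explicitly cut here
        (void)a; (void)b;
    }
}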

Anthony
--
Anthony Williams | Just Software Solutions Ltd
Custom Software Development | http://www.justsoftwaresolutions.co.uk
Registered in England, Company Number 5478976.
Registered Office: 15 Carrallack Mews, St Just, Cornwall, TR19 7UL
Jun 27 '08 #8
