parallelism: how can I ensure data is seen by another thread or consumer?

Hello everyone.

Recently I stumbled upon an interesting problem related to thread-parallel
programming in C (and similarly C++). As an example, assume a simple "buffer"
array of size 8, e.g. with static lifetime and external linkage. One thread fills
the buffer structure, and the other evaluates its contents in some way, e.g.

    do {
        synchronize_with_other_thread();
        for (int i=0;i<8;++i) {
            sum += buffer[i];
        }
    } while (sum < THRESHOLD);

From what I understand, the language standard (C99) assumes a single serial
instruction stream. To ensure that each loop iteration retrieves current values from
buffer[], one has to make it volatile, which prevents a lot of useful
optimizations by the compiler. Otherwise, the compiler would be
_allowed_ to keep the whole contents of buffer[] in the register file and
operate on that.

Clearly, various ways exist in practice to avoid "volatile" for buffer[] and
allow the compiler to optimize more aggressively. In particular, one could do
the evaluation (here the sum) in a non-inline function. For data with external
linkage, any library call inside "synchronize_with_other_thread()" will be
sufficient, too, as the compiler cannot be sure that buffer[] is not affected by it. But
both solutions rely on the inabilities of compiler and linker, and would not be
guaranteed to work with an "omniscient" compiler that is allowed to perform
interprocedural optimizations.

Closely related to this, I could not find out what casting volatile pointers to
non-volatile actually means. Applied to the example: is there
really a difference between the version above and

    do {
        synchronize_with_other_thread();
        int* buffer_nonvolatile = (int*) buffer; // buffer being qualified volatile
        for (int i=0;i<8;++i) {
            sum += buffer_nonvolatile[i];
        }
    } while (sum < THRESHOLD);

The problem I see here is that only the contents of buffer[] are volatile, not
its address. *buffer_nonvolatile is for sure invariant for all while-iterations
(if it is static), and the semantics are identical with just not declaring
buffer[] volatile.
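
One detail that bears on this question, offered as a sketch and not as a verdict on the example: C99 6.7.3 makes it undefined behaviour to access an object that was *defined* volatile through a non-volatile lvalue, so the cast does not portably give "temporary non-volatile access". The reverse direction is well defined: define the array without volatile and add the qualifier through a pointer only where a volatile access is wanted. The names sum_plain and sum_volatile are made up for illustration.

```c
#include <stddef.h>

#define BUFFER_SIZE 8

/* Defined WITHOUT volatile, so plain accesses are well defined. */
int buffer[BUFFER_SIZE];

/* Plain access: the compiler may keep values in registers, vectorise, etc. */
int sum_plain(void)
{
    int sum = 0;
    for (size_t i = 0; i < BUFFER_SIZE; ++i)
        sum += buffer[i];
    return sum;
}

/* Volatile access added through a qualified pointer: each element is
   actually read on every call. Adding a qualifier is always permitted. */
int sum_volatile(void)
{
    const volatile int *p = buffer;
    int sum = 0;
    for (size_t i = 0; i < BUFFER_SIZE; ++i)
        sum += p[i];
    return sum;
}
```

Neither variant synchronises anything between threads; volatile only constrains how the compiler performs the accesses.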

I'd really be glad if someone could comment on that, either correcting or
confirming my assumptions.

Best regards,
Markus

Sep 11 '08 #1
> do {
>     synchronize_with_other_thread();
>     int* buffer_nonvolatile = (int*) buffer; // buffer being qualified volatile
>     for (int i=0;i<8;++i) {
>         sum += buffer_nonvolatile[i];
>     }
> } while (sum < THRESHOLD);

To do that you need buffer to be a volatile pointer, not a pointer to a
volatile integer.
This way the compiler will assign buffer's value to buffer_nonvolatile each
time through the loop, thus it *could* change value and a compiler won't
optimise it away, right?

Wrong. A smart compiler will figure out to store the last value of the
non-volatile pointer, and if it's the same value this time, it will use the same
sum values instead of fetching them again. So you really need buffer to be a
volatile pointer to a volatile integer, which means your buffer_nonvolatile
pointer does have to be pointing to volatile data after all... In other
words, your code is wrong, and if it wasn't it would be pointless anyway. You
should just make buffer_nonvolatile volatile. But I guess this depends on
what your sync is doing.

You do know that:
    volatile int *foo;
declares a pointer to a volatile integer, not a volatile pointer.
For a volatile pointer to a normal integer you would write:
    int * volatile foo;
and for a volatile pointer to a volatile integer:
    volatile int * volatile foo;
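
The three forms side by side, as a small sketch; they read right to left, just as with const. The variable names are illustrative only.

```c
static int storage[8];

/* Pointer to volatile int: accesses through *data_is_volatile are volatile
   accesses; the pointer variable itself is an ordinary object. */
volatile int *data_is_volatile = storage;

/* Volatile pointer to plain int: the pointer variable is re-read on every
   use, but accesses to the ints it points to are ordinary accesses. */
int *volatile pointer_is_volatile = storage;

/* Volatile pointer to volatile int: both of the above. */
volatile int *volatile both_volatile = storage;
```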

Sep 11 '08 #2
"MisterE" <mi*****@nigma.net> wrote:

Thanks for your reply, MisterE.

>> do {
>>     synchronize_with_other_thread();
>>     int* buffer_nonvolatile = (int*) buffer; // buffer being qualified volatile
>>     for (int i=0;i<8;++i) {
>>         sum += buffer_nonvolatile[i];
>>     }
>> } while (sum < THRESHOLD);
>
> To do that you need buffer to be a volatile pointer, not a pointer to a
> volatile integer.
> This way the compiler will assign buffer's value to buffer_nonvolatile each
> time through the loop, thus it *could* change value and a compiler won't
> optimise it away, right?
>
> Wrong. A smart compiler will figure out to store the last value of the
> non-volatile pointer, and if it's the same value this time, it will use the same
> sum values instead of fetching them again. So you really need buffer to be a
> volatile pointer to a volatile integer, which means your buffer_nonvolatile
> pointer does have to be pointing to volatile data after all... In other
> words, your code is wrong, and if it wasn't it would be pointless anyway. You
> should just make buffer_nonvolatile volatile. But I guess this depends on
> what your sync is doing.

>> The problem I see here is that only the contents of buffer[] are volatile,
>> not its address.

Sorry if my explanation was not clear enough, but I think you did not
contradict but basically confirmed what I assumed (but was not really sure
about): there is no means of ensuring that current data is read from the buffer
other than declaring and accessing the data as volatile. It is not possible to
temporarily have non-volatile access to otherwise volatile data.

Perhaps I did not make it clear enough that my idea was to get rid of the
volatile-ness in some compute-intensive kernel. Assume, e.g., that buffer[] contains
data for an image, and a complex filter--having enough potential for
optimization by the compiler--should be applied to it. If you write parallel
applications, preventing aggressive compiler optimizations by having everything
volatile is most probably not what you want.

> You do know that:
>     volatile int *foo;
> declares a pointer to a volatile integer, not a volatile pointer.
> For a volatile pointer to a normal integer you would write:
>     int * volatile foo;
> and for a volatile pointer to a volatile integer:
>     volatile int * volatile foo;

I did not really think about it, although I know this distinction in the context of
the "const" qualifier. But thank you for the clarification.

Thanks,
Markus

Sep 11 '08 #4
Markus <ph********@web.de> writes:
<snip>
> Sorry if my explanation was not clear enough, but I think you did
> not contradict, but basically confirm what I assumed (but was not
> really sure about): There is no other means of ensuring current data
> from the buffer is read but declaring and using the data
> volatile. It is not possible to temporarily have non-volatile access
> to otherwise volatile data.
>
> Perhaps I did not make it clear enough that my idea was to get rid
> of the volatile-ness in some compute intensive kernel. Assume, e.g.,
> buffer[] contains data for an image, and a complex filter--having
> enough potential for optimization by the compiler--should be applied
> to it. If you write parallel applications, preventing aggressive
> compiler optimizations by having everything volatile is most
> probably not what you want.

Yes, and it is very unlikely that you have to resort to that sort of
thing. The trouble is that you won't find out here. Standard C has
nothing to say about concurrency and what it has to say about volatile
is not enough for you to know what is and is not safe.

The system you are programming for must provide thread primitives and
its documentation (or a Usenet group about it) is the only place where
you will find out what is and is not guaranteed. There will, most
likely, be some simple synchronisation primitive that will allow the
producer to put a pointer to a new frame into some queue where it can
be consumed by the filter without any interference.

If there is not, then you need to build one. Ask, say, in
comp.programming.threads about building a semaphore from whatever
atomic memory operations your system provides.
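
As one possible shape of such a primitive, here is a minimal sketch that assumes POSIX threads; the thread never names the threading system in use, so that choice is an assumption, and frame_slot, slot_put and slot_take are hypothetical names. The frame data itself is plain, non-volatile memory. Under POSIX rules the mutex calls synchronise memory between threads, so the consumer sees what the producer wrote before handing over the pointer.

```c
#include <pthread.h>
#include <stddef.h>

/* Single-entry queue holding a pointer to the next frame (NULL = empty). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    int            *frame;
} frame_slot;

#define FRAME_SLOT_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, NULL }

/* Producer: wait until the slot is empty, then publish the frame.
   Error checking of the pthread calls is omitted for brevity. */
void slot_put(frame_slot *s, int *frame)
{
    pthread_mutex_lock(&s->lock);
    while (s->frame != NULL)
        pthread_cond_wait(&s->changed, &s->lock);
    s->frame = frame;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
}

/* Consumer: wait until a frame is available, then take ownership of it. */
int *slot_take(frame_slot *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->frame == NULL)
        pthread_cond_wait(&s->changed, &s->lock);
    int *frame = s->frame;
    s->frame = NULL;
    pthread_cond_broadcast(&s->changed);
    pthread_mutex_unlock(&s->lock);
    return frame;
}
```

In a real program slot_put and slot_take would run in different threads, and the producer would not touch a frame again after handing it over until the consumer has returned it.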

--
Ben.
Sep 11 '08 #5
Ben Bacarisse <be********@bsb.me.uk> wrote:
>> Perhaps I did not make it clear enough that my idea was to get rid
>> of the volatile-ness in some compute intensive kernel. Assume, e.g.,
>> buffer[] contains data for an image, and a complex filter--having
>> enough potential for optimization by the compiler--should be applied
>> to it. If you write parallel applications, preventing aggressive
>> compiler optimizations by having everything volatile is most
>> probably not what you want.
>
> Yes, and it is very unlikely that you have to resort to that sort of
> thing. The trouble is that you won't find out here. Standard C has
> nothing to say about concurrency and what it has to say about volatile
> is not enough for you to know what is and is not safe.
>
> The system you are programming for must provide thread primitives and
> its documentation (or a Usenet group about it) is the only place where
> you will find out what is and is not guaranteed. There will, most
> likely, be some simple synchronisation primitive that will allow the
> producer to put a pointer to a new frame into some queue where it can
> be consumed by the filter without any interference.

Even with a real queue (what my example resembled is a single-entry queue),
the underlying problem does not change, at least if you "re-use" your buffer storage
instead of allocating fresh memory every time. My point is that the consumer
thread WILL get references to the SAME memory address sooner or later. If that
address does not point to volatile data, there is no reason for the consumer to assume
that the referenced data has changed.

In practice, all this is usually not a concern, as the compiler will not create
code that checks for and exploits this (although it would be allowed to). But from a
theoretical point of view, the buffer storage needs to be qualified volatile,
IMHO.

> If there is not, then you need to build one. Ask, say, in
> comp.programming.threads about building a semaphore from whatever
> atomic memory operations your system provides.

Sorry, my question was actually not about threading and synchronization (although
volatile variables *are* actually sufficient for the synchronization necessary
in my little example). By the way, I thought comp.lang.c was a better place
for my problem than comp.programming.threads, because my problem is not related
to practical aspects of thread programming, and is also encountered in
non-threaded code.

Perhaps I can bring all my quite long emails down to the following question:

According to the language standard (not from a practical view that exploits the
weaknesses of compilation): Is it necessary or not to have each and every object
that is changed by one and read by another thread after proper synchronization
to be qualified volatile if one wants to ensure the second one also gets the new contents?

Sep 11 '08 #6
Markus <ph********@web.de> writes:
<snip>
> Perhaps I can bring all my quite long emails down to the following question:
>
> According to the language standard (not from a practical view that
> exploits the weaknesses of compilation): Is it necessary or not to
> have each and every object that is changed by one and read by
> another thread after proper synchronization to be qualified volatile
> if one wants to ensure the second one also gets the new contents?

We may be talking past each other. The C standard says nothing about
concurrency and very little about volatile. In practice the two
concepts are separate and rarely interact: volatile is not enough to
implement even the simplest concurrency control[1] and it is rarely
required by C extensions that provide concurrency.

Whatever extension you are using that provides the concurrency must do
so with some basic set of primitives. These are what you need to
use. Qualifying shared arrays as volatile is unlikely to be
required. I can't be more specific because I don't know what you are
using, so I can only make general remarks.

Finally, do consider posting in comp.programming.threads. There are
helpful people there and some real experts about everything from
concurrent systems design to the lowest-level memory barrier issues.
For one thing, nothing else can be said from the point of view of
standard C (the topic here).

[1] It *may* be enough, but that would be an accident of one
particular compiler/target machine combination. Standard C guarantees
that accesses won't be optimised away, but nothing more. If standard
C ever embraces concurrency it will have to provide some sort of
guarantees about the memory model but I'd bet the house that it won't
do that via tightening the meaning of volatile -- it will most likely
borrow the work done in the C++ committee.

--
Ben.
Sep 11 '08 #7
After some discussion on comp.programming.threads, I finally found the reason for
my confusion: Pthreads is not just "some" library that merely provides
unified access to system calls and platform-specific assembly.

This is also what Ben Bacarisse <be********@bsb.me.uk> indicated:
> [...] Whatever extension you are using that provides the concurrency must do
> so with some basic set of primitives. These are what you need to use. [...]

The synchronization primitives, such as mutexes, not only provide temporal
synchronization but also "inform" the compiler that data might have been
changed. C does not provide such a concept itself, so the compiler and the pthread
library must agree on how to give that hint.

Locking a mutex is therefore more than a usual function call; it has additional
semantics that a "normal" C function could not provide. This was the point I just
didn't know, and it took me so long to understand.
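
Applied to the buffer example from the first post, this could look like the following sketch. It assumes POSIX threads; buffer_fill and buffer_sum are names invented for illustration. Note that buffer is an ordinary, non-volatile array: the lock and unlock calls are what guarantee that the reader sees the writer's data.

```c
#include <pthread.h>

#define BUFFER_SIZE 8

static int buffer[BUFFER_SIZE];   /* plain ints, not volatile */
static pthread_mutex_t buffer_lock = PTHREAD_MUTEX_INITIALIZER;

/* Writer side: update the shared buffer while holding the lock. */
void buffer_fill(const int *src)
{
    pthread_mutex_lock(&buffer_lock);
    for (int i = 0; i < BUFFER_SIZE; ++i)
        buffer[i] = src[i];
    pthread_mutex_unlock(&buffer_lock);
}

/* Reader side: inside the locked region the compiler is free to
   optimise the loop; it just may not carry buffer values across
   the lock and unlock calls. */
int buffer_sum(void)
{
    int sum = 0;
    pthread_mutex_lock(&buffer_lock);
    for (int i = 0; i < BUFFER_SIZE; ++i)
        sum += buffer[i];
    pthread_mutex_unlock(&buffer_lock);
    return sum;
}
```

One thread would call buffer_fill and another buffer_sum; error checking of the pthread calls is left out to keep the sketch short.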
Best regards and thanks for your help,
Markus

Sep 13 '08 #8
Ben Bacarisse <be********@bsb.me.uk> wrote:
>
> The C standard says nothing about concurrency

"Nothing" is a bit too strong -- the C Standard does say *something*
about concurrency in the guise of signal handlers, but not very much.

> If standard C ever embraces concurrency it will have to provide some
> sort of guarantees about the memory model but I'd bet the house that it
> won't do that via tightening the meaning of volatile -- it will most
> likely borrow the work done in the C++ committee.

Thread support is a hot topic for C1X, so it's likely that the C
Standard *will* embrace concurrency in the not too distant future. And
yes, we are borrowing heavily from the work being done in C++.
--
Larry Jones

My upbringing is filled with inconsistent messages. -- Calvin
Sep 22 '08 #9
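
C1X was later published as C11, which added a memory model together with <stdatomic.h> and <threads.h>. As a hedged sketch of what that makes expressible in standard C (none of this was available to the posters in 2008), the single-entry buffer can be handed over with a release/acquire flag while the data stays plain, non-volatile memory. The names try_publish and try_consume are invented here, and the sketch assumes exactly one producer thread and one consumer thread.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define BUFFER_SIZE 8

static int buffer[BUFFER_SIZE];   /* plain data: not volatile, not atomic */
static atomic_bool ready;         /* zero-initialised: false = buffer is free */

/* Producer: fill the buffer if it is free, then publish it. */
bool try_publish(const int *src)
{
    if (atomic_load_explicit(&ready, memory_order_acquire))
        return false;             /* consumer has not finished yet */
    for (int i = 0; i < BUFFER_SIZE; ++i)
        buffer[i] = src[i];
    /* Release: the writes above are visible to whoever acquires `ready`. */
    atomic_store_explicit(&ready, true, memory_order_release);
    return true;
}

/* Consumer: if a buffer has been published, sum it and hand it back. */
bool try_consume(int *sum_out)
{
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;             /* nothing published yet */
    int sum = 0;
    for (int i = 0; i < BUFFER_SIZE; ++i)
        sum += buffer[i];
    *sum_out = sum;
    /* Release: the reads above complete before the producer may refill. */
    atomic_store_explicit(&ready, false, memory_order_release);
    return true;
}
```

A caller that gets false back simply tries again later; a production version would block or back off instead of spinning.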
