473,401 Members | 2,068 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,401 software developers and data experts.

Semantics of STL containers (std::map) in a multithreaded scenario


I understand the C++ standard does not talk about threading. My
question here is directed more towards what happens when a STL
container is used in a certain way. I'd appreciate any thoughts. I
re-iterate I don't want to probe into what C++ standard says when I
trample some data from multiple threads, I simply want to know if I
have understood this right.

I have a "std::map<somedatatype, someotherdatatype> myMap", where stuff
gets inserted regularly. In a multi-threaded environment, there are
other parts of code that simultaneously read from this map.

I would like to know if I have understood this right. Consider a
scenario like this:

// thread 1 in one part of the application

busy_inserting_stuff_into_myMap

// while thread 2 in another part of the application

1. gets a reference to the myMap
2. starts iterating by performing an ordinary loop:

std::map<>::iterator itrbegin = myMap.begin();
std::map<>::iterator itrend = myMap.end();
std::map<>::iterator itr;
for (itr = itrbegin; itr != itr.end(); ++itr)
{
}

Since thread 1 is regularly inserting stuff, is there a chance thread 2
finds all its iterators invalidated while its in the process of
looping? In other words what is the effect of inserting into a map
from one thread while simultaneously looping through the same map in
another thread? I don't care if thread 2 misses some info during the
looping (since the map gets updated all the time...)

thanks!

Jun 13 '06 #1
20 5812

**sigh**

I said:
I
re-iterate I don't want to probe into what C++ standard says when I
trample some data from multiple threads, I simply want to know if I
have understood this right.
and ended up asking:
In other words what is the effect of inserting into a map
from one thread while simultaneously looping through the same map in
another thread? I don't care if thread 2 misses some info during the
looping (since the map gets updated all the time...)


Apologies. If someone still wants to clarify this, I'd appreciate it a
lot. thanks!

Jun 13 '06 #2
Dilip schrieb:
Since thread 1 is regularly inserting stuff, is there a chance thread 2
finds all its iterators invalidated while its in the process of
looping? In other words what is the effect of inserting into a map
from one thread while simultaneously looping through the same map in
another thread? I don't care if thread 2 misses some info during the
looping (since the map gets updated all the time...)


<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.

You could even run into problems when writing to one single integer.
Thread A writes to variable "size", and Thread B starts reading it just
when Thread A wrote the first byte of "size" and left the other bytes
unchanged. So Thread B gets a byte-wise mixture of the old and the new
values of "size".

Use some kind of mutex when you access variables or objects from more
than one thread.
</OT>

Thomas
Jun 13 '06 #3

Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


Understood. Sadly, I am in an unenviable position of retrofitting
multithreading into a heavily C++/STL dependant application and I
frequently keep running into corner cases like this :-(

Jun 13 '06 #4
Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?

Jun 13 '06 #5
Dilip schrieb:
Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?


In a single thread? Yes.
Jun 13 '06 #6
On Tue, 13 Jun 2006 15:43:00 -0700, Dilip wrote:
Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


Understood. Sadly, I am in an unenviable position of retrofitting
multithreading into a heavily C++/STL dependant application and I
frequently keep running into corner cases like this :-(


You need to consult with the documentation for the platform(s) that your
STL is implemented on and see what threading guarantees it provides.
Jun 13 '06 #7
On Tue, 13 Jun 2006 15:51:07 -0700, Dilip wrote:
Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?


No. You have no control as to what's happening behind the scenes. The
only promise you have is that after the insert/erase is completed, no
other iterators into that map have been invalidated by that operation.
(Note I said _other_ iterators. In the case of erase, the iterator you're
working on becomes invalid....)
Jun 13 '06 #8
Dilip wrote:
Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?


No, because your inserter thread could be involved in a race with your
reader thread.

Inserter thread:
Insert-Start -------------------------------- Insert Finished

-------------------- IncrementIterator ----------------------
Reader Thread

See what happens? Not good.

Jun 13 '06 #9
In article <11********************@p79g2000cwp.googlegroups.c om>,
"Dilip" <rd*****@lycos.com> wrote:
**sigh**

I said:
I
re-iterate I don't want to probe into what C++ standard says when I
trample some data from multiple threads, I simply want to know if I
have understood this right.


and ended up asking:
In other words what is the effect of inserting into a map
from one thread while simultaneously looping through the same map in
another thread? I don't care if thread 2 misses some info during the
looping (since the map gets updated all the time...)


Apologies. If someone still wants to clarify this, I'd appreciate it a
lot. thanks!


I haven't prototyped this scenario, but I have experience implementing
std::map. I strongly suspect that this scenario could very rarely
introduce cases where the increment of the read thread gets redirected
into oblivion (say while the insert is rotating a sub-tree). Note that
"very rarely" is a worst case scenario. It means your testing probably
won't expose the bug. I wish I could give you a firmer answer than
that, but quite frankly that's a research project (which I lack
resources for). Retrofitting multithreading is not a pretty place to be.

-Howard
Jun 13 '06 #10
Dilip wrote:
I understand the C++ standard does not talk about threading. My
question here is directed more towards what happens when a STL
container is used in a certain way. I'd appreciate any thoughts. I
re-iterate I don't want to probe into what C++ standard says when I
trample some data from multiple threads, I simply want to know if I
have understood this right.

I have a "std::map<somedatatype, someotherdatatype> myMap", where stuff
gets inserted regularly. In a multi-threaded environment, there are
other parts of code that simultaneously read from this map.

I would like to know if I have understood this right. Consider a
scenario like this: .... Since thread 1 is regularly inserting stuff, is there a chance thread 2
finds all its iterators invalidated while its in the process of
looping? In other words what is the effect of inserting into a map
from one thread while simultaneously looping through the same map in
another thread? I don't care if thread 2 misses some info during the
looping (since the map gets updated all the time...)


It's undefined what may happen if you do that. What actually will
happend depends on the implementation. Probably something not nice.

You might want to take a look at the synchronization wrappers for
Java collection classes to see how they do it. Basically they
make all the methods synchronized except the iterator which requires
you to explicity get the lock before doing the interation or risk
getting a ConcurrentModificationException.

If you did your own interator implementation you could possibly
acquire the lock when you get the iterator and release it when
the iterator is d'tored, making more RAII like.

You can go lock-free but not yet. We're not there yet. :)
--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
Jun 14 '06 #11
Howard Hinnant wrote:
I strongly suspect that this scenario could very rarely
introduce cases where the increment of the read thread gets redirected
into oblivion (say while the insert is rotating a sub-tree).


Howard

Bingo! Thats _exactly_ what I was worrying about. But after some
reflection I think I am in a much deeper morass. Multiple writer
threads are messing around with various parts of the map -- while this
is happening, whether or not iterators are invalidated, any reader
thread is going to see inconsistencies. So I am in party land
already..

Sorry group for the threading distraction....

Jun 14 '06 #12
Dilip wrote:

I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?


No, there's a deeper problem. The map is guaranteed to be properly
sorted when the code returns from an insertion or removal, but during
that operation it doesn't have to be. So halfway through an insertion it
can be in a nonsensical state, and if you try to read it from another
thread you'll get nonsense. The brute force solution is to make all
external operations on the map atomic, by locking a mutex when the
operation starts, and unlocking it when the operation ends. That will
work for your scenario, since you said you don't care about missing
updates in your reader. But more generally, multi-threading has to be
designed in from the top down, not hacked in from the bottom. As it is,
you're trying to put your socks on after you've put your shoes on.

--

Pete Becker
Roundhouse Consulting, Ltd.
Jun 14 '06 #13

Dilip wrote:
Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?


For my company's framework (FOST.3) we've been through this too. What
we did was to define a synchronisation object that allowed a limited
number of read locks or a single write lock. You can have a number of
readers, but only a single write thread at a time accessing the data
structure.

If you're writing about as often as you're reading this may not be the
best way. It really depends also on the larger context. Re-architecting
may be an option to sidestep the issue altogether, or using a different
map implementation. Some CPUs offer the possibility of lockless
structures, but they're very hard to implement properly.

Jun 14 '06 #14
Pete Becker wrote:
Dilip wrote:

I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?

No, there's a deeper problem. The map is guaranteed to be properly
sorted when the code returns from an insertion or removal, but during
that operation it doesn't have to be. So halfway through an insertion it
can be in a nonsensical state, and if you try to read it from another
thread you'll get nonsense.


Pete.. another poster (redfloyd?) also pointed out the same thing. I
am trying to wrap my head around this one. Just to make sure I
understand: there is no guarantee what can happen if you try to
perform any operation on a map when another operation is an interrupted
state, right? That is an incomplete operation leaves the internal
state of the map inconsistent. Just so that I can convince my
colleagues with some esoteric knowledge can you give me an example of
what _might_ be the state of a map when a thread performing an insert
operation is interrupted because its time-slice ran out and another
thread mistakenly tries to read from the same map?
The brute force solution is to make all
external operations on the map atomic, by locking a mutex when the
operation starts, and unlocking it when the operation ends.
If what I understood is correct do we even have a choice? I mean if
you are destined to use a map in a multithreaded environment what else
can one do? Right now I am trying to do something like this:

typedef std::map<std::string, SomeComplexStruct*> myMap;

I protect any access to the map whenever an element is being _inserted_
for the first time.

However if its just an update, I protect only the _find_ operation.
Later instead of locking the whole map, I lock only that instance of
SomeComplexStructure I retrieved and make whatever changes I need to.

Is this a reasonable approach? I understand this is fast getting OT
for this newsgroup -- I am not sure what to do if I need to continue to
engage Pete in this discussion. I am using the Dinkumware library
anyway.
That will
work for your scenario, since you said you don't care about missing
updates in your reader. But more generally, multi-threading has to be
designed in from the top down, not hacked in from the bottom. As it is,
you're trying to put your socks on after you've put your shoes on.


Well yeah as I am starting to discover the hole I stuck my foot is not
an opening of a shoe -- its a goddamn banyan tree. Forget putting my
socks, I don't even know how to take my foot out :-(

Jun 14 '06 #15
Kirit Sælensminde wrote:
Dilip wrote:
Thomas J. Gritzan wrote:
Dilip schrieb:
<OT>
You should not write to a variable in one thread while reading the same
variable in other threads. Its not just about invalidating iterators.


I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?


For my company's framework (FOST.3) we've been through this too. What
we did was to define a synchronisation object that allowed a limited
number of read locks or a single write lock. You can have a number of
readers, but only a single write thread at a time accessing the data
structure.


That smells like a ReaderWriterLock, right? I decided it might not be
of use to me since I make way too many updates to the map.

Jun 14 '06 #16
Dilip wrote:

Pete.. another poster (redfloyd?) also pointed out the same thing. I
am trying to wrap my head around this one. Just to make sure I
understand: there is no guarantee what can happen if you try to
perform any operation on a map when another operation is an interrupted
state, right? That is an incomplete operation leaves the internal
state of the map inconsistent. Just so that I can convince my
colleagues with some esoteric knowledge can you give me an example of
what _might_ be the state of a map when a thread performing an insert
operation is interrupted because its time-slice ran out and another
thread mistakenly tries to read from the same map?


Well, without looking at the code, one possibility is that one of the
tree nodes will have a child pointer that points to its parent. Another
is that some node will have multiple parents.

It's not a matter of what the map's internal structure might be, but of
whether there's anything you can make sense of. For example, if writing
the value of an int is not atomic, then it's possible to get a time
slice between writing two parts of its value; if another thread then
looks at the stored value, it will be nonsense.

unsigned int i = 0;

in one thread:
i = 0xffffffff;
If it gets sliced between writing the upper and lower halves of the
value, then the other thread sees a value of 0xffff0000 or 0x0000ffff.
Neither one makes sense. Same problem with pointer updates, which occur
frequently when rebalancing a tree.

--

Pete Becker
Roundhouse Consulting, Ltd.
Jun 15 '06 #17

Dilip wrote:
Kirit Sælensminde wrote:
Dilip wrote:
Thomas J. Gritzan wrote:
> Dilip schrieb:
> <OT>
> You should not write to a variable in one thread while reading the same
> variable in other threads. Its not just about invalidating iterators.

I just looked at the docs for std::map and it looks like for
insertion/removal no iterators are invalidated. So atleast in this
particular case, aren't I safe?


For my company's framework (FOST.3) we've been through this too. What
we did was to define a synchronisation object that allowed a limited
number of read locks or a single write lock. You can have a number of
readers, but only a single write thread at a time accessing the data
structure.


That smells like a ReaderWriterLock, right? I decided it might not be
of use to me since I make way too many updates to the map.


It is indeed exactly that sort of lock. One exclusive writer and many
readers.

You're going to need to use something like this though or re-architect
so that you don't need to lock at all. I'm afraid those are your only
safe options. To start with get it working correctly, then see if it is
too slow. If so then re-tihink what you're doing.

You will need to work out what the correct way to gain the locks is
going to be. Normally I would say that a writer attempting a lock would
stop any new readers, but depending on your application you may want to
allow them. Reducing the number of readers that you allow may allow the
writer to get in more quickly and improve the overall speed. Similarly,
increasing them might work better in your application. You'll need to
experiment to find out.

Clearly you also need to ensure that the locks are held for the
smallest possible time.

Jun 15 '06 #18
Dilip <rd*****@lycos.com> wrote:
Just to make sure I
understand: there is no guarantee what can happen if you try to
perform any operation on a map when another operation is an interrupted
state, right? That is an incomplete operation leaves the internal
state of the map inconsistent.


This is true, however my (perhaps simplistic) concept of
multithreading is that it does not involve "interrupts"
in the way that multitasking does. Two threads operate
without interruption or context switching upon the same memory.
Neither need be in an interrupted state for a conflict to
occur. Were this not true you could solve all your problems
with the appropriate kernel calls to raise/lower priority
level.

Steve
Jun 15 '06 #19
Pete Becker wrote:
Dilip wrote:

Pete.. another poster (redfloyd?) also pointed out the same thing. I
am trying to wrap my head around this one. Just to make sure I
understand: there is no guarantee what can happen if you try to
perform any operation on a map when another operation is an interrupted
state, right? That is an incomplete operation leaves the internal
state of the map inconsistent. Just so that I can convince my
colleagues with some esoteric knowledge can you give me an example of
what _might_ be the state of a map when a thread performing an insert
operation is interrupted because its time-slice ran out and another
thread mistakenly tries to read from the same map?


Well, without looking at the code, one possibility is that one of the
tree nodes will have a child pointer that points to its parent. Another
is that some node will have multiple parents.

It's not a matter of what the map's internal structure might be, but of
whether there's anything you can make sense of. For example, if writing
the value of an int is not atomic, then it's possible to get a time
slice between writing two parts of its value; if another thread then
looks at the stored value, it will be nonsense.

unsigned int i = 0;

in one thread:
i = 0xffffffff;
If it gets sliced between writing the upper and lower halves of the
value, then the other thread sees a value of 0xffff0000 or 0x0000ffff.
Neither one makes sense. Same problem with pointer updates, which occur
frequently when rebalancing a tree.


That's a known issue. The usual solution is to use atomic types
with the right guarantees, i.e. atomicity and acquire or release
semantics if needed. STL don't use atomic types so you don't know
if they can be used in this way since it's compiler and platform
dependent.

The more insideous problem is you can't just arbitrarily move things
around and expect that the proper semantics will hold. For the
purposes of illustration I'll use a binary search on a sorted array
as it's equivalent to a binary tree and easier to show in text.

You have a sorted array X[] = (1, 2).
Thread 1 wants to find if the value 2 exists.
Thread 1 examines X[0] and finds 1.
Thread 2 deletes the value 1 and adds the value 5.
X[] = (2, 5).
Thread 1 examines X[1] and finds 5 and concludes
the value 2 does not exist in the set when in fact
it always was there.

While it may be acceptable to not see a value being concurrently
added, or see a value that's being concurrently deleted, this
isn't one of thost cases.

PCOW (partial copy on write) will solve this problem although
the keeping the tree balanced is a little tricky. It's almost
certain that STL doesn't deal with this problem correctly without
a conventional lock based solution.
--
Joe Seigh

When you get lemons, you make lemonade.
When you get hardware, you make software.
Jun 15 '06 #20
Joe Seigh wrote:
Pete Becker wrote:
Dilip wrote:

Pete.. another poster (redfloyd?) also pointed out the same thing. I
am trying to wrap my head around this one. Just to make sure I
understand: there is no guarantee what can happen if you try to
perform any operation on a map when another operation is an interrupted
state, right? That is an incomplete operation leaves the internal
state of the map inconsistent. Just so that I can convince my
colleagues with some esoteric knowledge can you give me an example of
what _might_ be the state of a map when a thread performing an insert
operation is interrupted because its time-slice ran out and another
thread mistakenly tries to read from the same map?

Well, without looking at the code, one possibility is that one of the
tree nodes will have a child pointer that points to its parent.
Another is that some node will have multiple parents.

It's not a matter of what the map's internal structure might be, but
of whether there's anything you can make sense of. For example, if
writing the value of an int is not atomic, then it's possible to get a
time slice between writing two parts of its value; if another thread
then looks at the stored value, it will be nonsense.

unsigned int i = 0;

in one thread:
i = 0xffffffff;
If it gets sliced between writing the upper and lower halves of the
value, then the other thread sees a value of 0xffff0000 or 0x0000ffff.
Neither one makes sense. Same problem with pointer updates, which
occur frequently when rebalancing a tree.


That's a known issue. The usual solution is to use atomic types
with the right guarantees, i.e. atomicity and acquire or release
semantics if needed. STL don't use atomic types so you don't know
if they can be used in this way since it's compiler and platform
dependent.


Yes, that is the point of the example.

The more insideous problem is you can't just arbitrarily move things
around and expect that the proper semantics will hold.


It's essentially the same issue: if you don't have a guarantee of
atomicity, all bets are off. (Unless you have some other eplicit guarantee)

--

Pete Becker
Roundhouse Consulting, Ltd.
Jun 15 '06 #21

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Woodster | last post by:
I have declared the following std::map<std::string, std::string> myMap; to pass myMap to functions should I be declaring functions as: void function(std::map<std::string, std::string>); ...
1
by: Antti Granqvist | last post by:
Hello! I have following object relations: Competition 1--* Category 1--* Course 1 | | * Course
1
by: Saeed Amrollahi | last post by:
Dear All C++ Programmers Hello I am Saeed Amrollahi. I am a software engineer in Tehran Sewerage Company. I try to use std::map and map::find member function. I use Visual Studio .NET. my...
19
by: Erik Wikström | last post by:
First of all, forgive me if this is the wrong place to ask this question, if it's a stupid question (it's my second week with C++), or if this is answered some place else (I've searched but not...
3
by: Dan Trowbridge | last post by:
Hi everyone, In my attempt to port code from VS 6.0 to VS.NET I had some code break along the way, mostly due to not adhereing closely to the C++ standard. This may be another instance but I...
1
by: Avery Fong | last post by:
The following program will result in a compile error when building under Debug but will compile under Release. Why does is work under Release mode but not under Debug This program is developed...
13
by: kamaraj80 | last post by:
Hi I am using the std:: map as following. typedef struct _SeatRowCols { long nSeatRow; unsigned char ucSeatLetter; }SeatRowCols; typedef struct _NetData
5
by: Dilip | last post by:
Hi Folks I know the C++ standard doesn't talk about threads. However I am a little bit curious as to what might or might not happen with a particular scenario I encountered in my project. Its...
8
by: mveygman | last post by:
Hi, I am writing code that is using std::map and having a bit of an issue with its performance. It appears that the std::map is significantly slower searching for an element then a sequential...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.