469,282 Members | 1,815 Online
Bytes | Developer Community
New Post

Home Posts Topics Members FAQ

Post your question to a community of 469,282 developers. It's quick & easy.

Rationale behind copy semantics in STL containers.

Just a simple theoritical question to the experts.

What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?

Thanks
/P

Oct 25 '06 #1
35 2444
dragoncoder wrote:
Just a simple theoritical question to the experts.

What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?

Thanks
/P
Because to have a valid reference you must somewhere else have an object
and you must therefore be aware of the lifetime of that object. This
leaves open the possibilities of dangling references and memory leaks.
STL containers own their objects and clean up after themselves when the
container is destructed, essentially solving both of these problems.

If you really want the equivalent of a stored reference, you can make a
container of pointers-- likely that's what a reference is "under the
hood" anyway.
Oct 25 '06 #2

Mark P kirjoitti:
If you really want the equivalent of a stored reference, you can make a
container of pointers-- likely that's what a reference is "under the
hood" anyway.
Or more precisely, a memory address. ;)

Oct 26 '06 #3

dragoncoder wrote:
Just a simple theoritical question to the experts.

What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
Because C and C++ always has been value-based and always has allowed
you to use pointers in case you really wanted "reference" semantics. (I
believe the wording is from Java - thus it is misleading in a C++
context). And references do have quite a lot of overhead: that is why
"simple objects" such as integers are value-based in languages such as
Java (needing "boxing" to work as real objects).

/Peter

Oct 26 '06 #4

peter koch wrote:
dragoncoder wrote:
Just a simple theoritical question to the experts.

What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
Because C and C++ always has been value-based and always has allowed
you to use pointers in case you really wanted "reference" semantics. (I
believe the wording is from Java - thus it is misleading in a C++
context). And references do have quite a lot of overhead: that is why
"simple objects" such as integers are value-based in languages such as
Java (needing "boxing" to work as real objects).

/Peter
and containers of built-in types (e.g. int and double) and value types
(e.g., complex and pair) are very common and important.

-- Bjarne Stroustrup; http://www.reasearch.att.com/~bs

Oct 30 '06 #5

"bjarne" <bj****@gmail.comwrote in message
news:11**********************@i42g2000cwa.googlegr oups.com...
>
peter koch wrote:
>dragoncoder wrote:
Just a simple theoritical question to the experts.

What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
Because C and C++ always has been value-based and always has allowed
you to use pointers in case you really wanted "reference" semantics. (I
believe the wording is from Java - thus it is misleading in a C++
context). And references do have quite a lot of overhead: that is why
"simple objects" such as integers are value-based in languages such as
Java (needing "boxing" to work as real objects).
Peter:

Can you quantify/explain, "references do have quite a lot of overhead" a
bit?

Tony
Dec 13 '06 #6
On Dec 13, 2:31 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
"bjarne" <bja...@gmail.comwrote in messagenews:11**********************@i42g2000cwa.g ooglegroups.com...
peter koch wrote:
dragoncoder wrote:
Just a simple theoritical question to the experts.
What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
Because C and C++ always has been value-based and always has allowed
you to use pointers in case you really wanted "reference" semantics. (I
believe the wording is from Java - thus it is misleading in a C++
context). And references do have quite a lot of overhead: that is why
"simple objects" such as integers are value-based in languages such as
Java (needing "boxing" to work as real objects).Peter:

Can you quantify/explain, "references do have quite a lot of overhead" a
bit?
When using references/pointers you need to perform an additional
memory-access, first one when you get the pointer from the container
and a second one when following the pointer to the actual object. Take
for example a vector (in which the contained elements are stored
contiguously), what you need to keep in the processors cache when
working with it can be the iterator and the contained elements, if you
use pointers you need to keep the iterator, the contained elements
_and_ the actual data, which might be spread all over the place leading
to repeated cache-misses and severely affecting performance.

--
Erik Wikström

Dec 13 '06 #7

<er****@student.chalmers.sewrote in message
news:11**********************@f1g2000cwa.googlegro ups.com...
On Dec 13, 2:31 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
"bjarne" <bja...@gmail.comwrote in
messagenews:11**********************@i42g2000cwa.g ooglegroups.com...
peter koch wrote:
dragoncoder wrote:
Just a simple theoritical question to the experts.
What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
Because C and C++ always has been value-based and always has allowed
you to use pointers in case you really wanted "reference" semantics. (I
believe the wording is from Java - thus it is misleading in a C++
context). And references do have quite a lot of overhead: that is why
"simple objects" such as integers are value-based in languages such as
Java (needing "boxing" to work as real objects).Peter:

Can you quantify/explain, "references do have quite a lot of overhead" a
bit?
"When using references/pointers you need to perform an additional
memory-access, first one when you get the pointer from the container
and a second one when following the pointer to the actual object. Take
for example a vector (in which the contained elements are stored
contiguously), what you need to keep in the processors cache when
working with it can be the iterator and the contained elements, if you
use pointers you need to keep the iterator, the contained elements
_and_ the actual data, which might be spread all over the place leading
to repeated cache-misses and severely affecting performance."

OK. I thought he (you?) meant that references had some kind of overhead
over that which pointers have (something behind the scenes). The above
is obvious. "references do have quite a lot of overhead" sounds like
something more than just the extra level of indirection.

Tony
Dec 13 '06 #8
On Dec 13, 12:12 pm, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
<eri...@student.chalmers.sewrote in messagenews:11**********************@f1g2000cwa.go oglegroups.com...
On Dec 13, 2:31 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
peter koch wrote:
>dragoncoder wrote:
Just a simple theoritical question to the experts.
What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
>Because C and C++ always has been value-based and always has allowed
>you to use pointers in case you really wanted "reference" semantics.(I
>believe the wording is from Java - thus it is misleading in a C++
>context). And references do have quite a lot of overhead: that is why
>"simple objects" such as integers are value-based in languages such as
>Java (needing "boxing" to work as real objects).Peter:
Can you quantify/explain, "references do have quite a lot of overhead"a
bit?"When using references/pointers you need to perform an additional
memory-access, first one when you get the pointer from the container
and a second one when following the pointer to the actual object. Take
for example a vector (in which the contained elements are stored
contiguously), what you need to keep in the processors cache when
working with it can be the iterator and the contained elements, if you
use pointers you need to keep the iterator, the contained elements
_and_ the actual data, which might be spread all over the place leading
to repeated cache-misses and severely affecting performance."

OK. I thought he (you?) meant that references had some kind of overhead
over that which pointers have (something behind the scenes). The above
is obvious. "references do have quite a lot of overhead" sounds like
something more than just the extra level of indirection.
Maybe, I don't know what he was thinking of, but as others have pointed
out reference semantics is not clearly defined in C++ since we have
both pointers and references and neither of them are the same thing as
in Java or C#. <speculationIn those languages the references are more
than just pointers since they are also used to manage garbage
collection, this might have some kind of impact on performance too.
</speculation>

Notice that in certain applications the overhead of the extra
indirection can be quite a large part of the total time needed to
perform an operation, say for example if you have a really large
std::vector<int*and you go through it sum up all the elements, since
the actual operation is quite fast the overhead could be quite
noticeable, if you on the other hand performed a complicated operation
on each element the impact can be relatively small compared to the
total running time.

--
Erik Wikström

Dec 13 '06 #9
On Wed, 13 Dec 2006 01:31:40 GMT, "Tony" wrote:
>>dragoncoder wrote:
Just a simple theoritical question to the experts.
What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
STL uses 'value semantics' because Alexander Stepanov's view of
programming is rooted in the functional programming 'paradigm' which
is value based (Stepanov is the designer of STL). Anyone interested in
STL should know the basics of functional programming.
Generic programming (programing with templates) per se is not bound to
(i.e. independant of) value semantics. Generic containers and
functions (algorithms) can be designed with reference semantics alike
(just not by Stepanov).
Interestingly, the the theoretical background of STL is never
explained, at least not in the available books and (online) articles I
know. The probably best-selling STL-book, Scott Meyer's 'Effective
STL', not even mentions value semantics. Some information can be found
at the collection of Stepanov's papers: http://www.stepanovpapers.com/
Best wishes,
Roland Pibinger
Dec 13 '06 #10

er****@student.chalmers.se wrote:
On Dec 13, 2:31 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
"bjarne" <bja...@gmail.comwrote in messagenews:11**********************@i42g2000cwa.g ooglegroups.com...
peter koch wrote:
>dragoncoder wrote:
Just a simple theoritical question to the experts.
What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
>Because C and C++ always has been value-based and always has allowed
>you to use pointers in case you really wanted "reference" semantics. (I
>believe the wording is from Java - thus it is misleading in a C++
>context). And references do have quite a lot of overhead: that is why
>"simple objects" such as integers are value-based in languages such as
>Java (needing "boxing" to work as real objects).Peter:
Can you quantify/explain, "references do have quite a lot of overhead" a
bit?

When using references/pointers you need to perform an additional
memory-access, first one when you get the pointer from the container
and a second one when following the pointer to the actual object. Take
for example a vector (in which the contained elements are stored
contiguously), what you need to keep in the processors cache when
working with it can be the iterator and the contained elements, if you
use pointers you need to keep the iterator, the contained elements
_and_ the actual data, which might be spread all over the place leading
to repeated cache-misses and severely affecting performance.
Yes. Great. Means we have container library able to store just values.

Translates in container library really efficient only for fundamental
and trivial concrete types.

Therefore you have to store anything non-copyable (or hard to copyable)
as pointer.

Means we are getting all troubles described above, plus the problem of
that container is no more able to manage object lifetime.

Regards,

Mirek

Dec 13 '06 #11

Roland Pibinger wrote:
On Wed, 13 Dec 2006 01:31:40 GMT, "Tony" wrote:
Generic programming (programing with templates) per se is not bound to
(i.e. independant of) value semantics. Generic containers and
functions (algorithms) can be designed with reference semantics alike
(just not by Stepanov).
BTW, lately I am less sure about permutating algorithms. Maybe things
like "sort" or "remove_if" should rather be implemented as container
methods. (I will most likely do that in next U++ Core iteration).

In fact, even current value based STL has to deal with this to some
degree.

Mirek

Dec 13 '06 #12

Tony skrev:
<er****@student.chalmers.sewrote in message
news:11**********************@f1g2000cwa.googlegro ups.com...
On Dec 13, 2:31 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
"bjarne" <bja...@gmail.comwrote in
messagenews:11**********************@i42g2000cwa.g ooglegroups.com...
peter koch wrote:
>dragoncoder wrote:
Just a simple theoritical question to the experts.
What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
>Because C and C++ always has been value-based and always has allowed
>you to use pointers in case you really wanted "reference" semantics. (I
>believe the wording is from Java - thus it is misleading in a C++
>context). And references do have quite a lot of overhead: that is why
>"simple objects" such as integers are value-based in languages such as
>Java (needing "boxing" to work as real objects).Peter:
Can you quantify/explain, "references do have quite a lot of overhead" a
bit?

"When using references/pointers you need to perform an additional
memory-access, first one when you get the pointer from the container
and a second one when following the pointer to the actual object. Take
for example a vector (in which the contained elements are stored
contiguously), what you need to keep in the processors cache when
working with it can be the iterator and the contained elements, if you
use pointers you need to keep the iterator, the contained elements
_and_ the actual data, which might be spread all over the place leading
to repeated cache-misses and severely affecting performance."

OK. I thought he (you?) meant that references had some kind of overhead
over that which pointers have (something behind the scenes). The above
is obvious. "references do have quite a lot of overhead" sounds like
something more than just the extra level of indirection.
Well, I do not believe you should call it "just one extra level of
indirection". It is something that easily equates to a factor of 10 (or
more) in performance on modern processors (due to cache-misses).

/Peter
>
Tony
Dec 13 '06 #13
On 13 Dec 2006 04:18:30 -0800, "Mirek Fidler" wrote:
>Yes. Great. Means we have container library able to store just values.
Yes, because it was designed only for values but not for objects. Not
for some obscure performance reasons.
>Translates in container library really efficient only for fundamental
and trivial concrete types.

Therefore you have to store anything non-copyable (or hard to copyable)
as pointer.
You have to store pointers anyway (unless you use some inplace
creation method). The user interface need not be a 'pointer
interface'. But i guess you know that already :-)
>Means we are getting all troubles described above, plus the problem of
that container is no more able to manage object lifetime.
Don't fall into the same trap as Boost. Containment and ownership are
distinct and independant responsibilities.

Best wishes,
Roland Pibinger
Dec 13 '06 #14
On 13 Dec 2006 04:27:10 -0800, "Mirek Fidler" wrote:
>Roland Pibinger wrote:
>Generic programming (programing with templates) per se is not bound to
(i.e. independant of) value semantics. Generic containers and
functions (algorithms) can be designed with reference semantics alike
(just not by Stepanov).

BTW, lately I am less sure about permutating algorithms. Maybe things
like "sort" or "remove_if" should rather be implemented as container
methods. (I will most likely do that in next U++ Core iteration).
'sort' should be a container function, at least for convenience. Other
mutating algorithms must somehow know that they have to swap
(underlying) pointers instead of pointed-to objects.
Any fundamental changes in the next U++ Core version?

Best wishes,
Roland Pibinger
Dec 13 '06 #15

Roland Pibinger wrote:
On 13 Dec 2006 04:27:10 -0800, "Mirek Fidler" wrote:
Roland Pibinger wrote:
Generic programming (programing with templates) per se is not bound to
(i.e. independant of) value semantics. Generic containers and
functions (algorithms) can be designed with reference semantics alike
(just not by Stepanov).
BTW, lately I am less sure about permutating algorithms. Maybe things
like "sort" or "remove_if" should rather be implemented as container
methods. (I will most likely do that in next U++ Core iteration).

'sort' should be a container function, at least for convenience. Other
mutating algorithms must somehow know that they have to swap
(underlying) pointers instead of pointed-to objects.
Unfortunately, the problem is that swapping is often not enough... E.g.
you cannot implement stable_sort or remove_if using the swap (AFAIK!
:).
Any fundamental changes in the next U++ Core version?
No, nothing really fundamental. Maybe getting rid of last traces of
iterators and this one.

U++ Core works as expected (means great) for years now... Actually, the
only trouble is exactly this - stable sort of polymorphic containers.
Easy to be implemented as method, impossible as algorithm. OTOH, in
reality, I never really needed stable sort of polymorphic container :)

Mirek

Dec 13 '06 #16
Don't fall into the same trap as Boost. Containment and ownership are
distinct and independant responsibilities.
Well, IME it is really a good idea to merge two.

Mirek

Dec 13 '06 #17

Tony wrote:
<er****@student.chalmers.sewrote in message
news:11**********************@f1g2000cwa.googlegro ups.com...
On Dec 13, 2:31 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
"bjarne" <bja...@gmail.comwrote in
messagenews:11**********************@i42g2000cwa.g ooglegroups.com...
peter koch wrote:
>dragoncoder wrote:
Just a simple theoritical question to the experts.
What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost always
make things easier without much of overhead. Then why not reference ?
>Because C and C++ always has been value-based and always has allowed
>you to use pointers in case you really wanted "reference" semantics. (I
>believe the wording is from Java - thus it is misleading in a C++
>context). And references do have quite a lot of overhead: that is why
>"simple objects" such as integers are value-based in languages such as
>Java (needing "boxing" to work as real objects).Peter:
Can you quantify/explain, "references do have quite a lot of overhead" a
bit?

"When using references/pointers you need to perform an additional
memory-access, first one when you get the pointer from the container
and a second one when following the pointer to the actual object. Take
for example a vector (in which the contained elements are stored
contiguously), what you need to keep in the processors cache when
working with it can be the iterator and the contained elements, if you
use pointers you need to keep the iterator, the contained elements
_and_ the actual data, which might be spread all over the place leading
to repeated cache-misses and severely affecting performance."

OK. I thought he (you?) meant that references had some kind of overhead
over that which pointers have (something behind the scenes). The above
is obvious. "references do have quite a lot of overhead" sounds like
something more than just the extra level of indirection.
There's also a hidden cost of pretty much destroying optimization
opportunities on the part of traditional compilers.

AFAIK, compilers won't register allocate pointed-to values. Not really
a problem in containers, but to efficently use primitive types if they
were accessed through an indirection would require compilers to get
smarter.

Evan

Dec 13 '06 #18

"peter koch" <pe***************@gmail.comwrote in message
news:11**********************@n67g2000cwd.googlegr oups.com...
>
Tony skrev:
><er****@student.chalmers.sewrote in message
news:11**********************@f1g2000cwa.googlegr oups.com...
On Dec 13, 2:31 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
"bjarne" <bja...@gmail.comwrote in
messagenews:11**********************@i42g2000cwa.g ooglegroups.com...

peter koch wrote:
dragoncoder wrote:
Just a simple theoritical question to the experts.

What was the rationale behind making STL containers follow copy
semantics rather than reference semantics. References almost
always
make things easier without much of overhead. Then why not
reference ?

Because C and C++ always has been value-based and always has allowed
you to use pointers in case you really wanted "reference" semantics.
(I
believe the wording is from Java - thus it is misleading in a C++
context). And references do have quite a lot of overhead: that is
why
"simple objects" such as integers are value-based in languages such
as
Java (needing "boxing" to work as real objects).Peter:

Can you quantify/explain, "references do have quite a lot of overhead"
a
bit?

"When using references/pointers you need to perform an additional
memory-access, first one when you get the pointer from the container
and a second one when following the pointer to the actual object. Take
for example a vector (in which the contained elements are stored
contiguously), what you need to keep in the processors cache when
working with it can be the iterator and the contained elements, if you
use pointers you need to keep the iterator, the contained elements
_and_ the actual data, which might be spread all over the place leading
to repeated cache-misses and severely affecting performance."

OK. I thought he (you?) meant that references had some kind of overhead
over that which pointers have (something behind the scenes). The above
is obvious. "references do have quite a lot of overhead" sounds like
something more than just the extra level of indirection.

Well, I do not believe you should call it "just one extra level of
indirection". It is something that easily equates to a factor of 10 (or
more) in performance on modern processors (due to cache-misses).
The cache misses are a container implementation issue and not something
inherent to references.

Tony
Dec 14 '06 #19

Tony skrev:
"peter koch" <pe***************@gmail.comwrote in message
news:11**********************@n67g2000cwd.googlegr oups.com...
[snip]

Well, I do not believe you should call it "just one extra level of
indirection". It is something that easily equates to a factor of 10 (or
more) in performance on modern processors (due to cache-misses).

The cache misses are a container implementation issue and not something
inherent to references.
They certainly can't be a container implementation issue. At best it
can be a memory allocation issue (requiring subsequent allocation to be
sequential in memory), but that would require that those allocations
happens in an optimal order - e.g. allocate element n before element n
+1 and require no allocations in between.

/Peter

Dec 14 '06 #20

"peter koch" <pe***************@gmail.comwrote in message
news:11**********************@n67g2000cwd.googlegr oups.com...
>
Tony skrev:
>"peter koch" <pe***************@gmail.comwrote in message
news:11**********************@n67g2000cwd.googleg roups.com...
[snip]
>
Well, I do not believe you should call it "just one extra level of
indirection". It is something that easily equates to a factor of 10 (or
more) in performance on modern processors (due to cache-misses).

The cache misses are a container implementation issue and not something
inherent to references.

They certainly can't be a container implementation issue. At best it
can be a memory allocation issue (requiring subsequent allocation to be
sequential in memory),
I consider that part of the container architecture. But yes, it's more
appropriately categorized as memory management.
but that would require that those allocations
happens in an optimal order - e.g. allocate element n before element n
+1 and require no allocations in between.
Well for large datasets the clustering would be important. Otherwise not.
But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.

Tony
Dec 14 '06 #21

Tony wrote:
But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.
Very important special case. In fact, node-based containers (like
std::list) are almost always outperformed by vector in any real piece
of code. Continual storage simply wins.

And do not forget that it also applies to deque.

Mirek

Dec 14 '06 #22

peter koch wrote:
Well, I do not believe you should call it "just one extra level of
indirection". It is something that easily equates to a factor of 10 (or
more) in performance on modern processors (due to cache-misses).
Being curios, I decided to put this into the test in my recent
benchmark U++ Core benchmark

http://www.ultimatepp.org/www$uppweb2$vsstd$en-us.html

because testing is quite simple for U++ - I have just replaced
VectorMap by ArrayMap.

And the result was - in this test-case, it did not make any difference.

I believe you would get similar results for any nontrivial T.

Mirek

Dec 14 '06 #23

Tony skrev:
[snip]
>
Well for large datasets the clustering would be important. Otherwise not.
But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.
Well ... for small datasets, nothing really matters much. And I agree
that my arguments are relevant mostly for std::vector and std::deque,
but it is still relevant for other containertypes - it will be one
extra indirection and one more strain on the cache.

/Peter

Dec 14 '06 #24

"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:11**********************@f1g2000cwa.googlegro ups.com...
>
Tony wrote:
>But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.

Very important special case. In fact, node-based containers (like
std::list) are almost always outperformed by vector in any real piece
of code. Continual storage simply wins.

And do not forget that it also applies to deque.
I choose containers first based upon appropriateness of the abstraction.
I do performance optimization (perhaps shoe-horning into another
container type) only on a need-to basis. I tend to think of node-based
containers as "containers" and vector as "array". Of course if I were
to just put pointers to objects in a vector, it would be more "container"
like.

Tony
Dec 14 '06 #25

"peter koch" <pe***************@gmail.comwrote in message
news:11**********************@f1g2000cwa.googlegro ups.com...
>
Tony skrev:
[snip]
>>
Well for large datasets the clustering would be important. Otherwise not.
But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.

Well ... for small datasets, nothing really matters much. And I agree
that my arguments are relevant mostly for std::vector and std::deque,
but it is still relevant for other containertypes - it will be one
extra indirection and one more strain on the cache.
I just wanted to verify that there wasn't something inherent in reference
implementation, other than another level of indirection, that was going
on behind the scenes. The cache thing I can manage.

Tony
Dec 14 '06 #26

Tony wrote:
"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:11**********************@f1g2000cwa.googlegro ups.com...

Tony wrote:
But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.
Very important special case. In fact, node-based containers (like
std::list) are almost always outperformed by vector in any real piece
of code. Continual storage simply wins.

And do not forget that it also applies to deque.

I choose containers first based upon appropriateness of the abstraction.
What abstract operations vector lacks and list has? (OK, there are some
maybe, but in 99.9% cases, vector abstraction is superset of others).
(OK2, I am not using std::vector, but U++ Vector and Array, so my
abstraction options are a little bit wider ;)

Mirek

Dec 14 '06 #27

"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:11**********************@73g2000cwn.googlegro ups.com...
>
Tony wrote:
>"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:11**********************@f1g2000cwa.googlegr oups.com...
>
Tony wrote:

But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.

Very important special case. In fact, node-based containers (like
std::list) are almost always outperformed by vector in any real piece
of code. Continual storage simply wins.

And do not forget that it also applies to deque.

I choose containers first based upon appropriateness of the abstraction.

What abstract operations vector lacks and list has? (OK, there are some
maybe, but in 99.9% cases, vector abstraction is superset of others).
(OK2, I am not using std::vector, but U++ Vector and Array, so my
abstraction options are a little bit wider ;)
I meant more like: "I need an associative container where order counts,
hence
I'll choose a tree-based map container". In your example, vector vs. list,
"list" to me has much more connotation than vector ('vector', to me, sounds
more like something that has magnitude and direction than something
container-like btw).

I've heard it said that one should use vector unless there's good reason not
to, but I think it is better to choose a container that "looks and feels
right"
given the context. A lot of programs today have much more leeway because
the processor performance is so good. For instance, I use a map to map
platform window handles to my window class. How many windows is
a GUI program likely to have? 100? 1000? Those are trivial numbers
compared to what containers are designed for. I could stuff all those
key-value pairs into an array and search through all of them all of the time
and it probably wouldn't make one bit of difference to the user. But I
choose
a map anyway because it "looks and feels right".

Tony
Dec 15 '06 #28
On Dec 15, 2:01 am, "Tony" <rdnewsNOSPAM2...@sbcglobal.netwrote:
"Mirek Fidler" <c...@ntllib.orgwrote in messagenews:11**********************@73g2000cwn.go oglegroups.com...
Tony wrote:
"Mirek Fidler" <c...@ntllib.orgwrote in message
news:11**********************@f1g2000cwa.googlegr oups.com...
Tony wrote:
But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.
Very important special case. In fact, node-based containers (like
std::list) are almost always outperformed by vector in any real piece
of code. Continual storage simply wins.
And do not forget that it also applies to deque.
I choose containers first based upon appropriateness of the abstraction.
What abstract operations vector lacks and list has? (OK, there are some
maybe, but in 99.9% cases, vector abstraction is superset of others).
(OK2, I am not using std::vector, but U++ Vector and Array, so my
abstraction options are a little bit wider ;)

I meant more like: "I need an associative container where order counts,
hence I'll choose a tree-based map container". In your example, vector vs.. list,
"list" to me has much more connotation than vector ('vector', to me, sounds
more like something that has magnitude and direction than something
container-like btw).
I always as myself this question when I need to choose a new container,
does the choice noticeable affect performance? If it isn't then I take
the container that allows me to do more with less code, in other words
the one with an interface and set of operations that best matches when
I need to do. And so, I think, does most of us in here.
I've heard it said that one should use vector unless there's good reason not
to, but I think it is better to choose a container that "looks and feels
right" given the context.
Well, it's up to you to decide what a "good reason" is, but I think
that what they mean is that if you need a container and don't have any
special requirements a vector is always to prefer, i.e. for many things
you could use a list just as well as a vector, in these cases go for
the vector.

--
Erik Wikström

Dec 15 '06 #29

Tony wrote:
"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:11**********************@73g2000cwn.googlegro ups.com...

Tony wrote:
"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:11**********************@f1g2000cwa.googlegro ups.com...

Tony wrote:

But aren't you really only talking about vector and not node based
containers in comparison? vector is special case.

Very important special case. In fact, node-based containers (like
std::list) are almost always outperformed by vector in any real piece
of code. Continual storage simply wins.

And do not forget that it also applies to deque.

I choose containers first based upon appropriateness of the abstraction.
What abstract operations vector lacks and list has? (OK, there are some
maybe, but in 99.9% cases, vector abstraction is superset of others).
(OK2, I am not using std::vector, but U++ Vector and Array, so my
abstraction options are a little bit wider ;)

I meant more like: "I need an associative container where order counts,
hence
I'll choose a tree-based map container".
Sure, associative is another problem.
In your example, vector vs. list,
"list" to me has much more connotation than vector ('vector', to me, sounds
more like something that has magnitude and direction than something
container-like btw).
So you have the problem with the name? :)
I've heard it said that one should use vector unless there's good reason not
to
Actually, as long as it is not associative, there are no good reasons
to choose std::list.

Think is, std::list advantage seemingly is O(1) ability to
remove/insert element everywhere. But the trouble is that you must know
which one to remove/where to insert.

Maybe there are usage patterns which naturally lead to keeping some
iterator(s) into the middle of list, but I never met one.

So in most cases, you will want to find that position. And thanks to
node-based nature of list, iterating is way slower than std::vector, so
before you get your position, your O(1) insert/remove advantage is
lost.
A lot of programs today have much more leeway because
the processor performance is so good. For instance, I use a map to map
platform window handles to my window class. How many windows is
a GUI program likely to have? 100? 1000?
[off-topic]
A good GUI program will have about 1-10, as many as number of top-level
windows. Anyway, I am still using VectorMap to map handles to objects.
Just like you said, because it feels better ;)
[/off-topic]

100-1000 is quite a lot. You should always use something else than
linear search here.
Those are trivial numbers
compared to what containers are designed for. I could stuff all those
key-value pairs into an array and search through all of them all of the time
and it probably wouldn't make one bit of difference to the user. But I
choose
a map anyway because it "looks and feels right".
Yes, you are doing the right thing here. (Well, you can have
vector-like associative containers and they are much faster than binary
trees or other node-based maps, but that is another story ;)

Mirek

Dec 15 '06 #30

Mirek Fidler wrote:
Think is, std::list advantage seemingly is O(1) ability to
remove/insert element everywhere. But the trouble is that you must know
which one to remove/where to insert.
That's hardly what most projects I've worked on use std::list or linked
lists in general. Often when linked list -like data structure is
chosen, it is NOT relevant at all what order objects are in the
container.

The primary concern is that you can create and delete objects
*cheaply*, and still have a way of keeping track of all objects you
have created. Deleting object from *middle* of container with array
semantics isn't very efficient.

The std::list has it's own problems, example, we want to keep track of
objects which are created with factory dp.

Example code:

texture* s = device->createTexture("bla.jpg");
s->release();

OK, let's assume device is the factory object which also manages
texture objects. The s->release() method implements reference counting
and when the reference count reaches zero delete is invoked.

If we keep the texture objects in std::list, finding objects for
deletion is iterating through the factory's texture object container
(or generic object container) until the value "s" is found.
Inefficient, this invites usage of std::map or similar container.

On the other hand, if the texture inherits from custom linked list
type, which *includes* (implementation detail, for example) pointers to
previous and next object in the sequence in the container, the removal
by value of s is trivial (and very fast). There should be a mechanism
to validate that s is actually part of the sequence it is removed from.
One way is to have pointer to the "current owner container" in the node
object.

class object
{
container* m_container;
object* m_prev;
object* m_next;
int m_refcount;
...
};

class texture : public object
{ ... };

But such design is more closer to what you might do in code you write
in C, not C++ .. it is a reasonable trade off to avoid writing custom
code for this kind of use case to just use std::set and be done with
it.

std::vector, or anykind of container with array semantics doesn't cut
it. We are NOT interested in inserting new objects in specific location
in the container, we just want to insert new stuff and remove *random*
objects out without significant overhead.

Maybe there are usage patterns which naturally lead to keeping some
iterator(s) into the middle of list, but I never met one.
Above you see one not very uncommon case where this might be
requirement. Believe it or not, std::set is MUCH better solution to
this particular problem to specificly AVOID having tangling iterators
as object handles, that would be plain stupid wouldn't it?

So in most cases, you will want to find that position. And thanks to
node-based nature of list, iterating is way slower than std::vector, so
before you get your position, your O(1) insert/remove advantage is
lost.
Speaking of iteration, of course, even with std::set for the use case
above, iterating the tree is still required. But that's precisely why
the tree is preferred over flat array or linked list.

The best is if pointer-to-object is fast to resolve into
pointer-to-node-in-sequence. The best case is possible if
pointer-to-object IS pointer-to-node-in-sequence (eg. custom code). But
profiling has revealed that it it not worth the effort for the kind of
uses *I* have put this kind of code into.

A lot of programs today have much more leeway because
the processor performance is so good. For instance, I use a map to map
platform window handles to my window class. How many windows is
a GUI program likely to have? 100? 1000?
Sensible approach. Likewise. I choose the std::set for the task I used
to illustrational purposes for the reason that it scales well. For
small dataset, it doesn't matter what you use, it's fast anyways. This
scales pretty nice to medium complexity too. With really large amount
of objects, 1,000,000+ it would start to take noticeable hit compared
to "direct lookup", but the dataset will not be likely to reach that
size for for a while, past decade we have been hovering between 10 and
a few thousands, tops, depending on the application.

The tops figure cancels out the brute force linear search, is still
manageable without any significant runtime overhead to tree and doesn't
yet warrant custom code.

Yes, you are doing the right thing here. (Well, you can have
vector-like associative containers and they are much faster than binary
trees or other node-based maps, but that is another story ;)
While I agree with the sentiments that array is better caching wise, it
should only be concern if there is no other design criteria for
choosing a specific kind of container. For random or random-like access
the memory bandwidth isn't as relevant as latency, for caching point of
view. The latency can be amortized by design, don't use the data
immediately after reading and you won't stall. Example:

if ( object->foo )
bar( ... );

We have to read from "foo" which is behind a pointer, until we know if
we will branch or not. If the predicate says "true" here, we have to
call.. but we cannot make the call until the value is read, if it comes
off cache, we have high latency. On the other hand, it is possible to
work around memory latency with additional gates in the CPU, we could
decode ahead of the IP and have multiple look-ahead contexts, and
"branch" to the context that was actually taken after the predicate can
be resolved.

How far ahead we can execute depends on how much area we are willing to
sacrifice to implement this logic. There is likely to be much more
productive use for the gates as this problem can be solved on the
software side on performance critical sections of the code, which are
far and wide between most of the code we will ever have to write. How
did it go, 97.8735433 214 (?) % of code isn't performance critical?

Whatever.

Dec 15 '06 #31

persenaama wrote:
Mirek Fidler wrote:
Think is, std::list advantage seemingly is O(1) ability to
remove/insert element everywhere. But the trouble is that you must know
which one to remove/where to insert.

That's hardly what most projects I've worked on use std::list or linked
lists in general. Often when linked list -like data structure is
chosen, it is NOT relevant at all what order objects are in the
container.
Wait a moment right there.

I was speaking about std::list as *value container*, not about list
structures in general.

List structures are great if you have to maintain groups of objects,
which logically belong elsewhere - which I think is exactly you are try
to explain. Heck, I even have a template class to make some object part
of such list.

But list *container* is next to useless.
If we keep the texture objects in std::list, finding objects for
deletion is iterating through the factory's texture object container
(or generic object container) until the value "s" is found.
Yep, that is what I meant.
On the other hand, if the texture inherits from custom linked list
type, which *includes* (implementation detail, for example) pointers to
previous and next object in the sequence in the container, the removal
by value of s is trivial (and very fast).
Yes, I agree. That is what I like to do as well ;)

Mirek

Dec 15 '06 #32
Mirek Fidler wrote:
I was speaking about std::list as *value container*, not about list
structures in general.
...

But list *container* is next to useless.
Couldn't agree more! Also noticed my thoughts went to all kinds of
random directions, which is kind of lame. :(

Dec 15 '06 #33
Roland Pibinger wrote:
The probably best-selling STL-book, Scott Meyer's 'Effective
STL', not even mentions value semantics.
We must not agree on what value semantics means. See Item 3, "Make
copying cheap and correct for objects in containers."

Dec 15 '06 #34

"Mirek Fidler" <cx*@ntllib.orgwrote in message
news:11*********************@j72g2000cwa.googlegro ups.com...
>
Tony wrote:
>A lot of programs today have much more leeway because
the processor performance is so good. For instance, I use a map to map
platform window handles to my window class. How many windows is
a GUI program likely to have? 100? 1000?

[off-topic]
A good GUI program will have about 1-10, as many as number of top-level
windows. Anyway, I am still using VectorMap to map handles to objects.
Just like you said, because it feels better ;)
[/off-topic]
All the buttons and stuff are windows too.
100-1000 is quite a lot. You should always use something else than
linear search here.
I use a hash map.
>Those are trivial numbers
compared to what containers are designed for. I could stuff all those
key-value pairs into an array and search through all of them all of the
time
and it probably wouldn't make one bit of difference to the user. But I
choose
a map anyway because it "looks and feels right".

Yes, you are doing the right thing here. (Well, you can have
vector-like associative containers and they are much faster than binary
trees or other node-based maps, but that is another story ;)
Tony

Dec 17 '06 #35

Tony wrote:
am likely to have? 100? 1000?

[off-topic]
A good GUI program will have about 1-10, as many as number of top-level
windows. Anyway, I am still using VectorMap to map handles to objects.
Just like you said, because it feels better ;)
[/off-topic]

All the buttons and stuff are windows too.
Only if the library you are using is poorly designed ;)

Mirek

Dec 17 '06 #36

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

3 posts views Thread by Naren | last post: by
1 post views Thread by Kay Schluehr | last post: by
17 posts views Thread by baibaichen | last post: by
26 posts views Thread by saxenavaibhav17 | last post: by
17 posts views Thread by qazmlp1209 | last post: by
1 post views Thread by CARIGAR | last post: by
reply views Thread by suresh191 | last post: by
By using this site, you agree to our Privacy Policy and Terms of Use.