473,406 Members | 2,707 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,406 software developers and data experts.

Efficiency of map and vector iterators



I'm curious why I might be getting such a large performance
difference between using a map iterator and a vector iterator.
This is a computational electromagnetics code where I have to
separate points in space that arrive randomly into groupings
based on (x, y, z) dimensions, so I use std::map to do that.
Once the sorting is done I need to find the interactions between
groups, so I've been using iterators to step through the map.
As a test, I then modified the code a bit to first copy the
groups from the map into a vector so I could use a vector
iterator. I didn't expect any significant performance difference
since there are heavy floating point calculations going on at
each step. However, with Intel C++ 7.1 the compuation time to do
the nested loops below decreased from 78.5 seconds to 60 seconds
when I switched to the vector. The difference with GCC was less
severe, but it still decreased from 81.5 seconds with the map to
73.1 seconds with the vector.

My question is why there is such a dramatic difference in
these two cases. With all the floating point calculations going
on, I can't imagine that the actual time to increment the
iterators is having an effect. Could the map template be
preventing optimizations that the vector template alone
permits? (I used -O3 optimization in both cases.) Is this
a typical penalty paid when iterating through a map instead
of a vector?
Code sample with map or vector implementation chosen using
#define statements. Note that I added the pointers to the map
elements to allow the same code to be used below with either
iterator. This had no effect on the execution time.

using namespace std;

typedef map<DIMS<int> , GROUP> GROUP_MAP; // DIMS has x, y, z indices
vector<GROUP_MAP> groups_map;

#ifdef USE_GROUPS_VECTOR
typedef vector<GROUP> GROUP_VECTOR;
vector<GROUP_VECTOR> groups_vector;

.... // Copy groups from map to vector

#endif

for (int level = 0; level < LEVELS; ++level) {
#ifdef USE_GROUPS_MAP // Slower
for (GROUP_MAP::iterator M_it = groups_map[level].begin();
M_it != groups_map[level].end(); ++M_it) {
GROUP *m_group = &(M_it->second); // For compatibility below
for (GROUP_MAP::iterator N_it = groups_map[level].begin();
N_it != groups_map[level].end(); ++N_it) {
GROUP *n_group = &(N_it->second); // For compatibility below
#endif
#ifdef USE_GROUPS_VECTOR // Faster
for (GROUP_VECTOR::iterator m_group = groups_vector[level].begin();
m_group != groups_vector[level].end(); ++m_group) {
for (GROUP_VECTOR::iterator n_group = groups_vector[level].begin();
n_group != groups_vector[level].end(); ++n_group) {
#endif
... // Heavy floating-point calculations, calling Fortran/C libraries
}
}
}
Jul 22 '05 #1
14 4985
Jim West wrote:
I'm curious why I might be getting such a large performance
difference between using a map iterator and a vector iterator.


Incrementing a vector iterator is usually a single instruction (add
offset) while a map iterator is usually a memory lookup (or many memory
lookups) and hence more expensive computationally.

No surprises here.
Jul 22 '05 #2
In article <br********@dispatch.concentric.net>, Gianni Mariani wrote:
Jim West wrote:
I'm curious why I might be getting such a large performance
difference between using a map iterator and a vector iterator.


Incrementing a vector iterator is usually a single instruction (add
offset) while a map iterator is usually a memory lookup (or many memory
lookups) and hence more expensive computationally.


I understand that. However, I do six fast Fourier transforms (lengths
up to 100s) and several matrix-vector multiplies within each loop.
I can't imagine that the time needed to increment the iterator could be
significant compared to the time needed to do the floating-point
computations. And yet I'm getting a 30% decrease in execution speed
when using a map with Intel C++. Hence my concern that there is
something else fundamentally different between the map and vector
templates that is having an effect that I am not aware of (and
should be!).
Jul 22 '05 #3
"Jim West" <eg***********@yahoo.com> wrote in message
news:sl********************@detrick.westlan...
In article <br********@dispatch.concentric.net>, Gianni Mariani wrote:
Jim West wrote:
I'm curious why I might be getting such a large performance
difference between using a map iterator and a vector iterator.


Incrementing a vector iterator is usually a single instruction (add
offset) while a map iterator is usually a memory lookup (or many memory
lookups) and hence more expensive computationally.


I understand that. However, I do six fast Fourier transforms (lengths
up to 100s) and several matrix-vector multiplies within each loop.
I can't imagine that the time needed to increment the iterator could be
significant compared to the time needed to do the floating-point
computations. And yet I'm getting a 30% decrease in execution speed
when using a map with Intel C++. Hence my concern that there is
something else fundamentally different between the map and vector
templates that is having an effect that I am not aware of (and
should be!).


Yes, that sounds fishy to me. I wouldn't be surprised if, somehow, the
number of calculations you are running is actually different in the two
cases.

--
Cy
http://home.rochester.rr.com/cyhome/
Jul 22 '05 #4
Jim West wrote:

I understand that. However, I do six fast Fourier transforms (lengths
up to 100s) and several matrix-vector multiplies within each loop.
I can't imagine that the time needed to increment the iterator could be
significant compared to the time needed to do the floating-point
computations. And yet I'm getting a 30% decrease in execution speed
when using a map with Intel C++. Hence my concern that there is
something else fundamentally different between the map and vector
templates that is having an effect that I am not aware of (and
should be!).

Can your run a profiler ?

Jul 22 '05 #5
Jim West wrote:

I understand that. However, I do six fast Fourier transforms (lengths
up to 100s) and several matrix-vector multiplies within each loop.
I can't imagine that the time needed to increment the iterator could be
significant compared to the time needed to do the floating-point
computations. And yet I'm getting a 30% decrease in execution speed
when using a map with Intel C++. Hence my concern that there is
something else fundamentally different between the map and vector
templates that is having an effect that I am not aware of (and
should be!).


Well, you can always take a look into the actual implementation of
'std::map<>::iterator's increment operator, see how "heavy" it really is
and decide whether the observed 30% reduction in execution speed is
justified.

Also, some other factors could be at play here. For example, adjacent
elements of 'std::vector' are stored in adjacent regions of memory,
which is not necessarily the case with 'std::map'. This can have
noticeable impact on the efficiency of hardware memory caches, which in
turn might have noticeable impact on the overall performance.

--
Best regards,
Andrey Tarasevich

Jul 22 '05 #6

"Jim West" <eg***********@yahoo.com> wrote in message
news:sl********************@detrick.westlan...
In article <br********@dispatch.concentric.net>, Gianni Mariani wrote:
Jim West wrote:
I'm curious why I might be getting such a large performance
difference between using a map iterator and a vector iterator.


Incrementing a vector iterator is usually a single instruction (add
offset) while a map iterator is usually a memory lookup (or many memory
lookups) and hence more expensive computationally.


I understand that. However, I do six fast Fourier transforms (lengths
up to 100s) and several matrix-vector multiplies within each loop.
I can't imagine that the time needed to increment the iterator could be
significant compared to the time needed to do the floating-point
computations. And yet I'm getting a 30% decrease in execution speed
when using a map with Intel C++. Hence my concern that there is
something else fundamentally different between the map and vector
templates that is having an effect that I am not aware of (and
should be!).


Looking through the headers on my system (VC6), there is a major difference
in the way end() is implemented.

In <vector> we have:
iterator end() {return (_Last); }
where the 'iterator' type is a typedef for a pointer

In <map> we have:
iterator end() {return (_Tr.end()); }
which leads to (in xtree):
iterator end() {return (iterator(_Head)); }

which leads to:
iterator(_Nodeptr _P) : const_iterator(_P) {}
and:
const_iterator(_Nodeptr _P) : _Ptr(_P) {}

So, for vector, end() just returns a pointer but, for map, each call results
in some memory allocation and object construction.

What happens if you cache the return value of end() when using the map
(assuming you are not altering the map so as to make this dangerous)?

Tom
Jul 22 '05 #7
Thomas Wintschel wrote in news:s9SBb.44169$d35.14903@edtnps84:

Looking through the headers on my system (VC6), there is a major
difference in the way end() is implemented.

In <vector> we have:
iterator end() {return (_Last); }
where the 'iterator' type is a typedef for a pointer

In <map> we have:
iterator end() {return (_Tr.end()); }
which leads to (in xtree):
iterator end() {return (iterator(_Head)); }

which leads to:
iterator(_Nodeptr _P) : const_iterator(_P) {}
and:
const_iterator(_Nodeptr _P) : _Ptr(_P) {}

So, for vector, end() just returns a pointer but, for map, each call
results in some memory allocation and object construction.

What happens if you cache the return value of end() when using the map
(assuming you are not altering the map so as to make this dangerous)?


A std::map (and std::set and std::multi...) end() iterator is fixed.

Insert's don't invalidate iterator's, and erase only invalidates the
iterators to the elements('s) being erased, and you can't erase end().

So this shouldn't be a problem.

Rob.
--
http://www.victim-prime.dsl.pipex.com/
Jul 22 '05 #8
In article <s9SBb.44169$d35.14903@edtnps84>, Thomas Wintschel wrote:


So, for vector, end() just returns a pointer but, for map, each call results
in some memory allocation and object construction.

What happens if you cache the return value of end() when using the map
(assuming you are not altering the map so as to make this dangerous)?
That is something that would never have occured to me. I will
check it in the morning.

Following Gianni's advice, I ran the profiler (why I didn't do it in the
first place is beyond me!) and two things jumped out at me:

1) The map version had the entry: (lines broken)

4.57 106.10 9.36 33835812 0.00 0.00
std::_Tree<std::_Tmap_traits<DIMS<int>, GROUP, std::less<DIMS<int> >,
std::allocator<std::pair<DIMS<int> const, GROUP> >, false>::const_iterator::_Inc()


which would account for about half the time difference (the 9.36 s
is the actual time). I found that quite surprising.

2) The entry for operator* multiplying a float by a complex<float> grew
from 1.20 s with vector to 8.90 s with map. I have no idea why
that might occur, other than an optimizing difference. (I get exactly
the same answer, down to the last decimal, so I don't think I'm
doing any extra calculations with the map.)

I'll tweak the stopping test tomrrow after I have had some time to
sleep.

Thanks for the help, everyone.
Jul 22 '05 #9

"Rob Williscroft" <rt*@freenet.REMOVE.co.uk> wrote in message
news:Xn**********************************@195.129. 110.200...
Thomas Wintschel wrote in news:s9SBb.44169$d35.14903@edtnps84:

Looking through the headers on my system (VC6), there is a major
difference in the way end() is implemented.

In <vector> we have:
iterator end() {return (_Last); }
where the 'iterator' type is a typedef for a pointer

In <map> we have:
iterator end() {return (_Tr.end()); }
which leads to (in xtree):
iterator end() {return (iterator(_Head)); }

which leads to:
iterator(_Nodeptr _P) : const_iterator(_P) {}
and:
const_iterator(_Nodeptr _P) : _Ptr(_P) {}

So, for vector, end() just returns a pointer but, for map, each call
results in some memory allocation and object construction.

What happens if you cache the return value of end() when using the map
(assuming you are not altering the map so as to make this dangerous)?


A std::map (and std::set and std::multi...) end() iterator is fixed.

Insert's don't invalidate iterator's, and erase only invalidates the
iterators to the elements('s) being erased, and you can't erase end().

So this shouldn't be a problem.

Rob.
--
http://www.victim-prime.dsl.pipex.com/


Just tried it out on a 933 MHz Intel. Constructed an empty map and an empty
vector and called end() on each a hundred million times. The map clocked in
at 34 seconds and the vector at 9.

I guess this only really matters to those who must iterate over grains of
sand in a desert.

Tom
Jul 22 '05 #10
<snip>
Gianni's advice, I ran the profiler (why I didn't do it in the
first place is beyond me!) and two things jumped out at me:

1) The map version had the entry: (lines broken)

4.57 106.10 9.36 33835812 0.00 0.00
std::_Tree<std::_Tmap_traits<DIMS<int>, GROUP, std::less<DIMS<int> >,
std::allocator<std::pair<DIMS<int> const, GROUP> >, false>
::const_iterator::_Inc()


which would account for about half the time difference (the 9.36 s
is the actual time). I found that quite surprising.

2) The entry for operator* multiplying a float by a complex<float> grew
from 1.20 s with vector to 8.90 s with map. I have no idea why
that might occur, other than an optimizing difference. (I get exactly
the same answer, down to the last decimal, so I don't think I'm
doing any extra calculations with the map.)


As far as the second point is concerned, it is difficult to to tell why
this happens without seeing the code. But as Andrey suggested the memory
access patterns may be an issue. Cache misses are very expensive on modern
processors.

--
Peter van Merkerk
peter.van.merkerk(at)dse.nl


Jul 22 '05 #11
Thomas Wintschel wrote:

Just tried it out on a 933 MHz Intel. Constructed an empty map and an empty
vector and called end() on each a hundred million times. The map clocked in
at 34 seconds and the vector at 9.

I guess this only really matters to those who must iterate over grains of
sand in a desert.


Carefull how you interpret this.

9 = time(vector.end()) + loop overhead
34 = time(map.end()) + loop overhead

if:

9 = loop overhead

then time(map.end()) is infinitly more expensive than time(map.end()) ...

BTW - it is likely that the loop overhead is more expensive than calling
vector.end() because vector.end() is more than likely inlined. In a
good optimizing compiler, it may not call it at all if it does nothing.

However, my tests with GCC 3.3.1 show that map operations are ~3 times
slower.

Jul 22 '05 #12
In article <sl********************@detrick.westlan>, Jim West wrote:

1) The map version had the entry: (lines broken)

4.57 106.10 9.36 33835812 0.00 0.00
std::_Tree<std::_Tmap_traits<DIMS<int>, GROUP, std::less<DIMS<int> >,
std::allocator<std::pair<DIMS<int> const, GROUP> >, false>
::const_iterator::_Inc()

Well, I figured out this part. I had a direct lookup of the map
elements using a DIMS<int> as the index buried in the inner loop
that didn't do anything (left over from a developmental version of
the code). Apparently it was optimized when switching to the vector
but not when iterating over the same map the lookup refered to. Even I
know that is an extremely slow operation! When I got rid of it the time
reduced to 66 seconds, versus 60 seconds for the vector version.

I feel a bit like an idiot for this one. Sorry to waste everyone's
time.

2) The entry for operator* multiplying a float by a complex<float> grew
from 1.20 s with vector to 8.90 s with map. I have no idea why
that might occur, other than an optimizing difference. (I get exactly
the same answer, down to the last decimal, so I don't think I'm
doing any extra calculations with the map.)


Still wondering about this one, though. My best guess is that it has
to do with cache hits (as suggested) since copying the groups into
a vector puts them into contiguous memory. Although right now I must
admit it is probably more likely that I messed up somewhere else.

Thanks again for the help, everyone.
Jul 22 '05 #13
In article <sl**************************@jwest.ecen.okstate.e du>, Jim West wrote:
2) The entry for operator* multiplying a float by a complex<float> grew
from 1.20 s with vector to 8.90 s with map. I have no idea why
that might occur, other than an optimizing difference. (I get exactly
the same answer, down to the last decimal, so I don't think I'm
doing any extra calculations with the map.)


Still wondering about this one, though. My best guess is that it has
to do with cache hits (as suggested) since copying the groups into
a vector puts them into contiguous memory. Although right now I must
admit it is probably more likely that I messed up somewhere else.


If anyone is still interested, this one went away when I upgraded to
Intel C++ version 8.0 (from 7.1). Apparently it was an optimizing
problem afterall.
Jul 22 '05 #14

"Jim West" <eg***********@yahoo.com> wrote in message news:sl**************************@jwest.ecen.oksta te.edu...


I'm curious why I might be getting such a large performance
difference between using a map iterator and a vector iterator.

[snip]
Comparative performance tests "vector iterator vs. map iterator" can be seen at :
http://article.gmane.org/gmane.comp.....perfometer/24

We can see that a map iterator uses more time than a vector iterator.
=====================================
Alex Vinokur
mailto:al****@connect.to
http://mathforum.org/library/view/10978.html
news://news.gmane.org/gmane.comp.lang.c++.perfometer
=====================================

Jul 22 '05 #15

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

11
by: Richard Thompson | last post by:
I've got a memory overwrite problem, and it looks as if a vector has been moved, even though I haven't inserted or deleted any elements in it. Is this possible? In other words, are there any...
3
by: codefixer | last post by:
Hello, I am trying to understand if ITERATORS are tied to CONTAINERS. I know the difference between 5 different or 6(Trivial, on SGI). But what I fail to understand is how can I declare all 5...
13
by: zaineb | last post by:
Hi, This is a follow-up of sort of this thread:...
2
by: ma740988 | last post by:
typedef std::vector < std::complex < double > > complex_vec_type; // option1 int main() { complex_vec_type cc ( 24000 ); complex_vec_type dd ( &cc, &cc ); } versus
9
by: Christian Chrismann | last post by:
Hi, I've a runtime problem with STL vectors. Here is the simplified version of the code: template <class Tclass A { ... private: vector<T*myvector; typename vector<T*>::itarator mIt;
6
by: toton | last post by:
Hi, I am inheriting a vector<Pointclass as a PointVector non-polymorphically, to add several additional functionalities to it. I know the dangers of inheriting a container class, as it doesn't...
2
by: Thormod Johansen | last post by:
Hi, I have just read the article about iterators at http://www.oreillynet.com/pub/a/network/2005/10/18/what-is-iterator-in-c-plus-plus.html?page=last and I am wondering about what is ment by the...
12
by: desktop | last post by:
Why does insert only work when specifying an iterator plus the object to be inserted: std::vector<intt; std::vector<int>::iterator it; it = t.begin(); t.insert(it,33); If I use push_back...
13
by: smp | last post by:
Does anyone know why making a vector of pointers is so much less efficient than a vector of objects? For a simple example: int num = 20; vector<int*v_int_ptr; v_int_ptr.reserve(num); ...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.