473,769 Members | 2,382 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

STL: map vs vector

Hi,

This is a follow-up of sort of this thread:
http://groups.google.com/group/comp....2cab752779f1fa

I've been chewing on this problem for a while. When I came across the
above thread, I figured that maybe other members have a better idea on
how to approach this.

I'm currently trying to decide between which one of maps or vectors to
use. I'm trying to come up with a design that will let me perform fast
insertions and deletions (w/o duplicates) in a sorted "list" and then
look up every 25th or so element (and then iterate sequentially for the
next 25 values , so sequential look up). The indexed lookup happens
slightly more frequently than insertions and deletions, so it makes
sense to make lookup efficient. Because the insertions and deletions
also happen often, fast/efficient lookups are helpful as well. Memory
isnt as much of an issue as speed is.

I've looked around a bit, so if this is something that has been
discussed before, please let me know.

Right now I have 2 "solutions" .

One is a sorted vector, where I use 'lower_bound' to get where I should
insert the value and insert it there, and use it to find the value to
remove. Lookups are just normal lookups (thru
my_vector.begin ()+indexSubscri pt ). Most inserts and removals are from
the center of the list, so whenever there is an insertion or removal,
I'm guessing that all the elements are copied up or down. The values
are user defined objects so thats a lot of copying calls.

The other design uses a map and a vector. The map contains the actual
sorted values. The vector is an index to every 25th key which is reset
at every insertion and removal (from the closest previous key in the
map onwards). To access the 25th*x value, you would get the
corresponding key in the index and find it in the map, and then iterate
sequentially.

After implementing both of these, the performance difference is really
minimal for me, but I'm not sure if I'm missing a better way of doing
this. I'm guessing this is something that people have come across
often. I'm a new C++/STL user, so I might be missing a good data
structure to use or a better solution to this problem. Any help would
be appreciated.

Thanks much.
-Z

May 18 '06 #1
13 20152
I'd suggest a map is generally a better idea as you intend often doing
insertion and deletion operations. My justification for this is that a
vector is generally implemented as an allocated array and a map is
generally implemented as a Red Black Tree. If insertion and deletions
are often required then the Red Black Tree will be faster without a
significant lose of speed for searching or spanning. Whereas insertion
and deletions in an array are significantly slower.

JB
May 18 '06 #2
I might be missing the obvious, but since you are considering between
map and a *vector*, it looks like std::set might do the job better than
map. The same fundamental problem would still persist, though.

The indexing is a problem, what you could do is that you have indexing
vector with only pointers-to-objects-in-set stored. You could try lazy
evaluation for the indexing "service": sort the indexing vector using
the object's address as predicate when first indexing query comes to
your higher level container (wrapper).

You got a nice problem there: sorting, fast insertion and removal -and-
linear indexing into the container. The indexing absolutely suggests
that you want vector-like storage. This is a big no if the objects you
are storing are expensive to copy. Pointers are cheaper to copy so that
will reduce the overhead. That's the rationalization I have for the
suggestion.

What works best in situation where ideal solution is not obvious
depends to a degree on the usage pattern. It's not obvious to me from
your post (I didn't follow the link to previous thread).

Important: the lazy evaluation is highly important here, you *want* to
have a flag indicating when the pointer vector is not sorted. Ofcourse,
still, doing "random" removals from a vector will cost a bundle. It's
really ugly.. if possible I'd try to adjust the client code to not
require indexing into the container (most probably highly inpractical
but just a thought..) -- note -- removal won't invalidate the order,
only insertion. I'd postpone the sorting until a query is made, then
you can use trivial std::sort

May 19 '06 #3
Hi,

I figured there was no real difference between a set and a map, so at
least I'll be able to map to keys in map, which might be easier for me.
I was a little confused about pointers to objects. If I add a new
object or remove an object from the set/map .. is the pointer
invalidated?

One thing I can say about usage is that reading is periodic. So say,
every 30 ms, you read chunks of 25 objects from the "list". You only
read things in chunks so you read the 1st 25 elements .. and then
somewhere else you might read 75-100th elements.

Since querying is performed so often, I'm not sure how helpful lazy
evaluation will be though, or maybe I'm confused about this.

Thanks!
Mehwish

persenaama wrote:
I might be missing the obvious, but since you are considering between
map and a *vector*, it looks like std::set might do the job better than
map. The same fundamental problem would still persist, though.

The indexing is a problem, what you could do is that you have indexing
vector with only pointers-to-objects-in-set stored. You could try lazy
evaluation for the indexing "service": sort the indexing vector using
the object's address as predicate when first indexing query comes to
your higher level container (wrapper).

You got a nice problem there: sorting, fast insertion and removal -and-
linear indexing into the container. The indexing absolutely suggests
that you want vector-like storage. This is a big no if the objects you
are storing are expensive to copy. Pointers are cheaper to copy so that
will reduce the overhead. That's the rationalization I have for the
suggestion.

What works best in situation where ideal solution is not obvious
depends to a degree on the usage pattern. It's not obvious to me from
your post (I didn't follow the link to previous thread).

Important: the lazy evaluation is highly important here, you *want* to
have a flag indicating when the pointer vector is not sorted. Ofcourse,
still, doing "random" removals from a vector will cost a bundle. It's
really ugly.. if possible I'd try to adjust the client code to not
require indexing into the container (most probably highly inpractical
but just a thought..) -- note -- removal won't invalidate the order,
only insertion. I'd postpone the sorting until a query is made, then
you can use trivial std::sort


May 19 '06 #4
Hi,

I figured there was no real difference between a set and a map, so at
least I'll be able to map to keys in map, which might be easier for me.
I was a little confused about pointers to objects. If I add a new
object or remove an object from the set/map .. is the pointer
invalidated?

One thing I can say about usage is that reading is periodic. So say,
every 30 ms, you read chunks of 25 objects from the "list". You only
read things in chunks so you read the 1st 25 elements .. and then
somewhere else you might read 75-100th elements.

Since querying is performed so often, I'm not sure how helpful lazy
evaluation will be though, or maybe I'm confused about this.

Thanks!

May 19 '06 #5
za****@gmail.co m wrote:
I'm currently trying to decide between which one of maps or vectors to
use. I'm trying to come up with a design that will let me perform fast
insertions and deletions (w/o duplicates) in a sorted "list" and then
look up every 25th or so element (and then iterate sequentially for the
next 25 values , so sequential look up). The indexed lookup happens
slightly more frequently than insertions and deletions, so it makes
sense to make lookup efficient. Because the insertions and deletions
also happen often, fast/efficient lookups are helpful as well. Memory
isnt as much of an issue as speed is.
Do you iterate across all (or most) of the collection when you do the
25-at-a-time thing, without modifying the container (hence invalidating
iterators and such) in the middle of said iteration? That may be a
significant consideration. Note that map (and set) do *not* invalidate
iterators on insert and delete (except for iterators pointing to a
deleted element).
One is a sorted vector, where I use 'lower_bound' to get where I should
insert the value and insert it there, and use it to find the value to
remove. Lookups are just normal lookups (thru
my_vector.begin ()+indexSubscri pt ). Most inserts and removals are from
the center of the list, so whenever there is an insertion or removal,
I'm guessing that all the elements are copied up or down. The values
are user defined objects so thats a lot of copying calls.
Yeah, that's not a very good idea. You'll have to talk all day if you
want to try and convince me you've got a problem where the best
solution is to use a vector and insert into the middle of it over and
over.
The other design uses a map and a vector. The map contains the actual
sorted values. The vector is an index to every 25th key which is reset
at every insertion and removal (from the closest previous key in the
map onwards). To access the 25th*x value, you would get the
corresponding key in the index and find it in the map, and then iterate
sequentially.
....guh? Do you mean that you've got a vector<map<T>:: iterator>? So
basically, you're pessimizing by caching a whole lot of iterators you
usually won't need? If that's not what you mean, please show some code
to be more clear. Or use a simpler scheme...
After implementing both of these, the performance difference is really
minimal for me, but I'm not sure if I'm missing a better way of doing
this. I'm guessing this is something that people have come across
often. I'm a new C++/STL user, so I might be missing a good data
structure to use or a better solution to this problem. Any help would
be appreciated.


What's wrong with just using std::set, and simply doing (iter += 25)?
You get logarithmic insert and delete, a constant-time operation to
jump ahead 25 elements, few iterator invalidations, and the sorting is
done for you. The only thing you don't get is constant-time random
access, but log time should be plenty good, I'd think. Implement it
and see how the perf numbers come out.

If that doesn't suit your needs, std::list is another reasonable
option. Linear random access, but constant-time insert/delete, and
again very few iterator invalidations to worry about. I recommend
getting O'Reilly's _STL Pocket Guide_ and paging through it when you
have trouble choosing the right STL template.

Luke

May 19 '06 #6
Luke Meyers wrote:
Do you iterate across all (or most) of the collection when you do the
25-at-a-time thing, without modifying the container (hence invalidating
iterators and such) in the middle of said iteration? That may be a
significant consideration. Note that map (and set) do *not* invalidate
iterators on insert and delete (except for iterators pointing to a
deleted element).
I iterate through the collection after the position where the element
has been inserted or removed. So I'm basically resetting the index
(vector) after the position where the element was inserted. I'm not
storing iterators but keys (for the map), so I do a find in the map for
that key (but its logarithmic which isnt bad).
Yeah, that's not a very good idea. You'll have to talk all day if you
want to try and convince me you've got a problem where the best
solution is to use a vector and insert into the middle of it over and
over.
:)
...guh? Do you mean that you've got a vector<map<T>:: iterator>? So
basically, you're pessimizing by caching a whole lot of iterators you
usually won't need? If that's not what you mean, please show some code
to be more clear. Or use a simpler scheme...
No I have a vector<key> for a map<key, value>. Also, I actually do use
pretty much each one of the index since they're "played out"
elsewhere.
What's wrong with just using std::set, and simply doing (iter += 25)?
You get logarithmic insert and delete, a constant-time operation to
jump ahead 25 elements, few iterator invalidations, and the sorting is
done for you. The only thing you don't get is constant-time random
access, but log time should be plenty good, I'd think. Implement it
and see how the perf numbers come out.


I wasn't aware I could do a iter+=25 on a set iterator. Are you sure we
can? I'm currently using Nicholai M. Josuttis _C++ Standard Library_
and I wasn't aware that sets have random access iterators (I thought it
was bidirectional only). If I can do an efficient +=25, then that would
be great!!

Also, I tested the +=n with maps .. it didnt work, Im guessing sets are
the same inside .. so if theres a way, lemme know!!

Also I tried using it with sets, are you sure this is defined.

Sorry Im a relative newbie, so lemme know.

Thanks!

May 19 '06 #7
za****@gmail.co m wrote:
Luke Meyers wrote:
Do you iterate across all (or most) of the collection when you do the
25-at-a-time thing, without modifying the container (hence invalidating
iterators and such) in the middle of said iteration? That may be a
significant consideration. Note that map (and set) do *not* invalidate
iterators on insert and delete (except for iterators pointing to a
deleted element).
I iterate through the collection after the position where the element
has been inserted or removed. So I'm basically resetting the index
(vector) after the position where the element was inserted. I'm not
storing iterators but keys (for the map), so I do a find in the map for
that key (but its logarithmic which isnt bad).


Assuming your insertions occur with similar frequency at any point in
the list (as ISTR was stated or implied), this is no different in terms
of asymptotic complexity from resetting all the indices.
...guh? Do you mean that you've got a vector<map<T>:: iterator>? So
basically, you're pessimizing by caching a whole lot of iterators you
usually won't need? If that's not what you mean, please show some code
to be more clear. Or use a simpler scheme...


No I have a vector<key> for a map<key, value>. Also, I actually do use
pretty much each one of the index since they're "played out"
elsewhere.


I think that, under this scheme, it's better to use a vector of
iterators, since they permit constant-time element access rather than
log-time access. Sounds like you're re-caching them in all cases where
iterators would be invalidated (and then some) anyway. Of course, if
your constraint about the 25 thing was loose, you could save on a lot
of this reevaluation (which is of linear complexity each time you do
it).

But regardless, what does this buy you? In order to populate the
vector, don't you have to do all the work ahead of time that you're
supposedly saving by caching those keys/iterators in the vector?
Measure caching vs. non-cachingg and see if your "optimizati on" really
bought you anything.
I wasn't aware I could do a iter+=25 on a set iterator. Are you sure we
can? I'm currently using Nicholai M. Josuttis _C++ Standard Library_
and I wasn't aware that sets have random access iterators (I thought it
was bidirectional only). If I can do an efficient +=25, then that would
be great!!

Also, I tested the +=n with maps .. it didnt work, Im guessing sets are
the same inside .. so if theres a way, lemme know!!

Also I tried using it with sets, are you sure this is defined.


Sorry, I didn't try this myself prior to posting. Whether += is
defined is (perhaps surprisingly at first) kind of irrelevant. You've
got a forward-iterator, which means you've got operator++, so you can
always implement operator+= in terms of that. If the standard library
doesn't provide this, it's probably to avoid implying that the
operation is more efficient than it really is.

Here's a working implementation:

namespace std {
template <class FwIter>
FwIter &
operator+=(FwIt er & iter, size_t n)
{
for (int i = 0; i < n; ++i, ++iter); // no body
return iter;
}
}

Luke

May 22 '06 #8
The iter += 25 will in this case do ++iter 25 times most likely. Btw.
if you want to iterate after the object you did just insert, you can
use the iterator to that object and start from there (insert returns
iterator!).

May 22 '06 #9
Luke Meyers wrote:
za****@gmail.co m wrote:
If I can do an efficient +=25, then that would be great!!


Forgot to mention in my other post, but while the implementation I gave
is of course linear w.r.t. 25, 25 is a constant (and a small one), so
technically the operation is constant time. I assume you're clever
enough not to let this fool you into a false sense of efficiency,
though -- but the problem is a result of the data structure you chose,
not this operation.

Luke

May 22 '06 #10

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

30
2744
by: Przemo Drochomirecki | last post by:
Hi, The task was indeed simple. Read 2.000.000 words (average length = 9), sort them and write it to new file. I've made this in STL, and it was almost 17 times slower than my previous "clear-C-code". I used <vector>, <algorithm>, <iostream> and <algorithm>. Is STL really so slow? Thx in adv. Przemo
9
1889
by: Aguilar, James | last post by:
Hey guys. A new question: I want to use an STL libarary to hold a bunch of objects I create. Actually, it will hold references to the objects, but that's beside the point, for the most part. Here's the question: I want to be able to change the references (including deleting them). Is there any way to do that besides using pointers rather than references for the STL library? I'd also prefer to avoid using const_cast, if it is indeed...
35
2653
by: Jon Slaughter | last post by:
I'm having a problem allocating some elements of a vector then deleting them. Basicaly I have something like this: class base { private: std::vector<object> V;
11
4754
by: ma740988 | last post by:
I'm perusing a slide with roughly 12 bullets spread across 3 pages. Each bullet reflects 'advice'. I'm ok with all but 1 bullet, more specifically the bullet that states: " Avoid the STL unless absolutely necessary " Now, I'm not acclimated with much C++ history, but something tells me this is akin to trepidation that surrounded C++ during it's inception? IOW, is this 'old school' rhetoric or ..... How do you refute this argument?
7
1991
by: rodrigostrauss | last post by:
I'm using Visual Studio 2005 Professional, and I didn't find the STL.NET. This code: #include "stdafx.h" #include <vector> using namespace System; using namespace std; int main(array<System::String ^> ^args)
8
3951
by: Gregory | last post by:
I have a question about using STL containers in C++ class public interface. Lets say that I want to return some container from class method or accept class method parameter as some container. For example: class A { public: const vector<int>& getTable() { return m_table; }
9
4620
by: Christian Chrismann | last post by:
Hi, I've a runtime problem with STL vectors. Here is the simplified version of the code: template <class Tclass A { ... private: vector<T*myvector; typename vector<T*>::itarator mIt;
1
3074
by: krunalbauskar | last post by:
Hi, Explicit instantiation of STL vector demands explicit instantiation of all the templates it using internally. For example - <snippet> #include <iostream> #include <vector>
2
2761
by: Builder | last post by:
How can I import the following COM interface to C#? DECLARE_INTERFACE_(IVertices, IUnknown) { STDMETHOD(get_vertices) (THIS_ vector<POINT>& vertices) PURE; STDMETHOD(set_vertices) (THIS_ vector<POINTvertices) PURE; }; Is it possible to import STL containers to C# at all?
7
3130
by: ademirzanetti | last post by:
Hi there !!! I would like to listen your opinions about inherit from a STL class like list. For example, do you think it is a good approach if I inherit from list to create something like "myList" as in the example below ? #include "Sector.h" using namespace boost;
0
9586
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
9423
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
9861
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8869
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
7406
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6672
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
5298
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
1
3956
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
2
3561
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.