473,382 Members | 1,387 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,382 software developers and data experts.

Which is more efficient:STL:Set or array?

Problem details below:
I have few items(simple values or objects) to be put into an array and
I implement it through a set rather than just an array of items. This
is because every time I get a new item I have to look into the Set
whether it is there or not. Depending on whether it is there or not I
have to respond differntly.
Since it is easy to search into a Set rather than a number of if
statement which will compare each of the element in the array with the
given new item I chose Set but worrying about the efficiency
comparision of them.

Thanks,
CodeCracker

Jul 23 '05 #1
25 2883
CodeCracker wrote:
Problem details below:
I have few items(simple values or objects) to be put into an array and
I implement it through a set rather than just an array of items. This
is because every time I get a new item I have to look into the Set
whether it is there or not. Depending on whether it is there or not I
have to respond differntly.
Since it is easy to search into a Set rather than a number of if
statement which will compare each of the element in the array with the
given new item I chose Set but worrying about the efficiency
comparision of them.

Thanks,
CodeCracker


Don't worry about it.

--
Anti-spam address, change each 'X' to '.' to reply directly.
Jul 23 '05 #2
Larry I Smith wrote:
Don't worry about it.


I like your answer. May I quote you?

Jul 23 '05 #3
Rapscallion wrote:
Larry I Smith wrote:
Don't worry about it.


I like your answer. May I quote you?


Sure. I get blamed for a lot... (:

Lary

--
Anti-spam address, change each 'X' to '.' to reply directly.
Jul 23 '05 #4
CodeCracker wrote:
Problem details below:
I have few items(simple values or objects) to be put into an array and
I implement it through a set rather than just an array of items. This
is because every time I get a new item I have to look into the Set
whether it is there or not. Depending on whether it is there or not I
have to respond differntly.
Since it is easy to search into a Set rather than a number of if
statement which will compare each of the element in the array with the
given new item I chose Set but worrying about the efficiency
comparision of them.

Thanks,
CodeCracker


Besides the QT manuals, you might get some good help in the
KDE newsgroup. Some smart QT developers hang out there.

comp.windows.x.kde

Larry

--
Anti-spam address, change each 'X' to '.' to reply directly.
Jul 23 '05 #5
Say in two situations :
1. I have 10 items
2. I have 50 or more items.
Still I should not worry about efficiency in either of them. I can go
ahead with either STL:set or array implementation.

Jul 23 '05 #6
CodeCracker wrote:
Problem details below:
I have few items(simple values or objects) to be put into an array
and I implement it through a set rather than just an array of items.
This is because every time I get a new item,
I have to look into the Set whether it is there or not.
Depending on whether it is there or not I have to respond differntly.
Since it is easy to search into a Set
rather than a number of if statement[s]
which will compare each of the element in the array with the given new item,
I chose Set but worrying about the efficiency comparision of them.


It sounds like the objects that you are describing *are* sets
so a set class template should be the most efficient representation.
These templates are part of the standard library for a reason.
The reason is to give the compiler developer an opportunity
to provide a highly optimized implementation.
These implementations are usually difficult to beat
with any "hand rolled" implementation.

As a C++ programmer, your job is to match the containers
required by your program with standard container class templates.
All of the classic Abstract Data Types (ADTs) are implemented
in the standard library so this shouldn't be a problem.
If you can't find a perfect match,
you should try to adapt one of the standard class templates
(by deriving an appropriate class [template] from it for example.)

Only rarely is it necessary
to design a new container class from scratch.
If you find yourself doing this,
you probably need to re-examine your requirements
and convince yourself that you haven't just made a big mistake.
Jul 23 '05 #7
I think I wasn't clear. Let me rehearse what I was telling.
I have some items( either simple values as int, float etc or objects (
some other classes of mine say FileClass).
Now every time somebody calls my function DoSomething() I do a search
inside my array to find that item and if it is found I call DoThis()
and if not found I call DoThat().
But the problem here is that if I use an array implementation I have to
write a whole lot of If statement or write a loop where it goes in
search of the current item in the array, for which it will make a whole
lot of comparison. Instead of doing all this if I use the STL:set
containers and insert my items once there and then search for my
current item every time in my set will it not be more efficient than
my array implementation.
Remember I am not creating my container from scratch and using the
existing containers only.

Jul 23 '05 #8
CodeCracker wrote:
I think I wasn't clear. Let me rehearse what I was telling.
I have some items( either simple values as int, float etc or objects (
some other classes of mine say FileClass).
Now every time somebody calls my function DoSomething() I do a search
inside my array to find that item and if it is found I call DoThis()
and if not found I call DoThat().
But the problem here is that if I use an array implementation I have to
write a whole lot of If statement or write a loop where it goes in
search of the current item in the array, for which it will make a whole
lot of comparison. Instead of doing all this if I use the STL:set
containers and insert my items once there and then search for my
current item every time in my set will it not be more efficient than
my array implementation.
Remember I am not creating my container from scratch and using the
existing containers only.


If you're going to populate your array once and then not change it
you're probably better off with a vector, because it uses less memory.

vector<whatever> vec(first, last);
sort(vec.begin(), vec.end());

when you need to find something, use

find(vec.begin(), vec.end(), X)

Even if you're going to add or remove things occasionally, this is
probably better. set wins when you add and remove things fairly often.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Jul 23 '05 #9
CodeCracker wrote:
Say in two situations :
1. I have 10 items
2. I have 50 or more items.
Still I should not worry about efficiency in either of them. I can go
ahead with either STL:set or array implementation.


yes

--
Anti-spam address, change each 'X' to '.' to reply directly.
Jul 23 '05 #10
But if while adding I want to be sure that the item in the containers
are all unique then still vector be preferred.

Jul 23 '05 #11
CodeCracker wrote:
But if while adding I want to be sure that the item in the containers
are all unique then still vector be preferred.


You could use a set while you are inserting items. Once that is finished,
you copy the set into a vector, preserving the order of the items.
Afterwards looking up items in the vector will be fast using binary search.
The costs for the copy operation are very small compared to the savings
provided you have a lot of lookups. Also, this is only applicable if no
further changes to the data structure need to be done in the lookup-phase.
Best

Kai-Uwe Bux

ps.: Do not worry about efficiency before you profile your code. Quite
possibly the real timesink is somewhere else.
Jul 23 '05 #12

CodeCracker wrote:
But if while adding I want to be sure that the item in the containers
are all unique then still vector be preferred.


Just use std::set so your code will be simplier and most probably it
will be faster than you need. And for amount of ~50 elements you would
hardly measure the difference of execution time btw std::vector and
std::set (maybe even std::list). So use std::set and if profiler will
show that there is a problem with std::set then try something else.

Regards,
Vyacheslav

Jul 23 '05 #13
CodeCracker wrote:
But if while adding I want to be sure that
the item in the containers are all unique
then still vector be preferred.


No.
You are confused.
The object, as you have described it so far, is a set.
You should implement it with a set template class.
This will be the most efficient and portable implementation.
Using any other class template will almost certainly
result in a sub optimal implementation.

Study this object carefully.
It is either a set or it isn't.
If you can discover some reason why it isn't a set,
tell us and, perhaps, we can help you find a better match.
Jul 23 '05 #14
In message <d6**********@nntp1.jpl.nasa.gov>, E. Robert Tisdale
<E.**************@jpl.nasa.gov> writes
CodeCracker wrote:
But if while adding I want to be sure that
the item in the containers are all unique
then still vector be preferred.
No.
You are confused.
The object, as you have described it so far, is a set.
You should implement it with a set template class.
This will be the most efficient and portable implementation.
Using any other class template will almost certainly
result in a sub optimal implementation.

Study this object carefully.
It is either a set or it isn't.


Wrong level of abstraction. It's not what it "is", but what it _does_,
that matters. The C++ library containers and algorithms support various
operations - insert, erase, search - with different efficiency
tradeoffs. Patterns of use also vary - do you have separate phases which
build then search, or are the operations intermixed?

What you want is the container which most efficiently supports your
pattern of use. That might very well be std::set, but it doesn't have to
be.
If you can discover some reason why it isn't a set,
tell us and, perhaps, we can help you find a better match.


Note that the STL set-oriented algorithms includes, set_union,
set_intersection, set_difference, set_symmetric_difference are designed
to work on any sorted sequence, not just a std::set.

--
Richard Herring
Jul 23 '05 #15
In message <_q********************@rcn.net>, Pete Becker
<pe********@acm.org> writes
CodeCracker wrote:
I think I wasn't clear. Let me rehearse what I was telling.
I have some items( either simple values as int, float etc or objects (
some other classes of mine say FileClass).
Now every time somebody calls my function DoSomething() I do a search
inside my array to find that item and if it is found I call DoThis()
and if not found I call DoThat().
But the problem here is that if I use an array implementation I have to
write a whole lot of If statement or write a loop where it goes in
search of the current item in the array, for which it will make a whole
lot of comparison. Instead of doing all this if I use the STL:set
containers and insert my items once there and then search for my
current item every time in my set will it not be more efficient than
my array implementation.
Remember I am not creating my container from scratch and using the
existing containers only.

If you're going to populate your array once and then not change it
you're probably better off with a vector, because it uses less memory.

vector<whatever> vec(first, last);
sort(vec.begin(), vec.end());


And maybe vec.erase(unique(vec.begin(), vec.end()), vec.end()) to remove
duplicates?
when you need to find something, use

find(vec.begin(), vec.end(), X)
Since it's sorted, binary_search (if you just want to know if X is
there) or lower_bound (if you need an iterator to it) might be better.
Even if you're going to add or remove things occasionally, this is
probably better. set wins when you add and remove things fairly often.


--
Richard Herring
Jul 23 '05 #16
E. Robert Tisdale wrote:
The object, as you have described it so far, is a set.
You should implement it with a set template class.
This will be the most efficient and portable implementation.
Using any other class template will almost certainly
result in a sub optimal implementation.


It depends on what's important. As I said earlier, a set will use more
memory than a vector of the same size. In addition, its implementation
requires more code.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Jul 23 '05 #17
Richard Herring wrote:

And maybe vec.erase(unique(vec.begin(), vec.end()), vec.end()) to remove
duplicates?


Sigh. Maybe. But designing from incomplete information is usually a
waste of time. I see no point in trying to fine-tune this example.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Jul 23 '05 #18
Richard Herring wrote:
In message <_q********************@rcn.net>, Pete Becker
<pe********@acm.org> writes

when you need to find something, use

find(vec.begin(), vec.end(), X)

Since it's sorted, binary_search (if you just want to know if X is
there) or lower_bound (if you need an iterator to it) might be better.


Good point. That was the idea, but I used the wrong algorithm.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Jul 23 '05 #19
Richard Herring wrote:
It's not what it "is", but what it _does_, that matters.


You are confused.

An object *is* what it *does*.
Jul 23 '05 #20
I conclude that I can continue with set for now with a small set of
elements in it but if I am going to have a >50 elements in the set I
must consider vector and use the unique algo while deleting the
duplicates with erase.
I took the path of set for the ease of use and did not consider any
efficiency of it. I want to delay the effciency topic until my testing
shows I need to optimize my code here.
Is my conclusion correct?

Jul 23 '05 #21
CodeCracker wrote:
I conclude that I can continue with set for now with a small set of
elements in it but if I am going to have a >50 elements in the set I
must consider vector and use the unique algo while deleting the
duplicates with erase.
I took the path of set for the ease of use and did not consider any
efficiency of it. I want to delay the effciency topic until my testing
shows I need to optimize my code here.
Is my conclusion correct?


Generally speaking, yes. On the other hand, if you're writing CAD
applications, every byte is important, and profiling is irrelevant:
always make things small. <g>

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Jul 23 '05 #22
sam
Pete Becker wrote:
CodeCracker wrote:
I think I wasn't clear. Let me rehearse what I was telling.
I have some items( either simple values as int, float etc or objects (
some other classes of mine say FileClass).
Now every time somebody calls my function DoSomething() I do a search
inside my array to find that item and if it is found I call DoThis()
and if not found I call DoThat().
But the problem here is that if I use an array implementation I have to
write a whole lot of If statement or write a loop where it goes in
search of the current item in the array, for which it will make a whole
lot of comparison. Instead of doing all this if I use the STL:set
containers and insert my items once there and then search for my
current item every time in my set will it not be more efficient than
my array implementation.
Remember I am not creating my container from scratch and using the
existing containers only.

If you're going to populate your array once and then not change it
you're probably better off with a vector, because it uses less memory.

vector<whatever> vec(first, last);
sort(vec.begin(), vec.end());

when you need to find something, use

find(vec.begin(), vec.end(), X)

It is sure this using find is very elegant and efficent. But I've been
feeling pluzzl about how to create X for find. Is there any complex
example for defining X? eg. the comparison is not a primitive type like
int, float, etc...

Sam.
Even if you're going to add or remove things occasionally, this is
probably better. set wins when you add and remove things fairly often.

Jul 23 '05 #23
sam
Pete Becker wrote:
Richard Herring wrote:

And maybe vec.erase(unique(vec.begin(), vec.end()), vec.end()) to
remove duplicates?

how about vec.clear()?

Sam

Sigh. Maybe. But designing from incomplete information is usually a
waste of time. I see no point in trying to fine-tune this example.

Jul 23 '05 #24
sam wrote:

It is sure this using find is very elegant and efficent. But I've been
feeling pluzzl about how to create X for find. Is there any complex
example for defining X? eg. the comparison is not a primitive type like
int, float, etc...


The three-argument version of find (and of binary_search which, as was
pointed out, is the right one to use <g>) takes an object of the type
that the iterator points to. The four-argument version takes an object
of an arbitrary type and a predicate that compares the object that the
iterator points to with the passed object.

struct data
{
int value;
// other stuff
};

struct predicate
{
bool operator()(const data& obj, int val)
{ return obj.value == val; }

binary_search(vec.begin(), vec.end(), 3, predicate());

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Jul 23 '05 #25
In message <d6**********@nntp1.jpl.nasa.gov>, E. Robert Tisdale
<E.**************@jpl.nasa.gov> writes
Richard Herring wrote:
It's not what it "is", but what it _does_, that matters.
You are confused.


Not in the least. I consider whether a class matches my requirements,
not what its name happens to be.
An object *is* what it *does*.
I see you're getting the point. What it does, not what it's named. "is"
was in quotes above for a reason, referring to someone who said:
It is either a set or it isn't.


when they probably meant something more like

It is either being used for its set-like behaviour or it isn't.

--
Richard Herring
Jul 23 '05 #26

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
by: vk02720 | last post by:
How does the STL "set" class ensure uniqueness ? Does the contained type need to implement a certain method or override a certain operator ( != or < ) ? TIA
5
by: A. Prock | last post by:
Hello, I am using the std::set from the standard template library, and it seems that it is broken. I've tried both the MS version as well as the g++ version and I get the same bug in both...
12
by: CQ | last post by:
Hi there, I am having the following problem: I have the following class: class reachGraphState { protected: /* ... some stuff .... */ public:
20
by: Martin Jørgensen | last post by:
Hi, I'm reading a number of double values from a file. It's a 2D-array: 1 2 3 4 5 6 7 ------------- 1 3.2 2 0 2.1 3 9.3 4
1
by: creativeinspiration | last post by:
Hey Everybody. Does anybody know which operators need to be implemented in order to make the stl set of those objects works? For example I have a class myclass, and I want to have set<myclass>...
2
by: Stefan Istrate | last post by:
How can I find the k-th position in a STL set? I want to do this in O(logN) time, where N is the dimension of the set. Thank you!
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
by: ryjfgjl | last post by:
In our work, we often need to import Excel data into databases (such as MySQL, SQL Server, Oracle) for data analysis and processing. Usually, we use database tools like Navicat or the Excel import...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.