STL container question

Hi

I need to store a number of integer values which I will then search on
later to see if they exist in my container. Can someone tell me which
container would be quickest for finding these values? I can't use a
plain C array (unless I make it 2^32 in size!) since I don't know the
max integer value.

Thanks for any help

B2003
Oct 1 '08
James Kanze wrote:
> On Oct 1, 7:31 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>> On 2008-10-01 18:57, Rolf Magnus wrote:
>>> Jeff Schwab wrote:
>>>> Boltar wrote:
>>>>> I need to store a number of integer values which I will
>>>>> then search on later to see if they exist in my container.
>>>>> Can someone tell me which container would be quickest for
>>>>> finding these values? I can't use a plain C array (unless
>>>>> I make it 2^32 in size!) since I don't know the max
>>>>> integer value.
>>>> Sorted vector. See Effective STL, Item 23.
>>>> For the record, you wouldn't need 2^32 integers, just 2^32 bits
>>>> = 512 MiB. It's actually not that much RAM, depending on
>>>> your target system, and would let you check for integers
>>>> with O(1) complexity (rather than O(log N)).
>>> However, it can still be slower, since it's more or less the
>>> worst thing you can do to the cache.
>> Still, you should only get one cache miss when looking for a
>> value; if you use a set or vector you will probably get more.
>
> One thing I don't understand here: both a C style array and
> std::vector use a single block of contiguous memory. How could
> cache performance be any different for them?

We're not talking about C style arrays vs. vector, but about different
algorithms. One is storing the values in a sorted array/vector, then
doing a binary search for the value. The other is like a trivial kind of
hash table, where each possible integer value has a boolean entry, so you
can simply use the value as an index. That saves the search, but needs an
amount of memory that is much larger than the cache.
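
A minimal sketch of the two lookup strategies being compared here, assuming unsigned values. `SortedLookup` and `BitTable` are hypothetical names used only for illustration; they do not come from any post in this thread. The bit table takes its universe size as a parameter so it can be tried on a small range; covering every 32-bit value would need 2^32 bits (512 MiB).

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Strategy 1: keep the values sorted and binary-search them.
// Memory is proportional to the number of stored values; lookup is O(log N).
class SortedLookup {
public:
    explicit SortedLookup(const std::vector<unsigned>& values) : values_(values) {
        std::sort(values_.begin(), values_.end());
    }
    bool contains(unsigned x) const {
        return std::binary_search(values_.begin(), values_.end(), x);
    }
private:
    std::vector<unsigned> values_;
};

// Strategy 2: one bit per possible value, indexed directly by the value.
// Memory is proportional to the size of the universe; lookup is O(1),
// but each lookup may touch a different cache line.
class BitTable {
public:
    explicit BitTable(std::size_t universe) : bits_((universe + 7) / 8, 0) {}
    void insert(unsigned x) {
        bits_[x / 8] = static_cast<unsigned char>(bits_[x / 8] | (1u << (x % 8)));
    }
    bool contains(unsigned x) const {
        return ((bits_[x / 8] >> (x % 8)) & 1u) != 0;
    }
private:
    std::vector<unsigned char> bits_;
};
```

Both classes answer the same membership question; which one is faster in practice depends on how many values are stored and how the lookups are distributed, so it is worth measuring on the target system.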
Oct 1 '08 #21
Rolf Magnus wrote:
> James Kanze wrote:
>> On Oct 1, 7:31 pm, Erik Wikström <Erik-wikst...@telia.com> wrote:
>>> On 2008-10-01 18:57, Rolf Magnus wrote:
>>>> Jeff Schwab wrote:
>>>>> Boltar wrote:
>>>>>> I need to store a number of integer values which I will
>>>>>> then search on later to see if they exist in my container.
>>>>>> Can someone tell me which container would be quickest for
>>>>>> finding these values? I can't use a plain C array (unless
>>>>>> I make it 2^32 in size!) since I don't know the max
>>>>>> integer value.
>>>>> Sorted vector. See Effective STL, Item 23.
>>>>> For the record, you wouldn't need 2^32 integers, just 2^32 bits
>>>>> = 512 MiB. It's actually not that much RAM, depending on
>>>>> your target system, and would let you check for integers
>>>>> with O(1) complexity (rather than O(log N)).
>>>> However, it can still be slower, since it's more or less the
>>>> worst thing you can do to the cache.
>>> Still, you should only get one cache miss when looking for a
>>> value; if you use a set or vector you will probably get more.
>> One thing I don't understand here: both a C style array and
>> std::vector use a single block of contiguous memory. How could
>> cache performance be any different for them?
>
> We're not talking about C style arrays vs. vector, but about different
> algorithms. One is storing the values in a sorted array/vector, then
> doing a binary search for the value. The other is like a trivial kind of
> hash table, where each possible integer value has a boolean entry, so you
> can simply use the value as an index. That saves the search, but needs an
> amount of memory that is much larger than the cache.

Right, but the whole thing doesn't have to be in cache, just the parts
that are used. The rest may as well be swapped out of memory
altogether. Of course, that's the point of cache. :)
Oct 2 '08 #22
Juha Nieminen wrote:
> Ioannis Vranos wrote:
>> Lists are implemented using pointers to point to the previous and to the
>> next elements, so list::sort() is more efficient by changing pointer
>> values, while sorting a vector involves copying objects.
>
> The original poster talked about storing integer values. I highly
> doubt sorting a list of integers will be faster than sorting an array of
> integers. In fact, I'm pretty sure of the contrary.

We must think generally. In general, sorting a list is faster than
sorting a vector, because the list sorting does not involve the
construction or destruction of any object.

Regarding ints, I think sorting a vector of ints and a list of ints
both have about the same efficiency.
If the programmer decides to replace ints with other objects, he will
not have to change much in the code if he uses a list.
Oct 2 '08 #23
Pete Becker wrote:
> In other words, no. <g> One could also point out that quicksort can be
> used to sort a vector but can't be used on a list, so obviously sorting
> a vector is faster.

list::sort is faster than any sort on vector, because it does not
involve the construction or destruction of any object.
Oct 2 '08 #24
James Kanze wrote:
> To other considerations: you can't use binary_search on
> std::list---with any sort of size, that's going to make a
> significant difference in favor of vector.

Of course you can:

#include <algorithm>
#include <list>
#include <cstddef>
#include <iostream>

int main()
{
    using namespace std;

    typedef list<int> intlist;

    intlist ilist;

    for (size_t i = 0; i < 100; ++i)
        ilist.push_back(100 - i);

    ilist.sort();

    for (intlist::const_iterator p = ilist.begin(); p != ilist.end(); ++p)
        cout << *p << endl;

    // binary_search only requires forward iterators, so this compiles for a list.
    binary_search(ilist.begin(), ilist.end(), 50);
}
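
Both sides of this exchange have a point, and a small sketch may make it concrete. `std::binary_search` accepts forward iterators, so it works on a `std::list`, and the number of comparisons stays logarithmic. What changes is the iterator movement: without random access, stepping to each midpoint is linear in the distance. The sketch below counts only the comparisons; `CountingLess` and `comparisons_for_search` are hypothetical names introduced for this illustration.

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Comparator that records how many times it is invoked.
struct CountingLess {
    std::size_t* calls;
    bool operator()(int a, int b) const {
        ++*calls;
        return a < b;
    }
};

// Runs std::binary_search on an already sorted container and returns
// the number of element comparisons that were performed.
template <class Container>
std::size_t comparisons_for_search(const Container& sorted, int value) {
    std::size_t calls = 0;
    CountingLess less = { &calls };
    std::binary_search(sorted.begin(), sorted.end(), value, less);
    return calls;
}
```

Calling `comparisons_for_search` on a sorted `std::list<int>` and a sorted `std::vector<int>` of the same size should report a similarly small count for both. Any difference in run time then comes from walking the list nodes, which this counter does not capture.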
Oct 2 '08 #25
Ioannis Vranos wrote:
> [...]
> We must think generally. In general, sorting a list is faster than
> sorting a vector, because the list sorting does not involve the
> construction or destruction of any object.
>
> Regarding ints, I think sorting a vector of ints and a list of ints
> both have about the same efficiency.

Why don't you just measure before you doubt the statements
of those who already went and did this?

On my platform, this

#include <iostream>
#include <vector>
#include <list>
#include <algorithm>

const unsigned int kLimit = 1000000;

template< class Cont >
void fill(Cont& cont)
{
    for( unsigned int u = 0; u < kLimit; ++u ) {
        cont.push_back(u);
    }
}

template< class Cont >
void test(Cont& cont);

int main()
{
    std::vector<unsigned int> v; v.reserve(kLimit);
    std::list<unsigned int> l;

    std::cout << "filling a vector..." << std::endl;
    fill(v);
    std::cout << "filling a list..." << std::endl;
    fill(l);
    std::cout << "...done.\n";

    std::cout << "sorting a vector..." << std::endl;
    test(v);
    std::cout << "sorting a list..." << std::endl;
    test(l);

    return 0;
}

template< typename T, class Al >
inline void sort(std::vector<T,Al>& v) {std::sort(v.begin(),v.end());}

template< typename T, class Al >
inline void sort(std::list<T,Al>& l) {l.sort();}

#include <windows.h> // for GetTickCount()

template< class Cont >
void test(Cont& cont) {
    const DWORD start = GetTickCount();
    sort(cont);
    std::cout << "...took " << GetTickCount()-start << "msecs." << std::endl;
}

outputs
filling a vector...
filling a list...
...done.
sorting a vector...
...took 47msecs.
sorting a list...
...took 1562msecs.
and thus agrees with everyone who disagreed with you.
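
For readers who are not on Windows, the same measurement can be sketched portably. This is not the code that produced the numbers above; it assumes a C++11 compiler and uses `std::chrono::steady_clock` in place of `GetTickCount()`. The names `sort_container` and `time_sort_ms` are hypothetical.

```cpp
#include <algorithm>
#include <chrono>
#include <list>
#include <vector>

// Overloads so that one timing routine can sort either container type.
template <class T>
void sort_container(std::vector<T>& v) { std::sort(v.begin(), v.end()); }

template <class T>
void sort_container(std::list<T>& l) { l.sort(); }

// Sorts `cont` in place and returns the elapsed wall-clock time in milliseconds.
template <class Container>
double time_sort_ms(Container& cont) {
    const std::chrono::steady_clock::time_point start =
        std::chrono::steady_clock::now();
    sort_container(cont);
    const std::chrono::steady_clock::time_point stop =
        std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Filling a `std::vector<unsigned int>` and a `std::list<unsigned int>` with the same values and passing each to `time_sort_ms` gives figures that can be compared directly. The absolute numbers will differ by compiler, optimization level, and machine.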
> If the programmer decides to replace ints with other objects, he will
> not have to change much in the code, if he uses a list.

Right. It wouldn't do any good anyway, as this

#include <iostream>
#include <vector>
#include <list>
#include <algorithm>
#include <string>

const unsigned int kLimit = 1000000;

// Test is defined before fill() so that the name is visible where it is used.
class Test {
public:
    Test() : instance_(++id_), str_("test it!") {}
    // const, so that it can be called on const objects during sorting
    bool operator<(const Test& rhs) const {return instance_>rhs.instance_;}
private:
    unsigned int instance_;
    static unsigned int id_;
    std::string str_;
};

unsigned int Test::id_ = 0;

template< class Cont >
void fill(Cont& cont)
{
    for( unsigned int u = 0; u < kLimit; ++u ) {
        cont.push_back(Test());
    }
}

template< class Cont >
void test(Cont& cont);

int main()
{
    std::vector<Test> v; v.reserve(kLimit);
    std::list<Test> l;

    std::cout << "filling a vector..." << std::endl;
    fill(v);
    std::cout << "filling a list..." << std::endl;
    fill(l);
    std::cout << "...done.\n";

    std::cout << "sorting a vector..." << std::endl;
    test(v);
    std::cout << "sorting a list..." << std::endl;
    test(l);

    return 0;
}

template< typename T, class Al >
inline void sort(std::vector<T,Al>& v) {std::sort(v.begin(),v.end());}

template< typename T, class Al >
inline void sort(std::list<T,Al>& l) {l.sort();}

#include <windows.h> // for GetTickCount()

template< class Cont >
void test(Cont& cont) {
    const DWORD start = GetTickCount();
    sort(cont);
    std::cout << "...took " << GetTickCount()-start << "msecs." << std::endl;
}

outputs

filling a vector...
filling a list...
...done.
sorting a vector...
...took 829msecs.
sorting a list...
...took 2437msecs.

and thus again disagrees with you.

Eagerly awaiting your counter example,

Scho-you-can-take-your-foot-out-of-your-mouth-now-bi
Oct 2 '08 #26
On Oct 1, 6:40 pm, Pete Becker <p...@versatilecoding.com> wrote:
> On 2008-10-01 10:23:13 -0400, Ioannis Vranos
> <ivra...@no.spam.nospamfreemail.gr> said:
>> Victor Bazarov wrote:
>>> Ioannis Vranos wrote:
>>>> Victor Bazarov wrote:
>>>>> Store first, then sort, then search (using 'std::binary_search'), you
>>>>> could just use 'std::vector'.
>>>> For that case, I think std::list is a better option, since the sorting
>>>> will be faster,
>>> Do you have any proof of that?
>> Lists are implemented using pointers to point to the previous and to
>> the next elements, so list::sort() is more efficient by changing
>> pointer values, while sorting a vector involves copying objects.
>
> In other words, no. <g> One could also point out that quicksort can be
> used to sort a vector but can't be used on a list, so obviously sorting
> a vector is faster.

Not always so.

You are correct that as lists do not provide random-access iterators
quicksort can not be used. Instead, lists are normally sorted using
merge sort. Merge sort has better worst-case performance than
quicksort. The drawback of merge sort is that it normally requires
extra memory, whereas quicksort does not.
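
As an illustration of the merge sort mentioned above, here is a minimal top-down sketch for `std::list`. It is not how any particular standard library implements `list::sort()`; `merge_sort_list` is a hypothetical name. It relies on `std::list::splice` and `std::list::merge`, which relink nodes rather than copying elements, so for a linked list the merge step needs no extra element buffer.

```cpp
#include <iterator>
#include <list>

// Sorts `values` in ascending order by splitting the list in half,
// sorting each half recursively, and merging the two sorted halves.
template <class T>
void merge_sort_list(std::list<T>& values) {
    if (values.size() < 2) {
        return;  // zero or one element is already sorted
    }

    // Find the middle node; this walk is linear because list
    // iterators are bidirectional, not random access.
    typename std::list<T>::iterator middle = values.begin();
    std::advance(middle,
                 static_cast<typename std::list<T>::difference_type>(values.size() / 2));

    // Move the second half into its own list by relinking nodes.
    std::list<T> right;
    right.splice(right.begin(), values, middle, values.end());

    merge_sort_list(values);
    merge_sort_list(right);

    // Merge the sorted halves; `right` is left empty afterwards.
    values.merge(right);
}
```

No element is copy-constructed or destroyed during the sort, which is the property the list advocates in this thread are pointing at. The measurements posted earlier suggest that this alone does not make the list version faster overall.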

http://en.wikipedia.org/wiki/Merge_sort
<quote>
In the worst case, merge sort does about 39% fewer comparisons than
quicksort does in the average case; merge sort always makes fewer
comparisons than quicksort, except in extremely rare cases, when they
tie, where merge sort's worst case is found simultaneously with
quicksort's best case. In terms of moves, merge sort's worst case
complexity is O(n log n)—the same complexity as quicksort's best case,
and merge sort's best case takes about half as many iterations as the
worst case.
</quote>

--
Max
Oct 2 '08 #27
Hendrik Schober wrote:
> Ioannis Vranos wrote:
>> [...]
>> We must think generally. In general, sorting a list is faster than
>> sorting a vector, because the list sorting does not involve the
>> construction or destruction of any object.
>>
>> Regarding ints, I think sorting a vector of ints and a list of ints
>> both have about the same efficiency.
>
> Why don't you just measure before you doubt the statements
> of those who already went and did this?
>
> On my platform, this

[ Non-portable code...]

> and thus again disagrees with you.
>
> Eagerly awaiting your counter example,

#include <iostream>
#include <ctime>
#include <cstdlib>   // for srand() and rand()
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>

class SomeClass
{
    typedef std::vector<int> TypeVector;

    TypeVector vec;

    enum { VectorSize= 1000 };

public:

    SomeClass();

    bool operator<(const SomeClass &argSomeClass) const
    {
        return vec[0]< argSomeClass.vec[0];
    }
};

int main()
{
    using namespace std;

    srand(time(0));

    const size_t SIZE=10000;

    typedef vector<SomeClass> Vector;
    typedef list<SomeClass> List;

    cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
    Vector vec(SIZE);

    cout<<" Done!\n\n"<< flush;

    // Note: lis is created with vec.size() elements already in it, so the
    // push_back loop below leaves it with twice as many elements as vec.
    List lis(vec.size());

    cout<< "Filling list with vector elements..."<< flush;

    for(Vector::size_type i= 0; i< vec.size(); ++i)
        lis.push_back(vec[i]);

    cout<< " Done!\n\n"<< flush;

    clock_t timeBeginVector, timeEndVector, timeBeginList, timeEndList;

    cout<< "Timing the sorting of the vector..."<< flush;

    timeBeginVector= clock();

    sort(vec.begin(), vec.end());

    timeEndVector= clock();

    cout<< " Done!\n\n"<< flush;

    cout<< "Timing the sorting of the list..."<< flush;

    timeBeginList= clock();

    lis.sort();

    timeEndList= clock();

    cout<< " Done!\n\n"<< flush;

    cout<< "The sorting of the vector took "
        << static_cast<double>((timeEndVector- timeBeginVector))/
           CLOCKS_PER_SEC
        << " seconds\n\n";

    cout<< "The sorting of the list took "
        << static_cast<double>((timeEndList- timeBeginList))/
           CLOCKS_PER_SEC
        << " seconds\n\n";
}

SomeClass::SomeClass():vec(VectorSize)
{
    using namespace std;

    for(TypeVector::size_type i= 0; i< vec.size(); ++i)
        vec[i]= rand();

    sort(vec.begin(), vec.end());
}
Oct 2 '08 #28
Ioannis Vranos wrote:
> [...]
> ==> The timings less than 1 second are not conclusive. Add 0s until you
> exceed 1-2 seconds.

If I add a single one more, the program runs out of memory.

Schobi
Oct 2 '08 #29
Ioannis Vranos wrote:
> The program had a serious bug.
> corrected:
> [...]

Creating vector with 100000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.015 seconds
The sorting of the list took 0.437 seconds

Schobi
Oct 2 '08 #30
