STL container question

Boltar

Hi

I need to store a number of integer values which I will then search on
later to see if they exist in my container. Can someone tell me which
container would be quickest for finding these values? I can't use a
plain C array (unless I make it 2^32 in size!) since I don't know the
max integer value.

Thanks for any help

B2003

Oct 1 '08

Subscribe Post Reply

2363

Ioannis Vranos

Lars Tetzlaff wrote:

>corrected:

>#include <iostream>
#include <ctime>
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>
class SomeClass
{
typedef std::vector<intTypeVector;

TypeVector vec;

enum { VectorSize= 1000 };

public:

SomeClass();

bool operator<(const SomeClass &argSomeClass) const
{
return vec[0]< argSomeClass.vec[0];
}
};

int main()
{
using namespace std;

srand(time(0));

const size_t SIZE=10000;

typedef vector<SomeClassVector;
typedef list<SomeClassList;
cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
Vector vec(SIZE);

cout<<" Done!\n\n"<< flush;

=== List lis;
cout<< "Filling list with vector elements..."<< flush;

for(Vector::size_type i= 0; i< vec.size(); ++i)
lis.push_back(vec[i]);

cout<< " Done!\n\n"<< flush;
clock_t timeBeginVector, timeEndVector, timeBeginList, timeEndList;
cout<< "Timing the sorting of the vector..."<< flush;

timeBeginVector= clock();

sort(vec.begin(), vec.end());

timeEndVector= clock();

cout<< " Done!\n\n"<< flush;
cout<< "Timing the sorting of the list..."<< flush;

timeBeginList= clock();

lis.sort();

timeEndList= clock();
cout<< " Done!\n\n"<< flush;
cout<< "The sorting of the vector took "
<< static_cast<double>((timeEndVector- timeBeginVector))/
CLOCKS_PER_SEC
<< " seconds\n\n";

cout<< "The sorting of the list took "
<< static_cast<double>((timeEndList- timeBeginList))/
CLOCKS_PER_SEC
<< " seconds\n\n";
}

SomeClass::SomeClass():vec(VectorSize)
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}

>This program doesn't sort at all, because the vec is initialized with
one constant!

? Yes it is initalised with its number of elements, which are SomeClass
objects created with their default constructor.

On my system all values are equal. The constructor is only called once!
The other elements are initialized with a copy of this SomeClass object!

There are two vecs in the code. Please be more specific, which are you
referring to?

Oct 2 '08 #51

Lars Tetzlaff

Ioannis Vranos schrieb:

Lars Tetzlaff wrote:

>>corrected:

>>#include <iostream>
#include <ctime>
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>
class SomeClass
{
typedef std::vector<intTypeVector;

TypeVector vec;

enum { VectorSize= 1000 };

public:

SomeClass();

bool operator<(const SomeClass &argSomeClass) const
{
return vec[0]< argSomeClass.vec[0];
}
};

int main()
{
using namespace std;

srand(time(0));

const size_t SIZE=10000;

typedef vector<SomeClassVector;
typedef list<SomeClassList;
cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
Vector vec(SIZE);

cout<<" Done!\n\n"<< flush;

=== List lis;
cout<< "Filling list with vector elements..."<< flush;

for(Vector::size_type i= 0; i< vec.size(); ++i)
lis.push_back(vec[i]);

cout<< " Done!\n\n"<< flush;
clock_t timeBeginVector, timeEndVector, timeBeginList, timeEndList;
cout<< "Timing the sorting of the vector..."<< flush;

timeBeginVector= clock();

sort(vec.begin(), vec.end());

timeEndVector= clock();

cout<< " Done!\n\n"<< flush;
cout<< "Timing the sorting of the list..."<< flush;

timeBeginList= clock();

lis.sort();

timeEndList= clock();
cout<< " Done!\n\n"<< flush;
cout<< "The sorting of the vector took "
<< static_cast<double>((timeEndVector- timeBeginVector))/
CLOCKS_PER_SEC
<< " seconds\n\n";

cout<< "The sorting of the list took "
<< static_cast<double>((timeEndList- timeBeginList))/
CLOCKS_PER_SEC
<< " seconds\n\n";
}

SomeClass::SomeClass():vec(VectorSize)
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}

>>This program doesn't sort at all, because the vec is initialized with
one constant!

? Yes it is initalised with its number of elements, which are SomeClass
objects created with their default constructor.

On my system all values are equal. The constructor is only called once!
The other elements are initialized with a copy of this SomeClass object!

There are two vecs in the code. Please be more specific, which are you
referring to?

Vector vec(SIZE); in main.

Lars

Oct 2 '08 #52

Ioannis Vranos

Lars Tetzlaff wrote:

>>
There are two vecs in the code. Please be more specific, which are you
referring to?

Vector vec(SIZE); in main.

Already spotted the strange behaviour and posted a new message with the
subject "Strange behaviour".

Oct 2 '08 #53

Rolf Magnus

Ioannis Vranos wrote:

Hendrik Schober wrote:
>Ioannis Vranos wrote:
>>The program had a serious bug.
corrected:
[...]

Creating vector with 100000 elements... Done!
Filling list with vector elements... Done!
Timing the sorting of the vector... Done!
Timing the sorting of the list... Done!
The sorting of the vector took 0.015 seconds
The sorting of the list took 0.437 seconds

Again, increase the number of const size_t SIZE so as to exceed 1-2
seconds for the sorting of one container, and then post your results.

What does your example have to do with the OP's question? The OP has one int
as member, while in your code, each element contains a vector of 1000 ints
instead. If this proves anything, then that you can construct a situation
where a list is faster than a vector. Unfortunately, that situation has
nothing to do with the OP's question.

Oct 2 '08 #54

Chris M. Thomasson

"Ioannis Vranos" <iv*****@no.spam.nospamfreemail.grwrote in message
news:gc***********@ulysses.noc.ntua.gr...

Chris M. Thomasson wrote:
>>

Doesn't a swap sort of imply a copy of "something"?

struct object {
int x;
};

object objs[2] = { { 0 }, { 1 } };

How do you swap `obj[0]' with `obj[1]' without copying "something"?

A vector::swap() can be a shallow swap. For example if a vector is
implemented as a pointer pointing to an array on the free store, swap
could just swap the pointer values, without interfering with the contents.

My point was that "something" is copied; in this case a pointer value.
However, in this case, that statement would be a ridiculous nit-pick!

;^)

Oct 2 '08 #55

Chris M. Thomasson

"Chris M. Thomasson" <no@spam.invalidwrote in message
news:Nw*******************@newsfe02.iad...

"Ioannis Vranos" <iv*****@no.spam.nospamfreemail.grwrote in message
news:gc***********@ulysses.noc.ntua.gr...
>Chris M. Thomasson wrote:
>>>

Doesn't a swap sort of imply a copy of "something"?

struct object {
int x;
};

object objs[2] = { { 0 }, { 1 } };

How do you swap `obj[0]' with `obj[1]' without copying "something"?

A vector::swap() can be a shallow swap. For example if a vector is
implemented as a pointer pointing to an array on the free store, swap
could just swap the pointer values, without interfering with the
contents.

My point was that "something" is copied; in this case a pointer value.
However, in this case, that statement would be a ridiculous nit-pick!

;^)

Anyway, I this was about sorting vectors... AFAICT, this implies some
swapping of the contents of slots in the array, and swapping implies
"something" is copied. If its a pointer value, so be it. If not, well, then
things can start to get expensive. The sort and swap impl needs to be very
highly optimized indeed.

Oct 2 '08 #56

Chris M. Thomasson

"Chris M. Thomasson" <no@spam.invalidwrote in message
news:bA******************@newsfe02.iad...

"Chris M. Thomasson" <no@spam.invalidwrote in message
news:Nw*******************@newsfe02.iad...
>"Ioannis Vranos" <iv*****@no.spam.nospamfreemail.grwrote in message
news:gc***********@ulysses.noc.ntua.gr...
>>Chris M. Thomasson wrote:
Doesn't a swap sort of imply a copy of "something"?

struct object {
int x;
};

object objs[2] = { { 0 }, { 1 } };

How do you swap `obj[0]' with `obj[1]' without copying "something"?

A vector::swap() can be a shallow swap. For example if a vector is
implemented as a pointer pointing to an array on the free store, swap
could just swap the pointer values, without interfering with the
contents.

My point was that "something" is copied; in this case a pointer value.
However, in this case, that statement would be a ridiculous nit-pick!

;^)

Anyway, I this was about sorting vectors...

^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Anyway, I _thought_ this was about sorting vectors...
ARGH!

AFAICT, this implies some swapping of the contents of slots in the array,
and swapping implies "something" is copied. If its a pointer value, so be
it. If not, well, then things can start to get expensive. The sort and
swap impl needs to be very highly optimized indeed.

Oct 2 '08 #57

Chris M. Thomasson

"Jerry Coffin" <jc*****@taeus.comwrote in message
news:MP************************@news.sunsite.dk...

In article <aeb09adf-92fc-45ce-aa7f-
80**********@l62g2000hse.googlegroups.com>, ma***************@gmail.com
says...

[ ... ]

>You are correct that as lists do not provide random-access iterators
quicksort can not be used.

Actually, that's not true. It's entirely possible to implement a
quicksort on a linked list. You can't (efficiently) use a median-of-
three (or whatever) pivot selection, but the quicksort itself has no
need for random-access iteration.

You can use quick-sort on a skip-list where one of the skip-links points to
the middle of the list, or some other pivot.

>Instead, lists are normally sorted using merge sort.

This, however, is true. Even though you _can_ use quicksort on a linked
list, the potential gains from doing so are much smaller than the
potential losses.

>Merge sort has better worst-case performance than
quicksort. The drawback of merge sort is that it normally requires
extra memory, whereas quicksort does not.

When applied to arrays/vectors, that's true. For a linked list, they
both require essentially the same extra memory.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Oct 2 '08 #58

Victor Bazarov

Ioannis Vranos wrote:

The program had a serious bug.
corrected:

[..]

OK, as it was determined, your compiler has a bug (that makes it invoke
the default c-tor for all elements of a vector constructed with only the
size given. I've added a copy constructor that does exactly the same
thing as the default constructor and finally got your results. A vector
of large objects with random values takes longer to sort than a list of
large objects with similarly random values. Significantly longer. Your
theory is confirmed, I suppose.

Now, it has really nothing to do with the OP's problem, but still...

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 2 '08 #59

Victor Bazarov

Erik WikstrÃ¶m wrote:

On 2008-10-02 16:11, Victor Bazarov wrote:
>[..] You're free to speculate what makes it
so *bad* on your machine, but I'd seriously think about switching to a
real OS and a real compiler, if I got such bad results as you do.

Long since I last heard anyone calling Linux and gcc toy products. :-)

I didn't call them toy products, either.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 2 '08 #60

Ioannis Vranos

Victor Bazarov wrote:

>
OK, as it was determined, your compiler has a bug (that makes it invoke
the default c-tor for all elements of a vector constructed with only the
size given.

Actually from the responses I got in the "Strange behaviour (corrected)"
thread, it isn't a bug.

I've added a copy constructor that does exactly the same
thing as the default constructor and finally got your results. A vector
of large objects with random values takes longer to sort than a list of
large objects with similarly random values. Significantly longer. Your
theory is confirmed, I suppose.

Now, it has really nothing to do with the OP's problem, but still...

Strangely enough there should be no difference. With the default copy
constructor I get the differences I mentioned, while you get them only
when you define the copy constructor explicitly.

Oct 2 '08 #61

jl_post

Juha Nieminen wrote:

The original poster talked about storing integer values. I highly
doubt sorting a list of integers will be faster than sorting an array of
integers. In fact, I'm pretty sure of the contrary.

On Oct 2, 4:08 am, Ioannis Vranos replied:

If the programmer decides to replace ints with other objects, he will
not have to change much in the code, if he uses a list.

Even if he uses a std::vector, he still won't have to change much
in the code.

In my experience, it is usually not a good thing to write code
optimized for something that "might" happen in the future. If it does
happen, let the coder change it then, and not earlier.

I've seen code where data was stored in a list (or vector), but
nothing whatsoever was being done with the list after that. When I
asked the original coder why it was there, the reply was, "So that if
a future coder needs a list of the data, it's there to use."

Seriously, though, if you saw a list or vector that had did nothing
other than being populated, wouldn't you think it was a bug (or at
least vestigial code that was forgotten to be removed)?

I mean, if I had to add code that required me to use a list/vector
of that data, I would probably declare and populate my own list (who
wouldn't?). Even if I did see the first list (the one written for
future use), I would probably pass it over thinking it was needed for
some other piece of code -- fearing that if I manipulated it in any
way it might break current functional code.

To be honest, I think the STL sort algorithms are efficient enough
that the type of container they sort doesn't really matter in the long
run. In other words, they scale well enough that their difference in
speed can be pretty much considered negligible.

That being said, I think the original poster should use the
container that best suits the problem best. If the data is meant to
be used as a linked-list, then he/she should use a std::list. If the
data is meant to be used as a vector, then he/she should use a
std::vector.

Personally, I would suggest using a std::set, since it seems like
all he/she needs the container for is for checking to see if a number
is among the set of numbers he/she has already encountered -- unless
he/she also needs to use the set of numbers somewhere else as a list
or a vector.

Take care,

-- Jean-Luc

Oct 2 '08 #62

Ioannis Vranos

Ioannis Vranos wrote:

Victor Bazarov wrote:
>>
OK, as it was determined, your compiler has a bug (that makes it
invoke the default c-tor for all elements of a vector constructed with
only the size given.

Actually from the responses I got in the "Strange behaviour (corrected)"
thread, it isn't a bug.

>I've added a copy constructor that does exactly the same thing as the
default constructor and finally got your results. A vector of large
objects with random values takes longer to sort than a list of large
objects with similarly random values. Significantly longer. Your
theory is confirmed, I suppose.

Now, it has really nothing to do with the OP's problem, but still...

Strangely enough there should be no difference. With the default copy
constructor I get the differences I mentioned, while you get them only
when you define the copy constructor explicitly.

Disregard this message of mine.

Oct 2 '08 #63

Ioannis Vranos

Second correction:
#include <iostream>
#include <ctime>
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>
class SomeClass
{
typedef std::vector<intTypeVector;

TypeVector vec;

enum { VectorSize= 1000 };

public:

SomeClass();
SomeClass(const SomeClass &);

bool operator<(const SomeClass &argSomeClass) const
{
return vec[0]< argSomeClass.vec[0];
}
};

int main()
{
using namespace std;

srand(time(0));

const size_t SIZE=1000;

typedef vector<SomeClassVector;
typedef list<SomeClassList;
cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
Vector vec(SIZE);

cout<<" Done!\n\n"<< flush;

List lis;
cout<< "Filling list with vector elements..."<< flush;

for(Vector::size_type i= 0; i< vec.size(); ++i)
lis.push_back(vec[i]);

cout<< " Done!\n\n"<< flush;
clock_t timeBeginVector, timeEndVector, timeBeginList, timeEndList;
cout<< "Timing the sorting of the vector..."<< flush;

timeBeginVector= clock();

sort(vec.begin(), vec.end());

timeEndVector= clock();

cout<< " Done!\n\n"<< flush;
cout<< "Timing the sorting of the list..."<< flush;

timeBeginList= clock();

lis.sort();

timeEndList= clock();
cout<< " Done!\n\n"<< flush;
cout<< "The sorting of the vector took "
<< static_cast<double>((timeEndVector- timeBeginVector))/
CLOCKS_PER_SEC
<< " seconds\n\n";

cout<< "The sorting of the list took "
<< static_cast<double>((timeEndList- timeBeginList))/
CLOCKS_PER_SEC
<< " seconds\n\n";
}

SomeClass::SomeClass():vec(VectorSize)
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}
==SomeClass::SomeClass(const SomeClass &):vec(VectorSize)
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}

Oct 2 '08 #64

Ioannis Vranos

jl*****@hotmail.com wrote:

Juha Nieminen wrote:

>> The original poster talked about storing integer values. I highly
doubt sorting a list of integers will be faster than sorting an array of
integers. In fact, I'm pretty sure of the contrary.

Personally, I would suggest using a std::set, since it seems like
all he/she needs the container for is for checking to see if a number
is among the set of numbers he/she has already encountered -- unless
he/she also needs to use the set of numbers somewhere else as a list
or a vector.

I also think std::set is the answer for the OP.

Oct 2 '08 #65

Victor Bazarov

Ioannis Vranos wrote:

jl*****@hotmail.com wrote:
>Juha Nieminen wrote:
>>> The original poster talked about storing integer values. I highly
doubt sorting a list of integers will be faster than sorting an
array of
integers. In fact, I'm pretty sure of the contrary.

Personally, I would suggest using a std::set, since it seems like
all he/she needs the container for is for checking to see if a number
is among the set of numbers he/she has already encountered -- unless
he/she also needs to use the set of numbers somewhere else as a list
or a vector.

I also think std::set is the answer for the OP.

Just like std::list, std::set wasts too much memory per element to be
efficient for storing large numbers of elements, especially if each
datum is of a fundamental type. There is no need to waste all that
memory if the collection stabilises before the lookups. If lookups and
changes to the collection were intermingled, then re-sorting a vector
(or reallocating it when inserting in the middle) would be too much,
then a set would probably be more efficient. Again, nothing can be said
with certainty without *measuring*.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 2 '08 #66

Jerry Coffin

In article <kL*************@read4.inet.fi>, no****@thanks.invalid
says...

Jerry Coffin wrote:
You can't (efficiently) use a median-of-three (or whatever) pivot selection

Not true.

Granted, you can't do it for the *first* partition, but there's no
problem in getting the median-of-three pivot in subsequent partitions.

Quite true -- my statement was sufficiently incomplete that it wasn't
really accurate as stated. Thanks for the correction.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Oct 3 '08 #67

Lars Tetzlaff

Ioannis Vranos schrieb:

Second correction:
==SomeClass::SomeClass(const SomeClass &):vec(VectorSize)
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}

You will never get a sorted vector. Every time you copy an element you
will change it's value!

Lars

Oct 3 '08 #68

Ioannis Vranos

Lars Tetzlaff wrote:

Ioannis Vranos schrieb:
>Second correction:
==SomeClass::SomeClass(const SomeClass &):vec(VectorSize)
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}

You will never get a sorted vector. Every time you copy an element you
will change it's value!

I think during sort(), the default assignment operator is used, and not
the copy constructor.

Oct 3 '08 #69

Lars Tetzlaff

Ioannis Vranos schrieb:

Lars Tetzlaff wrote:
>Ioannis Vranos schrieb:
>>Second correction:
==SomeClass::SomeClass(const SomeClass &):vec(VectorSize)
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}

You will never get a sorted vector. Every time you copy an element you
will change it's value!

I think during sort(), the default assignment operator is used, and not
the copy constructor.

You have to make a copy during the exchange of two elements:

SomeClass tmp = vec[i]; // new value here
vec[i] = vec[j];
vec[j] = tmp;

Lars

Oct 3 '08 #70

Ioannis Vranos

Lars Tetzlaff wrote:

Ioannis Vranos schrieb:

You have to make a copy during the exchange of two elements:

SomeClass tmp = vec[i]; // new value here
vec[i] = vec[j];
vec[j] = tmp;

You are right, I will post a corrected version.
Thanks.

Oct 3 '08 #71

Ioannis Vranos

Third correction:
#include <iostream>
#include <ctime>
#include <vector>
#include <list>
#include <cstddef>
#include <algorithm>
class SomeClass
{
public:
typedef std::vector<intTypeVector;

TypeVector vec;

enum { VectorSize= 1000 };

public:

void fillWithSortedRandomNumbers();

bool operator<(const SomeClass &) const;

SomeClass();
};

SomeClass::SomeClass():vec(VectorSize){}
inline void SomeClass::fillWithSortedRandomNumbers()
{
using namespace std;

for(TypeVector::size_type i= 0; i< vec.size(); ++i)
vec[i]= rand();

sort(vec.begin(), vec.end());
}

inline bool SomeClass::operator<(const SomeClass &argSomeClass) const
{
return vec[0]< argSomeClass.vec[0];
}

int main()
{
using namespace std;

srand(time(0));

const size_t SIZE=10000;

typedef vector<SomeClassVector;
typedef list<SomeClassList;
cout<< "\nCreating vector with "<< SIZE<< " elements..."<< flush;
Vector vec(SIZE);

for(Vector::size_type i= 0; i< vec.size(); ++i)
vec[i].fillWithSortedRandomNumbers();

cout<<" Done!\n\n"<< flush;

List lis;
cout<< "Filling list with vector elements..."<< flush;

for(Vector::size_type i= 0; i< vec.size(); ++i)
lis.push_back(vec[i]);

cout<< " Done!\n\n"<< flush;
clock_t timeBeginVector, timeEndVector, timeBeginList, timeEndList;
cout<< "Timing the sorting of the vector..."<< flush;
timeBeginVector= clock();

sort(vec.begin(), vec.end());

timeEndVector= clock();

cout<< " Done!\n\n"<< flush;
cout<< "Timing the sorting of the list..."<< flush;

timeBeginList= clock();

lis.sort();

timeEndList= clock();
cout<< " Done!\n\n"<< flush;
cout<< "The sorting of the vector took "
<< static_cast<double>((timeEndVector- timeBeginVector))/
CLOCKS_PER_SEC
<< " seconds\n\n";

cout<< "The sorting of the list took "
<< static_cast<double>((timeEndList- timeBeginList))/
CLOCKS_PER_SEC
<< " seconds\n\n";
}

Oct 3 '08 #72

Rolf Magnus

Victor Bazarov wrote:

Erik WikstrÃ¶m wrote:
>On 2008-10-02 16:11, Victor Bazarov wrote:
>>[..] You're free to speculate what makes it
so *bad* on your machine, but I'd seriously think about switching to a
real OS and a real compiler, if I got such bad results as you do.

Long since I last heard anyone calling Linux and gcc toy products. :-)

I didn't call them toy products, either.

No, you just claimed they were not "a real OS and a real compiler".

Oct 3 '08 #73

Victor Bazarov

Rolf Magnus wrote:

Victor Bazarov wrote:

>Erik WikstrÃ¶m wrote:
>>On 2008-10-02 16:11, Victor Bazarov wrote:
[..] You're free to speculate what makes it
so *bad* on your machine, but I'd seriously think about switching to a
real OS and a real compiler, if I got such bad results as you do.
Long since I last heard anyone calling Linux and gcc toy products. :-)
I didn't call them toy products, either.

No, you just claimed they were not "a real OS and a real compiler".

No, I didn't *claim* that either. I *implied* it, maybe. And still,
it's up to the reader to [try to] deduce what I *implied*. Don't put
words into my mouth, please.

V
--
Please remove capital 'A's when replying by e-mail
I do not respond to top-posted replies, please don't ask

Oct 3 '08 #74

Branimir Maksimovic

On Oct 1, 3:31*pm, Boltar <boltar2...@yahoo.co.ukwrote:

Hi

I need to store a number of integer values which I will then search on
later to see if they exist in my container. Can someone tell me which
container would be quickest for finding these values? I can't use a
plain C array (unless I make it 2^32 in size!) since I don't know the
max integer value.

Thanks for any help

B2003

You don;t need to know maximum value but aproximatelly
maximum number of integers that would be there.
If you are good guesser this hash class will suit your
needs. It has O(1) lookup and if there are more
numbers than table entries will do linear search,
but that shouldn't happen as much.

class HashT{
public:
HashT(unsigned max = 100):table_(max){}
bool find(unsigned num)const
{
unsigned index = num%table_.size();
if(table_[index].size()>0)
{
vector<int>::const_iterator
i = std::find(table_[index].begin(),table_[index].end(),num);
return i!=table_[index].end();
}
return false;
}
void insert(unsigned num)
{
unsigned index = num%table_.size();
if(table_[index].empty())
table_[index].push_back(num);
else
{
vector<int>::const_iterator
i = std::find(table_[index].begin(),table_[index].end(),num);
if(i == table_[index].end())
table_[index].push_back(num);
}
}
private:
vector<vector<int table_;
};

Greetings, Branimir.

Oct 4 '08 #75

Jerry Coffin

In article <gc***********@ulysses.noc.ntua.gr>,
iv*****@no.spam.nospamfreemail.gr says...

[ ... ]

A vector::swap() can be a shallow swap.

Not only can be, but basically _has_ to be. In particular, the _only_
situation in which vector::swap is allowed to throw is if the
container's allocator throws. This means that swap can't use _any_ of
the operators built into the type being contained. That leaves a shallow
swap as essentially the only option.

--
Later,
Jerry.

The universe is a figment of its own imagination.

Oct 4 '08 #76

aku ankka

On Oct 1, 11:08*pm, James Kanze <james.ka...@gmail.comwrote:

On Oct 1, 7:31 pm, Erik Wikström <Erik-wikst...@telia.comwrote:
One thing I don't understand here: both a C style array and
std::vector use a single block of contiguous memory. *How could
cache performance be any different for them?

The memory isn't contiguous on most common contemporary systems; it is
heavily fragmented and just appears contiguous (paged).

Oct 5 '08 #77

Juha Nieminen

aku ankka wrote:

On Oct 1, 11:08 pm, James Kanze <james.ka...@gmail.comwrote:
>On Oct 1, 7:31 pm, Erik Wikström <Erik-wikst...@telia.comwrote:
One thing I don't understand here: both a C style array and
std::vector use a single block of contiguous memory. How could
cache performance be any different for them?

The memory isn't contiguous on most common contemporary systems; it is
heavily fragmented and just appears contiguous (paged).

AFAIK that's completely inconsequential from the point of view of the
cache. The size of a cache block is much smaller than the size of a
memory page (which is probably a multiple of it).

Oct 5 '08 #78

Hendrik Schober

Ioannis Vranos wrote:

Third correction:
[...]

So it took you three tries and code that's just about as
far from the original requirement as possible while still
dealing with lists and vectors to come up with an example
where a list is actually faster than a vector.
Do you realize that you just proofed everyone else's point.
(Start with a vector, and measure whether a list is faster,
for it might happen, although not in general.)

Schobi

Oct 5 '08 #79

Chris M. Thomasson

"Erik WikstrÃ¶m" <Er***********@telia.comwrote in message
news:du*****************@newsb.telia.net...

On 2008-10-01 18:57, Rolf Magnus wrote:
>Jeff Schwab wrote:

>>Boltar wrote:

I need to store a number of integer values which I will then search on
later to see if they exist in my container. Can someone tell me which
container would be quickest for finding these values? I can't use a
plain C array (unless I make it 2^32 in size!) since I don't know the
max integer value.

Sorted vector. See Effective STL, Item 23.

For the record, you wouldn't 2^32 integers, just 2^32 bits = 500 MiB.
It's actually not that much RAM, depending on your target system, and
would let you check for integers with O(1) complexity (rather than O(log
N)).

However, it can still be slower, since it's more or less the worst thing
you
can do to the cache.

Still, you should only get one cache-miss when looking for a value, if
you use a set or vector you will probably get more.

It takes some effort to maximize the use of the cache. I don't think one
could even use a `std::vector' in a cache-friendly manner. Well, first of
all, the memory for the vector would simply need to start on a cache-line
boundary; how can this be ensured? I think it would be impossible. Well,
perhaps with a custom allocator... Therefore, if you want to create highly
cache-friendly arrays and algorihtms, well, you need to "roll your own". I
don't quite see how one could use a `std::vector' and know that all the
caveats, such as false-sharing and proper data-alignment, have been
addressed...

Oct 7 '08 #80

Hendrik Schober

Chris M. Thomasson wrote:

"Erik WikstrÃ¶m" <Er***********@telia.comwrote in message
news:du*****************@newsb.telia.net...
[...]

>>However, it can still be slower, since it's more or less the worst thing
you
can do to the cache.
Still, you should only get one cache-miss when looking for a value, if
you use a set or vector you will probably get more.

It takes some effort to maximize the use of the cache. I don't think one
could even use a `std::vector' in a cache-friendly manner. Well, first of
all, the memory for the vector would simply need to start on a cache-line
boundary; how can this be ensured? I think it would be impossible. Well,
perhaps with a custom allocator... Therefore, if you want to create highly
cache-friendly arrays and algorihtms, well, you need to "roll your own". I
don't quite see how one could use a `std::vector' and know that all the
caveats, such as false-sharing and proper data-alignment, have been
addressed...

Actually the knowledge that 'std::vector' is more cache-friendly
than 'std::list' was obtained empirical: First people discovered
the fact and then found out (or rather speculated, I suppose
nobody ever went and really researched it) that processor caching
is the reason.

Schobi

Oct 7 '08 #81

Similar topics