help with vector<vector<double>>

T. Crane

Hi,

I'm struggling with how to initialize a vector<vector<double>>
object. I'm pulling data out of a file and storing it in the
vector<vector<double>object. Because any given file will have a
large amount of data, that I read off using an ifstream object, I
don't want to use the push_back method because this grows the
vector<vector<double>dynamically, and that will kill my execution
time. So, I want to reserve space first, using, of course the reserve
method. However, I'm not sure what the best way of doing this is.
Here's what I am thinking of doing (I don't even know if this will
work) but if there's a better way to do it, I'm all ears:

#include <vector>

int nColumns = 10;
int nRows = 15;

vector<vector<double>myData;

myData.reserve(nRows);

for (int i;i<nRows;i++){
myData[i].reserve(nColumns);
}
thanks,
trevis

Aug 9 '07 #1

Subscribe Post Reply

3935

Philip Potter

T. Crane wrote:

#include <vector>

int nColumns = 10;
int nRows = 15;

vector<vector<double>myData;

myData.reserve(nRows);

for (int i;i<nRows;i++){
myData[i].reserve(nColumns);
}

This code is broken. vector::reserve() doesn't actually create or
initialize objects in the vector; it only allocates space for objects
which have yet to be inserted using push_back() or whatever. As a
result, even after the myData.reserve() call, myData contains no items
and myData[i] refers to an item which doesn't exist.

As Alf says, you may not need to reserve at all. First, make it correct.
Then, only if necessary, make it faster.

Phil

Aug 9 '07 #2

T. Crane

On Aug 9, 1:17 pm, Philip Potter <p...@removethis.doc.ic.ac.ukwrote:

T. Crane wrote:
#include <vector>

int nColumns = 10;
int nRows = 15;

vector<vector<double>myData;

myData.reserve(nRows);

for (int i;i<nRows;i++){
myData[i].reserve(nColumns);
}

This code is broken. vector::reserve() doesn't actually create or
initialize objects in the vector; it only allocates space for objects
which have yet to be inserted using push_back() or whatever. As a
result, even after the myData.reserve() call, myData contains no items
and myData[i] refers to an item which doesn't exist.

As Alf says, you may not need to reserve at all. First, make it correct.
Then, only if necessary, make it faster.

Phil

OK -- that helps some. A related question then: if I have a vector
object and I don't reserve any space beforehand, as I understand it if
I push_back values into the vector, this will cause at some point the
vector object to exceed its initial capacity. At this point the
vector will identify a contiguous block of memory that is large enough
to accomodate the original vector plus the additional data. It will
then make a copy of itself and the new data into the new block of
memory. Is this basically how it works?

If it is, I call this growing the vector dynamically. This is in
contrast to reserving one chunk of memory that's big enough to hold
everything you want at the beginning and then populating it using
push_back. In this manner, there is never a reason to go out and find
a new chunk of contiguous memory.

Does this clarify what I'm getting at? And yes, I understand that I
could get it right first and then get it faster, but if I can do both
at once, why not?

thanks for your help,
trevis

Aug 9 '07 #3

Bo Persson

T. Crane wrote:
:: On Aug 9, 1:17 pm, Philip Potter <p...@removethis.doc.ic.ac.uk>
:: wrote:
::: T. Crane wrote:
:::: #include <vector>
:::
:::: int nColumns = 10;
:::: int nRows = 15;
:::
:::: vector<vector<double>myData;
:::
:::: myData.reserve(nRows);
:::
:::: for (int i;i<nRows;i++){
:::: myData[i].reserve(nColumns);
:::: }
:::
::: This code is broken. vector::reserve() doesn't actually create or
::: initialize objects in the vector; it only allocates space for
::: objects which have yet to be inserted using push_back() or
::: whatever. As a result, even after the myData.reserve() call,
::: myData contains no items and myData[i] refers to an item which
::: doesn't exist.
:::
::: As Alf says, you may not need to reserve at all. First, make it
::: correct. Then, only if necessary, make it faster.
:::
::: Phil
::
:: OK -- that helps some. A related question then: if I have a
:: vector object and I don't reserve any space beforehand, as I
:: understand it if I push_back values into the vector, this will
:: cause at some point the vector object to exceed its initial
:: capacity. At this point the vector will identify a contiguous
:: block of memory that is large enough to accomodate the original
:: vector plus the additional data. It will then make a copy of
:: itself and the new data into the new block of memory. Is this
:: basically how it works?

About right, yes. It also uses additional features, like for each time
the data is copied it allocates increasingly larger and larger blocks
of memory, so that it will not have to make additional copies for a
long time.

If your real sizes are 10 and 15, you definitely don't have to worry.
If you have 100s of millions of elements, you might have to. But you
will have to try it out first!
Bo Persson

Aug 9 '07 #4

T. Crane

On Aug 9, 3:18 pm, "Bo Persson" <b...@gmb.dkwrote:

T. Crane wrote:

:: On Aug 9, 1:17 pm, Philip Potter <p...@removethis.doc.ic.ac.uk>:: wrote:
::: T. Crane wrote:

:::: #include <vector>
:::
:::: int nColumns = 10;
:::: int nRows = 15;
:::
:::: vector<vector<double>myData;
:::
:::: myData.reserve(nRows);
:::
:::: for (int i;i<nRows;i++){
:::: myData[i].reserve(nColumns);
:::: }
:::
::: This code is broken. vector::reserve() doesn't actually create or
::: initialize objects in the vector; it only allocates space for
::: objects which have yet to be inserted using push_back() or
::: whatever. As a result, even after the myData.reserve() call,
::: myData contains no items and myData[i] refers to an item which
::: doesn't exist.
:::
::: As Alf says, you may not need to reserve at all. First, make it
::: correct. Then, only if necessary, make it faster.
:::
::: Phil
::
:: OK -- that helps some. A related question then: if I have a
:: vector object and I don't reserve any space beforehand, as I
:: understand it if I push_back values into the vector, this will
:: cause at some point the vector object to exceed its initial
:: capacity. At this point the vector will identify a contiguous
:: block of memory that is large enough to accomodate the original
:: vector plus the additional data. It will then make a copy of
:: itself and the new data into the new block of memory. Is this
:: basically how it works?

About right, yes. It also uses additional features, like for each time
the data is copied it allocates increasingly larger and larger blocks
of memory, so that it will not have to make additional copies for a
long time.

If your real sizes are 10 and 15, you definitely don't have to worry.
If you have 100s of millions of elements, you might have to. But you
will have to try it out first!

Bo Persson

My real sizes are:

100 data points cubed x 2 (two channels) x ~ 8 (different parameter
sets) ~= 16 million data points. Each data point is a 4 element
vector.

So, I'll try this first (as an example):

#include <vector>

int nColumns = 2;
int nRows = 3;

vector<floatv1(3),v2(3);
vector<vector<float>m(2);

v1.push_back(0.23);
v1.push_back(0.36);
v1.push_back(1.54);

v2.push_back(0.76);
v2.push_back(0.86);
v2.push_back(2.92);

m.push_back(v1);
m.push_back(v2);

First off, is this the "recommended" way of populating a 2D vector?
Second, if I find that I really, really want to use vector::reserve(),
how would I do it for a 2D vector?

#include <vector>

vector<floatv;

v.reserve(50);

vector<vector<float>m;

m.reserve(...?) How do I tell it to reserve enough memory to hold n
vector elements like v?

thanks again for your help,
trevis

Aug 9 '07 #5

BobR

T. Crane <tr**********@gmail.comwrote in message
news:11**********************@r34g2000hsd.googlegr oups.com...

Hi,

I'm struggling with how to initialize a vector<vector<double>>
object. I'm pulling data out of a file and storing it in the
vector<vector<double>object. Because any given file will have a
large amount of data, that I read off using an ifstream object, I
don't want to use the push_back method because this grows the
vector<vector<double>dynamically, and that will kill my execution
time. So, I want to reserve space first, using, of course the reserve
method. However, I'm not sure what the best way of doing this is.
Here's what I am thinking of doing (I don't even know if this will
work) but if there's a better way to do it, I'm all ears:

#include <vector>

int nColumns = 10;
int nRows = 15;

vector<vector<double>myData;

myData.reserve(nRows);

for (int i;i<nRows;i++){

No, no. Use:

for( size_t i(0); i < myData.size(); ++i){

That way it will do nothing (Philip Potter explained why).

myData[i].reserve(nColumns);
}

thanks, trevis

size_t const nColumns( 10 );
size_t const nRows( 15 );

std::vector<std::vector<double vec2D( nRows,
std::vector<double>( nColumns, 123.4567 ) );

If you don't want to initialize the doubles:

std::vector<std::vector<double vec2D( nRows, nColumns );

for( size_t x(0); x < vec2D.size(); ++x){
for( size_t y(0); y < vec2D.at(x).size(); ++y){
// get DoubleFromFile
vec2D.at( x ).at( y ) = DoubleFromFile;
// - possibly -
// if( MyFile ){
// MyFile >vec2D.at( x ).at( y );
// }
} // for(y)
} // for(x)

std::vector also has a resize().
vec2D.at( 0 ).resize( 20 );

--
Bob R
POVrookie

Aug 9 '07 #6

T. Crane

On Aug 9, 4:25 pm, "BobR" <removeBadB...@worldnet.att.netwrote:

T. Crane <trevis.cr...@gmail.comwrote in message

news:11**********************@r34g2000hsd.googlegr oups.com...

Hi,

I'm struggling with how to initialize a vector<vector<double>>
object. I'm pulling data out of a file and storing it in the
vector<vector<double>object. Because any given file will have a
large amount of data, that I read off using an ifstream object, I
don't want to use the push_back method because this grows the
vector<vector<double>dynamically, and that will kill my execution
time. So, I want to reserve space first, using, of course the reserve
method. However, I'm not sure what the best way of doing this is.
Here's what I am thinking of doing (I don't even know if this will
work) but if there's a better way to do it, I'm all ears:

#include <vector>

int nColumns = 10;
int nRows = 15;

vector<vector<double>myData;

myData.reserve(nRows);

for (int i;i<nRows;i++){

No, no. Use:

for( size_t i(0); i < myData.size(); ++i){

That way it will do nothing (Philip Potter explained why).

myData[i].reserve(nColumns);
}

thanks, trevis

size_t const nColumns( 10 );
size_t const nRows( 15 );

std::vector<std::vector<double vec2D( nRows,
std::vector<double>( nColumns, 123.4567 ) );

If you don't want to initialize the doubles:

std::vector<std::vector<double vec2D( nRows, nColumns );

for( size_t x(0); x < vec2D.size(); ++x){
for( size_t y(0); y < vec2D.at(x).size(); ++y){
// get DoubleFromFile
vec2D.at( x ).at( y ) = DoubleFromFile;
// - possibly -
// if( MyFile ){
// MyFile >vec2D.at( x ).at( y );
// }
} // for(y)
} // for(x)

std::vector also has a resize().
vec2D.at( 0 ).resize( 20 );

--
Bob R
POVrookie- Hide quoted text -

- Show quoted text -

Thanks for your help. I've got to play with it now.

Aug 9 '07 #7

Juha Nieminen

If you are worried about the copying overhead caused when the vector
increases its own size because of repeated push_backs, you might want
to consider using a std::deque instead. std::deque supports random
access iterators too, but its advantage is that it doesn't need to move
any existing elements when it allocates more space for itself (the
disadvantage is that its memory is not allocated contiguously, which is
usually not a problem, and that indexing might be slightly slower than
with a vector, which also isn't often a problem either). And of course
you get a constant-time push_front too, if you need it.

Aug 10 '07 #8

mistabean

Sorry to interrupt, but I have been using this method to initialize my
2D-Array for awhile

std::vector< std::vector <double array(nRow,
std::vector<double>(nCol, 0.0));

and fill it up index-wise, especially if the data is supposed to be
all over the place and not "column-wise" or "row-wise" insertion.

It works for me so far, however, being a newbie in programming, can
someone tell me what is the drawbacks of using this type of
initialization as opposed to methods suggested beforehand?

Aug 10 '07 #9

BobR

<mi*******@gmail.comwrote in message...

Sorry to interrupt, but I have been using this method to initialize my
2D-Array for awhile

std::vector< std::vector <double array(nRow,
std::vector<double>(nCol, 0.0));

and fill it up index-wise, especially if the data is supposed to be
all over the place and not "column-wise" or "row-wise" insertion.

It works for me so far, however, being a newbie in programming, can
someone tell me what is the drawbacks of using this type of
initialization as opposed to methods suggested beforehand?

If you know the data size, it's best to reserve the memory. If you don't
know data size, push_back, and let it grow dynamically.

Check-out the output from this snippet (shown at end):

void PrintVec( std::vector<intconst &vec, std::ostream &sout){
sout<<" size="<<vec.size()
<<" cap="<<vec.capacity()<<std::endl;
return;
} // PrintVec(vector<intconst&,ostream&)

void VecIntSize( std::ostream &cout ){
cout<<"\n--- VecInt(10) size test ---"<<std::endl;
std::vector<intVecInt( 10 );
PrintVec( VecInt, cout);
VecInt.push_back( 1 );
PrintVec( VecInt, cout);
for(size_t i(0); i < 11; ++i){ VecInt.push_back( i );}
PrintVec( VecInt, cout);
VecInt.resize( 50 );
PrintVec( VecInt, cout);
VecInt.push_back( 1 );
PrintVec( VecInt, cout);
VecInt.resize( 40 );
PrintVec( VecInt, cout);

cout<<" std::vector<int>().swap( VecInt );"<<std::endl;
std::vector<int>().swap( VecInt );
PrintVec( VecInt, cout);

cout<<"--- VecInt(10) size test ---END"<<std::endl;
return;
}

/* --- VecInt(10) size test ---
size=10 cap=10
size=11 cap=20 // note how capacity doubled
size=22 cap=40 // .... and again
size=50 cap=50 // resize
size=51 cap=100 // push_back
size=40 cap=100 // size reduced, capacity stayed same
std::vector<int>( ).swap( VecInt ); // funky, but 'resets' vector
size=0 cap=0
--- VecInt(10) size test ---END
*/

So, if you did something like:

std::vector<intVecInt;
VecInt.reserve( 100 ); // sets 'capacity', not 'size'

.... and then somewhere you did:

VecInt.push_back( 1 ); // this is the 101st push_back

.... you kind-of shot yourself in the foot (it needs to realloc and copy).
No big thing for small vectors, but costs time on very big vectors.
(....and you could run out of memory during the realloc/copy).

--
Bob R
POVrookie

Aug 10 '07 #10

Frank Birbacher

BobR schrieb:

VecInt.push_back( 1 ); // this is the 101st push_back

... you kind-of shot yourself in the foot (it needs to realloc and copy).
No big thing for small vectors, but costs time on very big vectors.
(....and you could run out of memory during the realloc/copy).

Actually using push_back n times on an empty vector costs O(n). Thus
from complexity theory it costs nothing extra.

In practise I'd prefer std::deque.

Frank

Aug 10 '07 #11

aku ankka

On Aug 9, 8:52 pm, "T. Crane" <trevis.cr...@gmail.comwrote:

If the nColumns is same for all rows you are better off allocating one
array which contains nColumns * nRows objects.

double* array = new double[nColumns * nRows];

Indexing a single object:

// object at (x,y) is ..
double* object = array + nColumns * y + x;

It can be argued if vector of vector's for 2d matrix is better or vice-
versa, but how does that sound to you?

Aug 10 '07 #12

BobR

Frank Birbacher <bl************@gmx.netwrote in message...

BobR schrieb:
VecInt.push_back( 1 ); // this is the 101st push_back

... you kind-of shot yourself in the foot (it needs to realloc and

copy).

No big thing for small vectors, but costs time on very big vectors.
(....and you could run out of memory during the realloc/copy).

Actually using push_back n times on an empty vector costs O(n). Thus
from complexity theory it costs nothing extra.

But, I was talking about when '.size()' passes '.capacity()'.
Ref: // this is the 101st push_back
.... when capacity was 100.

>
In practise I'd prefer std::deque.
Frank

Depends on what you're doing.

[ note in old STL docs ]
"
[5] Vector is usually preferable to deque and list. Deque is useful in the
case of frequent insertions at both the beginning and end of the sequence,
and list and slist are useful in the case of frequent insertions in the
middle of the sequence. In almost all other situations, vector is more
efficient.
"

--
Bob R
POVrookie

Aug 10 '07 #13

BobR

aku ankka <ju***@liimatta.orgwrote in message...

On Aug 9, 8:52 pm, "T. Crane" <trevis.cr...@gmail.comwrote:

If the nColumns is same for all rows you are better off allocating one
array which contains nColumns * nRows objects.

double* array = new double[nColumns * nRows];

Indexing a single object:

// object at (x,y) is ..
double* object = array + nColumns * y + x;

And later:

delete [] array;

<G>

--
Bob R
POVrookie

Aug 10 '07 #14

Bo Persson

BobR wrote:
::
:: So, if you did something like:
::
:: std::vector<intVecInt;
:: VecInt.reserve( 100 ); // sets 'capacity', not 'size'
::
:: ... and then somewhere you did:
::
:: VecInt.push_back( 1 ); // this is the 101st push_back
::
:: ... you kind-of shot yourself in the foot (it needs to realloc and
:: copy). No big thing for small vectors, but costs time on very big
:: vectors. (....and you could run out of memory during the
:: realloc/copy).

If you know that you actually need 101 push_backs, you could do a
reserve(101) :-)

If you don't know, and the numbers are reasonably random, you will get
a good performance *on average*. Note that after "shooting yourself in
the foot" at 101, you get the next 99 push_backs for free. If you
reallocate at a-million-and-one, you get the next million or so
push_backs for free. That's cheap!
Bo Persson

Aug 10 '07 #15

T. Crane

On Aug 10, 2:47 pm, "Bo Persson" <b...@gmb.dkwrote:

BobR wrote:

::
:: So, if you did something like:
::
:: std::vector<intVecInt;
:: VecInt.reserve( 100 ); // sets 'capacity', not 'size'
::
:: ... and then somewhere you did:
::
:: VecInt.push_back( 1 ); // this is the 101st push_back
::
:: ... you kind-of shot yourself in the foot (it needs to realloc and
:: copy). No big thing for small vectors, but costs time on very big
:: vectors. (....and you could run out of memory during the
:: realloc/copy).

If you know that you actually need 101 push_backs, you could do a
reserve(101) :-)

If you don't know, and the numbers are reasonably random, you will get
a good performance *on average*. Note that after "shooting yourself in
the foot" at 101, you get the next 99 push_backs for free. If you
reallocate at a-million-and-one, you get the next million or so
push_backs for free. That's cheap!

Bo Persson

So I was curious to see which is faster -- pushing back onto a vector
or using direct index access of the elements. I declare a
std::vector<intv, setting the number of elements to vSize. I set
the values using a for-loop. Then, I declare two
std::vector<vector<int objects, m1 & m2. The first of these I
initialize with mSize elements, and the second I use the
std::vector::reserve method to claim mSize space. The using two for-
loops I populate m1 & m2 using direct index accessing of the elements
to set them equal to v, and then I use push_back to fill m2. I time
these two for-loops as well as the whole function. What I find
(perhaps not suprisingly) is that I fill m1 much, much faster, i.e.
push_back() with reserve() are SLOW. Anyway, here's the code.
Nothing to special.

#include <vector>
#include <iostream>
#include <iomanip>
#include <time.h>

using namespace std;

int main(){
time_t t0_0,t1_0,t2_0;
time_t t0_f,t1_f,t2_f;

t0_0 = time(NULL);
int vSize = 10000000;
int mSize = 10;

vector<intv(vSize);

for (int i=0;i<vSize;i++){v.at(i) = i;}

vector<vector<int m1(mSize);
vector<vector<int m2;
m2.reserve(mSize);
t1_0 = time (NULL);
for (int j=0;j<mSize;j++){m1.at(j) = v;}

t1_f = time(NULL);
t2_0 = time(NULL);

for (int i=0; i<mSize;i++){ m2.push_back(v);}

t2_f = time(NULL);
t0_f = time(NULL);
cout << "testTime0 = " << t0_f-t0_0 << endl;
cout << "testTime1 = " << t1_f-t1_0 << endl;
cout << "testTime2 = " << t2_f-t2_0 << endl;

return 0;
}

Aug 10 '07 #16

Frank Birbacher

Hi!

BobR schrieb:

Frank Birbacher <bl************@gmx.netwrote in message...
>In practise I'd prefer std::deque.
Frank

Depends on what you're doing.

As always. I'll try to explain myself.

[ note in old STL docs ]
"
[5] Vector is usually preferable to deque and list.

[snip]

Sure. I've got my advise from the "C++ Coding Standards" (Sutter and
Alexandrescu). They argu that all operations on std::deque are at least
as fast as the corresponding operation on std::vector in terms of
complexity theory. Paired with the advise not to optimize prematurely
they suggest to always start using std::deque. If you later discover
performance issues you can still switch to vector.

So I prefer std::deque. Your milage may vary.

Frank

Aug 10 '07 #17

BobR

Bo Persson <bo*@gmb.dkwrote in message...

>
If you don't know, and the numbers are reasonably random, you will get
a good performance *on average*. Note that after "shooting yourself in
the foot" at 101, you get the next 99 push_backs for free. If you
reallocate at a-million-and-one, you get the next million or so
push_backs for free. That's cheap!
Bo Persson

And if your current vector is using 2/3 of available 'real estate', and you
exceed '.capacity()', it ain't cheap no more! <G>

But then, you should have taken that old 80286 machine down to the salvage
yard years ago. :-}
[ never put computers in the normal trash! ].

--
Bob R
POVrookie

Aug 10 '07 #18

Bo Persson

T. Crane wrote:
:: On Aug 10, 2:47 pm, "Bo Persson" <b...@gmb.dkwrote:
::: BobR wrote:
:::
:::::
::::: So, if you did something like:
:::::
::::: std::vector<intVecInt;
::::: VecInt.reserve( 100 ); // sets 'capacity', not 'size'
:::::
::::: ... and then somewhere you did:
:::::
::::: VecInt.push_back( 1 ); // this is the 101st push_back
:::::
::::: ... you kind-of shot yourself in the foot (it needs to realloc
::::: and copy). No big thing for small vectors, but costs time on
::::: very big vectors. (....and you could run out of memory during
::::: the realloc/copy).
:::
::: If you know that you actually need 101 push_backs, you could do a
::: reserve(101) :-)
:::
::: If you don't know, and the numbers are reasonably random, you
::: will get a good performance *on average*. Note that after
::: "shooting yourself in the foot" at 101, you get the next 99
::: push_backs for free. If you reallocate at a-million-and-one, you
::: get the next million or so push_backs for free. That's cheap!
:::
::: Bo Persson
::
:: So I was curious to see which is faster -- pushing back onto a
:: vector or using direct index access of the elements. I declare a
:: std::vector<intv, setting the number of elements to vSize. I set
:: the values using a for-loop. Then, I declare two
:: std::vector<vector<int objects, m1 & m2. The first of these I
:: initialize with mSize elements, and the second I use the
:: std::vector::reserve method to claim mSize space. The using two
:: for- loops I populate m1 & m2 using direct index accessing of the
:: elements to set them equal to v, and then I use push_back to fill
:: m2. I time these two for-loops as well as the whole function.
:: What I find (perhaps not suprisingly) is that I fill m1 much, much
:: faster, i.e. push_back() with reserve() are SLOW. Anyway, here's
:: the code. Nothing to special.
::
:: #include <vector>
:: #include <iostream>
:: #include <iomanip>
:: #include <time.h>
::
:: using namespace std;
::
:: int main(){
:: time_t t0_0,t1_0,t2_0;
:: time_t t0_f,t1_f,t2_f;
::
:: t0_0 = time(NULL);
:: int vSize = 10000000;
:: int mSize = 10;
::
:: vector<intv(vSize);
::
:: for (int i=0;i<vSize;i++){v.at(i) = i;}
::
:: vector<vector<int m1(mSize);
:: vector<vector<int m2;
:: m2.reserve(mSize);
:: t1_0 = time (NULL);
:: for (int j=0;j<mSize;j++){m1.at(j) = v;}
::
:: t1_f = time(NULL);
:: t2_0 = time(NULL);
::
:: for (int i=0; i<mSize;i++){ m2.push_back(v);}
::
:: t2_f = time(NULL);
:: t0_f = time(NULL);
:: cout << "testTime0 = " << t0_f-t0_0 << endl;
:: cout << "testTime1 = " << t1_f-t1_0 << endl;
:: cout << "testTime2 = " << t2_f-t2_0 << endl;
::
:: return 0;
:: }

Strange!

I get about equal time for both versions. In release mode it in fact
runs so fast that I get 0-1 second for all results. Changing time() to
clock(), I get something like

testTime0 = 890
testTime1 = 422
testTime2 = 406

You don't run with iterator debugging enabled, or anything?
Bo Persson

Aug 10 '07 #19

T. Crane

On Aug 10, 4:44 pm, "Bo Persson" <b...@gmb.dkwrote:

T. Crane wrote:

:: On Aug 10, 2:47 pm, "Bo Persson" <b...@gmb.dkwrote:::: BobR wrote:

:::
:::::
::::: So, if you did something like:
:::::
::::: std::vector<intVecInt;
::::: VecInt.reserve( 100 ); // sets 'capacity', not 'size'
:::::
::::: ... and then somewhere you did:
:::::
::::: VecInt.push_back( 1 ); // this is the 101st push_back
:::::
::::: ... you kind-of shot yourself in the foot (it needs to realloc
::::: and copy). No big thing for small vectors, but costs time on
::::: very big vectors. (....and you could run out of memory during
::::: the realloc/copy).
:::
::: If you know that you actually need 101 push_backs, you could do a
::: reserve(101) :-)
:::
::: If you don't know, and the numbers are reasonably random, you
::: will get a good performance *on average*. Note that after
::: "shooting yourself in the foot" at 101, you get the next 99
::: push_backs for free. If you reallocate at a-million-and-one, you
::: get the next million or so push_backs for free. That's cheap!
:::
::: Bo Persson
::
:: So I was curious to see which is faster -- pushing back onto a
:: vector or using direct index access of the elements. I declare a
:: std::vector<intv, setting the number of elements to vSize. I set
:: the values using a for-loop. Then, I declare two
:: std::vector<vector<int objects, m1 & m2. The first of these I
:: initialize with mSize elements, and the second I use the
:: std::vector::reserve method to claim mSize space. The using two
:: for- loops I populate m1 & m2 using direct index accessing of the
:: elements to set them equal to v, and then I use push_back to fill
:: m2. I time these two for-loops as well as the whole function.
:: What I find (perhaps not suprisingly) is that I fill m1 much, much
:: faster, i.e. push_back() with reserve() are SLOW. Anyway, here's
:: the code. Nothing to special.
::
:: #include <vector>
:: #include <iostream>
:: #include <iomanip>
:: #include <time.h>
::
:: using namespace std;
::
:: int main(){
:: time_t t0_0,t1_0,t2_0;
:: time_t t0_f,t1_f,t2_f;
::
:: t0_0 = time(NULL);
:: int vSize = 10000000;
:: int mSize = 10;
::
:: vector<intv(vSize);
::
:: for (int i=0;i<vSize;i++){v.at(i) = i;}
::
:: vector<vector<int m1(mSize);
:: vector<vector<int m2;
:: m2.reserve(mSize);
:: t1_0 = time (NULL);
:: for (int j=0;j<mSize;j++){m1.at(j) = v;}
::
:: t1_f = time(NULL);
:: t2_0 = time(NULL);
::
:: for (int i=0; i<mSize;i++){ m2.push_back(v);}
::
:: t2_f = time(NULL);
:: t0_f = time(NULL);
:: cout << "testTime0 = " << t0_f-t0_0 << endl;
:: cout << "testTime1 = " << t1_f-t1_0 << endl;
:: cout << "testTime2 = " << t2_f-t2_0 << endl;
::
:: return 0;
:: }

Strange!

I get about equal time for both versions. In release mode it in fact
runs so fast that I get 0-1 second for all results. Changing time() to
clock(), I get something like

testTime0 = 890
testTime1 = 422
testTime2 = 406

You don't run with iterator debugging enabled, or anything?

Bo Persson

No. At least I don't think so. I'm still pretty new to the Visual
Studio IDE. When I ran the code shown here (in Visual Studio -- not a
release .exe), I got

testTime0 = 43
testTime1 = 0
testTime2 = 38

The units are all seconds.

Aug 10 '07 #20

Bo Persson

BobR wrote:
:: Bo Persson <bo*@gmb.dkwrote in message...
:::
::: If you don't know, and the numbers are reasonably random, you
::: will get a good performance *on average*. Note that after
::: "shooting yourself in the foot" at 101, you get the next 99
::: push_backs for free. If you reallocate at a-million-and-one, you
::: get the next million or so push_backs for free. That's cheap!
::: Bo Persson
::
:: And if your current vector is using 2/3 of available 'real
:: estate', and you exceed '.capacity()', it ain't cheap no more! <G>

If we are talking gigabytes of estimated storage, optimization might
no longer be premature. :-)

In this example, it is rather interesting that a vector<vector<double>

will work pretty much like a deque anyway. One level of indirection,

and suballocations for each row.
Bo Persson

Aug 10 '07 #21

BobR

T. Crane <tr**********@gmail.comwrote in message...

On Aug 10, 2:47 pm, "Bo Persson" <b...@gmb.dkwrote:

If you don't know, and the numbers are reasonably random, you will get
a good performance *on average*. Note that after "shooting yourself in
the foot" at 101, you get the next 99 push_backs for free. If you
reallocate at a-million-and-one, you get the next million or so
push_backs for free. That's cheap!

Bo Persson

So I was curious to see which is faster -- pushing back onto a vector
or using direct index access of the elements. I declare a
std::vector<intv, setting the number of elements to vSize. I set
the values using a for-loop. Then, I declare two
std::vector<vector<int objects, m1 & m2. The first of these I
initialize with mSize elements, and the second I use the
std::vector::reserve method to claim mSize space. The using two for-
loops I populate m1 & m2 using direct index accessing of the elements
to set them equal to v, and then I use push_back to fill m2. I time
these two for-loops as well as the whole function. What I find
(perhaps not suprisingly) is that I fill m1 much, much faster, i.e.
push_back() with reserve() are SLOW. Anyway, here's the code.
Nothing to special.

#include <vector>
#include <iostream>
#include <iomanip>
#include <time.h>

using namespace std;

int main(){
time_t t0_0,t1_0,t2_0;
time_t t0_f,t1_f,t2_f;

t0_0 = time(NULL);
int vSize = 10000000;
int mSize = 10;

vector<intv(vSize);

for (int i=0;i<vSize;i++){v.at(i) = i;}

vector<vector<int m1(mSize);
vector<vector<int m2;
m2.reserve(mSize);
t1_0 = time (NULL);
for (int j=0;j<mSize;j++){m1.at(j) = v;}

t1_f = time(NULL);
t2_0 = time(NULL);

for (int i=0; i<mSize;i++){ m2.push_back(v);}

t2_f = time(NULL);
t0_f = time(NULL);
cout << "testTime0 = " << t0_f-t0_0 << endl;
cout << "testTime1 = " << t1_f-t1_0 << endl;
cout << "testTime2 = " << t2_f-t2_0 << endl;

return 0;
}

[ my opinion ]
And your results may be false. Put the vector operations in functions, and
'call' them at least 1000 times, capturing the elapsed time each pass.
Examine the mean and deviations of the times.

If you are running on an OS, the OS can intefere with the times you got.
(memory management, other programs running (background), etc.)

I've run some time tests on vectors that iterated 100 times[1], and the
result fluctuated between 50 to 150 ticks ( std::clock_t() ) on multiple
runs.
V1 might take 50 ticks and V2 150 ticks on one run, and the times reverse on
the next run.

[1] - construct, fill, destruct cycle.
--
Bob R
POVrookie

Aug 10 '07 #22

Juha Nieminen

mi*******@gmail.com wrote:

Sorry to interrupt, but I have been using this method to initialize my
2D-Array for awhile

std::vector< std::vector <double array(nRow,
std::vector<double>(nCol, 0.0));

and fill it up index-wise, especially if the data is supposed to be
all over the place and not "column-wise" or "row-wise" insertion.

It works for me so far, however, being a newbie in programming, can
someone tell me what is the drawbacks of using this type of
initialization as opposed to methods suggested beforehand?

There's nothing wrong with that type of initialization. However,
it assumes that you know the dimensions in advance. If you don't
know them (which is not an unusual situation) then you obviously
can't allocate the space beforehand, but you have to rely on
dynamically increasing the size of the vectors as more and more
data is inserted into them.
If that is the case then in some situations it's more efficient
to use a std::deque instead of a std::vector, especially if copying
the elements is a slow operation.

Aug 11 '07 #23

Juha Nieminen

aku ankka wrote:

If the nColumns is same for all rows you are better off allocating one
array which contains nColumns * nRows objects.

double* array = new double[nColumns * nRows];

Why do it that way and risk a memory leak, instead of doing it like:

std::vector<doublearray(nColumns * nRows);

?

The std::vector is doing effectively the same thing, but it's safer.

Aug 11 '07 #24

aku ankka

On Aug 11, 11:45 am, Juha Nieminen <nos...@thanks.invalidwrote:

aku ankka wrote:
If the nColumns is same for all rows you are better off allocating one
array which contains nColumns * nRows objects.

double* array = new double[nColumns * nRows];

Why do it that way and risk a memory leak, instead of doing it like:

std::vector<doublearray(nColumns * nRows);

?

The std::vector is doing effectively the same thing, but it's safer.

That's pretty irrelevant for my interests, the point is that vector of
vector's isn't what *I* would do for 2d array (most of the time).

Aug 11 '07 #25

Juha Nieminen

aku ankka wrote:

> Why do it that way and risk a memory leak, instead of doing it like:

std::vector<doublearray(nColumns * nRows);

?

The std::vector is doing effectively the same thing, but it's safer.

That's pretty irrelevant for my interests, the point is that vector of
vector's isn't what *I* would do for 2d array (most of the time).

Code safety is pretty irrelevant for you? I sure hope you are not
programming professionally.

Aug 11 '07 #26

James Kanze

On Aug 9, 7:52 pm, "T. Crane" <trevis.cr...@gmail.comwrote:

I'm struggling with how to initialize a vector<vector<double>>
object. I'm pulling data out of a file and storing it in the
vector<vector<double>object. Because any given file will have a
large amount of data, that I read off using an ifstream object, I
don't want to use the push_back method because this grows the
vector<vector<double>dynamically, and that will kill my execution
time. So, I want to reserve space first, using, of course the reserve
method. However, I'm not sure what the best way of doing this is.
Here's what I am thinking of doing (I don't even know if this will
work) but if there's a better way to do it, I'm all ears:

#include <vector>

int nColumns = 10;
int nRows = 15;

vector<vector<double>myData;

myData.reserve(nRows);

for (int i;i<nRows;i++){
myData[i].reserve(nColumns);
}

Is there some reason you don't write just:

std::vector< std::vector< double
myData( nRows,
std::vector< double >( nColumns ) ) ;

This will actually initialize all of the data with 0.0, which
you might not need, but it avoids the need of push_back
entirely.

--
James Kanze (GABI Software) email:james.ka...@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34

Aug 11 '07 #27

aku ankka

On Aug 11, 3:09 pm, Juha Nieminen <nos...@thanks.invalidwrote:

The std::vector is doing effectively the same thing, but it's safer.

That's pretty irrelevant for my interests, the point is that vector of
vector's isn't what *I* would do for 2d array (most of the time).

Code safety is pretty irrelevant for you? I sure hope you are not
programming professionally.

No, discussing it with _you_ is irrelevant for me, because that
doesn't touch the point I was making: using vector of vectors for
presenting 2d array is lame.

I'm sure a smart guy like you can see the difference.

Aug 11 '07 #28

aku ankka

That's pretty irrelevant for my interests, the point is that vector of

vector's isn't what *I* would do for 2d array (most of the time).

Try reading that posting again.

It adds irrelevant detail, which might be interesting if array vs.
vector were the topic. If I wanted to add more information into the
original posting I would have ranted ten pages on the virtues of
different containers, but I wasn't posting about that were I?

Nothing wrong people suggesting deques and what not as "optimization"
for such trivial task? Pffff.

Aug 11 '07 #29

BobR

Bo Persson <bo*@gmb.dkwrote in message...

::
:: And if your current vector is using 2/3 of available 'real
:: estate', and you exceed '.capacity()', it ain't cheap no more! <G>

If we are talking gigabytes of estimated storage, optimization might
no longer be premature. :-)

Never thought I'd see the day that someone would say that in this NG, even
in jest. <G>

--
Bob R
POVrookie

Aug 11 '07 #30

Juha Nieminen

aku ankka wrote:

No, discussing it with _you_ is irrelevant for me, because that
doesn't touch the point I was making: using vector of vectors for
presenting 2d array is lame.

I'm sure a smart guy like you can see the difference.

The problem is that by using 'new' instead of a std::vector you
are teaching bad habits. The exact same idea of using a 1-dimensional
vector to represent a 2-dimensional vector could have been done with
a std::vector, and it would have been a much better example.
In the worst case the original poster might get the idea that using
'new' instead of std::vector is a better solution.

Aug 12 '07 #31

aku ankka

On Aug 12, 8:53 pm, Juha Nieminen <nos...@thanks.invalidwrote:

aku ankka wrote:
No, discussing it with _you_ is irrelevant for me, because that
doesn't touch the point I was making: using vector of vectors for
presenting 2d array is lame.

I'm sure a smart guy like you can see the difference.

The problem is that by using 'new' instead of a std::vector you
are teaching bad habits. The exact same idea of using a 1-dimensional
vector to represent a 2-dimensional vector could have been done with
a std::vector, and it would have been a much better example.
In the worst case the original poster might get the idea that using
'new' instead of std::vector is a better solution.

That is still a topic I am not interested in discussing with you. He
can use whatever he sees fit - the optimization has been done; now it
is issue of style, safety, yada yada yadayaa, on and on.

Aug 12 '07 #32

mistabean

On 11 Aug., 10:41, Juha Nieminen <nos...@thanks.invalidwrote:

mistab...@gmail.com wrote:
Sorry to interrupt, but I have been using this method to initialize my
2D-Array for awhile

std::vector< std::vector <double array(nRow,
std::vector<double>(nCol, 0.0));

and fill it up index-wise, especially if the data is supposed to be
all over the place and not "column-wise" or "row-wise" insertion.

It works for me so far, however, being a newbie in programming, can
someone tell me what is the drawbacks of using this type of
initialization as opposed to methods suggested beforehand?

There's nothing wrong with that type of initialization. However,
it assumes that you know the dimensions in advance. If you don't
know them (which is not an unusual situation) then you obviously
can't allocate the space beforehand, but you have to rely on
dynamically increasing the size of the vectors as more and more
data is inserted into them.
If that is the case then in some situations it's more efficient
to use a std::deque instead of a std::vector, especially if copying
the elements is a slow operation.

ahaaa.. so nothing wrong with that one then? In my programs that I am
currently doing, nRow and nCol are always calculated during runtime,
depending on what type of datas that I have.

And I found it much easier to debug, as opposed to push_back()'s. Of
course, it is also a matter of me always forgetting to clear(), when I
am doing the filling-up over and over again, and the std::vector is
either a return parameter, or declared outside the loop.

And have to read up more about std::deque, since I have one one-
dimensional std::vector that is initialized at start-up with an
initial size, but will be growing during runtime. Thanks for the
advice.

Aug 13 '07 #33

help with vector<vector<double>>

Similar topics