By using this site, you agree to our updated Privacy Policy and our Terms of Use. Manage your Cookies Settings.
435,493 Members | 3,101 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 435,493 IT Pros & Developers. It's quick & easy.

Arrays vs Vector Help

P: n/a
Ben
I have a program which is using a lot of memory.

At the moment I store a lot of pointers to objects in std::vector.
(millions of them)

I have three questions:

1) Lets say the average Vector is of size 2. How much memory can I
save by storing my pointers in c++ arrays, rather than vectors.

2) I looked into using boost::array to save memory, but you need to
know the size at compile time, as opposed to run time. I want to
number of elements in the array to be configurable at run time, but I
don't need to be able to resize my vectors. This seems to rule our
boost::array, since it is templatized on size. Are there any other
ideas for array classed I can use to save memory over std::vector?

3) If I just use a standard C++ array, how do I create an array of
pointers on the fly. An array of Objects is easy:

array = new Object[size];
How do I do the same for an array of pointers to Objects?

Thanks,
ben

Oct 6 '05 #1
Share this Question
Share on Google+
13 Replies


P: n/a
* Ben:
I have a program which is using a lot of memory.

At the moment I store a lot of pointers to objects in std::vector.
(millions of them)

I have three questions:

1) Lets say the average Vector is of size 2. How much memory can I
save by storing my pointers in c++ arrays, rather than vectors.
Depends on (A) what you're doing with the vectors, (B) the C++ implementation
and (C) how you'd implement the use of raw arrays.

Regarding (1), if any vector you're using can vary in size, as your question 2
below indicates _may_ be the case, it might be gobbling up memory.

Use something like

template< typename T >
void clearAll( typename std::vector<T>& v )
{
std::vector<T> empty;
empty.swap( v );
}

to get rid of overly large buffers (they only increase, never decrease) inside
the vectors.
2) I looked into using boost::array to save memory, but you need to
know the size at compile time, as opposed to run time. I want to
number of elements in the array to be configurable at run time, but I
don't need to be able to resize my vectors. This seems to rule our
boost::array, since it is templatized on size. Are there any other
ideas for array classed I can use to save memory over std::vector?
Don't know; I'd just roll my own, but after careful consideration.

3) If I just use a standard C++ array, how do I create an array of
pointers on the fly. An array of Objects is easy:

array = new Object[size];
How do I do the same for an array of pointers to Objects?


I don't understand how you can not know this, but anyway:

typedef Object* ObjectPtr;

ObjectPtr* array = new ObjectPtr[size];

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Oct 6 '05 #2

P: n/a
Ben wrote:
I have a program which is using a lot of memory.

At the moment I store a lot of pointers to objects in std::vector.
(millions of them)

I have three questions:

1) Lets say the average Vector is of size 2. How much memory can I
save by storing my pointers in c++ arrays, rather than vectors.
Typical vector size would be 12 to 16 bytes.

Typical 'array' size (I assume you mean dynamic array) would be (say) 6
bytes, 4 bytes for a pointer and two bytes to save the size.

In both cases these sizes are exclusive of the actual data stored.

From the above I assume you have 'millions'/2 arrays or vectors. Which
means you would save 3 to 5 'millions' of bytes.

2) I looked into using boost::array to save memory, but you need to
know the size at compile time, as opposed to run time. I want to
number of elements in the array to be configurable at run time, but I
don't need to be able to resize my vectors. This seems to rule our
boost::array, since it is templatized on size. Are there any other
ideas for array classed I can use to save memory over std::vector?
So all your vectors or arrays are of fixed size, but that size is not
known until run time. Is that correct? Combining all the millions of
vectors/arrays into one large vector would seem to be the obvious
solution. Wrap that single large vector into a handy class and Bob's
your uncle.

3) If I just use a standard C++ array, how do I create an array of
pointers on the fly. An array of Objects is easy:

array = new Object[size];
How do I do the same for an array of pointers to Objects?
Exactly the same syntax, just substitute Object* for Object.

array = new Object*[size];

Of course this just creates the array of pointers, it does not make any
of the pointers themselves, for that you need

for (int i = 0; i < size; ++i)
array[i] = new Object;
Thanks,
ben


john
Oct 6 '05 #3

P: n/a
Ben wrote:
I have a program which is using a lot of memory.

At the moment I store a lot of pointers to objects in std::vector.
(millions of them)
Millions in a single std::vector or millions in millions of std::vectors?
I have three questions:

1) Lets say the average Vector is of size 2.
What's a Vector? Do you mean std::vector?
How much memory can I
save by storing my pointers in c++ arrays, rather than vectors.
Unknown. What's a Vector?
2) I looked into using boost::array to save memory, but you need to
know the size at compile time, as opposed to run time. I want to
number of elements in the array to be configurable at run time, but I
don't need to be able to resize my vectors. This seems to rule our
boost::array, since it is templatized on size. Are there any other
ideas for array classed I can use to save memory over std::vector?
No. std::vector is the best choice for your parameters: size is known
at run time, but doesn't change. Make sure you define your vectors with
the argument when constructing, or 'reserve' right away.
3) If I just use a standard C++ array, how do I create an array of
pointers on the fly. An array of Objects is easy:

array = new Object[size];
How do I do the same for an array of pointers to Objects?


Object **array = new Object*[size];

Why do you need to keep pointers? Is 'Object' a base class for some
hierarchy of classes? How do you create individual objects? Is there
a good reason to have collections of pointers instead of collections
of objects?

V
Oct 6 '05 #4

P: n/a
Ben
>No. std::vector is the best choice for your parameters: size is known
at run time, but doesn't change. Make sure you define your vectors with
the argument when constructing, or 'reserve' right away.


Is std::vector<Object*> really as memory efficient as Object **array =
new Object*[size]; ?

Oct 6 '05 #5

P: n/a
Ben
> From the above I assume you have 'millions'/2 arrays or vectors. Which
means you would save 3 to 5 'millions' of bytes.


yes, that is what I mean.

Oct 6 '05 #6

P: n/a
Ben wrote:
From the above I assume you have 'millions'/2 arrays or vectors. Which
means you would save 3 to 5 'millions' of bytes.

yes, that is what I mean.


But you would save even more by cutting the overhead of using millions
of arrays or vector and instead use one large array or vector.

And also by using a single vector or array the difference in overhead
becomes irrelevant, so I would use a vector.

john
Oct 6 '05 #7

P: n/a

Ben skrev:
No. std::vector is the best choice for your parameters: size is known
at run time, but doesn't change. Make sure you define your vectors with
the argument when constructing, or 'reserve' right away.


Is std::vector<Object*> really as memory efficient as Object **array =
new Object*[size]; ?


No. As John Harrison you will spend a few bytes more (on a 32-bit
compiler four bytes as i assume the size of the vector will be int).
One significand saving will be if you need only one vector and not
millions of them (and that should not be difficult to accomplish). One
other huge saving in space would be if you could have a vector of
objects rather than a vector of pointers. If this is feasible depends
on your application - if your object is movable and your objects are
not polymorphics that should do it.

/Peter

Oct 6 '05 #8

P: n/a
Ben wrote:
No. std::vector is the best choice for your parameters: size is known
at run time, but doesn't change. Make sure you define your vectors with
the argument when constructing, or 'reserve' right away.


Is std::vector<Object*> really as memory efficient as Object **array =
new Object*[size]; ?


The "obvious" implementation of vector looks somewhat like this:
template < typename T >
class vector {

T* data;
std::size_t length;
std::size_t capacity;

public:

...

};
So this is reasonably small (I would venture the conjecture that both fields
are 4 bytes on many machines). Thus a vector adds a pretty small overhead
to the data that you store inside.

However, your allocator might actually grab memory in chunks (say multiples
of 16 or even 32 bytes), in which case some 4 or 20 bytes might be wasted
per vector. The use of a custom allocator, however, should allow you to
avoid this.

Also note: if you use T* instead of std::vector<T> you will usually need to
store the length of the array somewhere since otherwise you cannot iterate
through -- you need to know when to stop. The only true overhead added by
std::vector<T> is the capacity field.

However in the case of the OP, it appears that there are a lot of vectors
all of the same length, which could therefore better be stored outside,
whose lengths do not change, whence a capacity does not need to be kept
around. If the length of the vectors is small (like 2 to 4), one could
probably cut memory consumption in half by ditching std::vector<T*> for
T**.
Best

Kai-Uwe Bux
Oct 6 '05 #9

P: n/a
Ben
I should be more specific.

I have data in pairs (vectors of size 2), or triplets (vectors of size
3), or maybe more which I need to keep separate, since I keep
statistics on how often each pair or triplet appears in some input
data. I cant just store them in a big vector, since I need to keep
track of how often each of these data points appear. So I have a sort
of Histogram, for each vector and a count of how often it appears in
input. The problem is the number of possible vectors is HUGE. So,
using the std::vector class has some major over head.

I also need them to be pointers for polymorphism purposes.
Thanks again,
Ben

Oct 7 '05 #10

P: n/a
Ben wrote:
I should be more specific.

I have data in pairs (vectors of size 2), or triplets (vectors of size
3), or maybe more which I need to keep separate, since I keep
statistics on how often each pair or triplet appears in some input
data. I cant just store them in a big vector, since I need to keep
track of how often each of these data points appear.


I think you are wrong here. Just because your data is stored in a big
vector does not mean that you cannot process the individual pieces of
data seperately. I think you're confusing the features of your problem
(i.e. pairs, triplets) with the implementation (i.e. because you have
pairs in your problem you must have pairs in your implementation).

In fact reading the above, a vector for all the pairs and another vector
for all the triplets and so on seems obvious and efficient. Just use an
offset into the big vector of pairs to represent each pair, an offset
into the vector of triplets to represent triplet and so on.

john
Oct 7 '05 #11

P: n/a
"Ben" <be************@gmail.com> schrieb im Newsbeitrag
news:11**********************@z14g2000cwz.googlegr oups.com...
I should be more specific.

I have data in pairs (vectors of size 2), or triplets (vectors of size
3)


Vectors of size 3 could have another overhead if you did not use reserve().
Some implementations would have space for 4 elements.
But this isn't a problem if you follow the suggestions of the other posts
and make one or two large vectors.

Greets Chris
Oct 7 '05 #12

P: n/a
Ben
But the best way to keep track of this kind of thing is a hash, right?

So i need individual small elements to hash.

When a data point comes in, I need to know what object type to
increment (as being seen) and the best way to do that is a hash.

If everything is in one vector, I would not be able to hash.

Oct 7 '05 #13

P: n/a
Ben wrote:
But the best way to keep track of this kind of thing is a hash, right?

So i need individual small elements to hash.

When a data point comes in, I need to know what object type to
increment (as being seen) and the best way to do that is a hash.

If everything is in one vector, I would not be able to hash.


Yes you can, you hash an offset into the big vector, not the object itself.

If you have all the objects in a big vector then an offset into that
vector can always stand for the object itself.

john
Oct 7 '05 #14

This discussion thread is closed

Replies have been disabled for this discussion.