473,385 Members | 1,461 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

stl vector/memory question

Hi,

I'm working on a scheme to store several millions of records (Subdata),
each of which contains a variable number of sub records (Celldata). My concern is that
my prototype uses a lot more memory than I'm expecting. To store 1.7million Subdatas
that have an average of 7 Celldatas, I'm using 85 Mb. Thats about 50 bytes/record, where as
actual data is about 14 bytes/record (average).

Any insight is greatly appreciated!
Ken

Cliffnoted code is as follows:

class Celldata
{
public:
short cellnum:10,
time_stamp:6;

Celldata(int, int);
};
class Subdata
{
public:
vector<Celldata> cells;

int add_cell(int cell, int time);
};

class Esn_global
{
public:

vector<Subdata> sublists;
}
int
Subdata::add_cell( int cell, int time)
{
Celldata newcell(cell, time);
cells.push_back(newcell);
return 1;
}
Esn_global g_esn;

main()
{
for (loop forever)
{
-read record containing subscriber identifier and cell
- add cell to subcribers list (if not already on list)
}

}
I've trimmed some of the code out, there is another structure that takes a subscriber's
identifier (phone number) and returns an index into g_esn.sublists that corresponds to that users
data.

I'm wondering if I should be making new class instances with a 'new' and then pushing onto my
vectors?

I've also played around with resizing the vectors by 2 entries when they are full to prevent them from
doubling in size when they fill; this didn't seem to be a benefit....

I was pushing all the instances onto vectors to avoid any memory leaking and also thought it would be better
for daily audits of the data. Maybe this is my problem?

thanks

Jul 22 '05 #1
5 2285
Kenneth W Del Signore <kw**@lucent.com> wrote in
news:41**************@lucent.com:
Hi,

I'm working on a scheme to store several millions of records
(Subdata),
each of which contains a variable number of sub records (Celldata).
My concern is that my prototype uses a lot more memory than I'm
expecting. To store 1.7million Subdatas that have an average of 7
Celldatas, I'm using 85 Mb. Thats about 50 bytes/record, where as
actual data is about 14 bytes/record (average).
[snip]
I've also played around with resizing the vectors by 2 entries when
they are full to prevent them from doubling in size when they fill;
this didn't seem to be a benefit....
Note that when one resizes a vector, it is not required that the vector
have the exact capacity that you specify, it may allocate more memory than
you specify. So if you .resize(2), the vector may actually have allocated
memory to hold 16.

Reminder that .reserve and .resize don't do the same thing....
I was pushing all the instances onto vectors to avoid any memory
leaking and also thought it would be better for daily audits of the
data. Maybe this is my problem?


I suppose if you want stricter control, you might try std::list ?
Jul 22 '05 #2
Andre Kostur wrote:
I suppose if you want stricter control, you might try std::list ?


With 12 bytes per record, a list isn't about to solve the OP's memory
woes, except maybe with a little list unrolling. Dealing with all
operations on unrolled lists is complicated, but if there isn't an
unrolled list template out there there should be.
--
Derrick Coetzee
I grant this newsgroup posting into the public domain. I disclaim all
express or implied warranty and all liability. I am not a professional.
Jul 22 '05 #3
Kenneth W Del Signore wrote:

I'm working on a scheme to store several millions of records (Subdata),
each of which contains a variable number of sub records (Celldata). My concern is that
my prototype uses a lot more memory than I'm expecting. To store 1.7million Subdatas
that have an average of 7 Celldatas, I'm using 85 Mb. Thats about 50 bytes/record, where as
actual data is about 14 bytes/record (average).
[...]

class Celldata
{
public:
short cellnum:10,
time_stamp:6;

Celldata(int, int);
};

class Subdata
{
public:
vector<Celldata> cells;

int add_cell(int cell, int time);
};

class Esn_global
{
public:

vector<Subdata> sublists;
}

[...]

Let's assume pointer and integer size is 4 for the sake of calculations.

First, look at the dynamic part of
vector<Subdata> sublists;

You didn't account for
1.7 million * sizeof(Subdata)

where
sizeof(Subdata) is typically sizeof(vector<Celldata>) which
will possibly be 12 bytes (a vector needs to know where its reserved
storage begins and ends, as well as how many elements are actually used).

That's not the end of it yet. Unless you reserve the right number of
elements for sublists initially, the storage will need to be reallocated,
possibly multiple times. Reallocation of a large block may leave an
unused region not big enough for larger size allocation requests but
too big to be used up by all subsequently allocated small objects
combined. In addition, unless perhaps you know the element count
beforehand, the reserved storage of sublists will be larger than
the used part.

Now, the dynamic part of vector<Celldata> cells:
Add the overhead of heap allocation performed by the vector allocator
to what Andre has mentioned.

All things combined, 50 bytes per Subdata looks reasonable to me.
All of the above, of course, are just guesses. I would first find out
which part of that is actually the most problematic, then optimize
accordingly.

For example, for vector<Celldata>, you /might/ want to use some sort
of a small object allocator (instead of the default one).

For vector<Subdata> sublists, you /might/ benefit from an allocator
using its own dedicated heap. Alternatively, you could implement and
use a different kind of container (which, e.g., would allocate its
elements in pages). It may also prove totally unnecessary depending
on the results of the profiling.

Denis
Jul 22 '05 #4
"Denis Remezov" <fi***************@yahoo.removethis.ca> wrote in message
news:41***************@yahoo.removethis.ca...
For vector<Subdata> sublists, you /might/ benefit from an allocator
using its own dedicated heap. Alternatively, you could implement and
use a different kind of container (which, e.g., would allocate its
elements in pages).

Such a container exists in the standard library: it's called std::deque,
and offers pretty much the same interface as std::vector.
std::deque<Subdata> is likely to be a good alternative here.

hth -Ivan
--
http://ivan.vecerina.com/contact/?subject=NG_POST <- e-mail contact form
Brainbench MVP for C++ <> http://www.brainbench.com

Jul 22 '05 #5
Ivan Vecerina wrote:

"Denis Remezov" <fi***************@yahoo.removethis.ca> wrote in message
news:41***************@yahoo.removethis.ca...
For vector<Subdata> sublists, you /might/ benefit from an allocator
using its own dedicated heap. Alternatively, you could implement and
use a different kind of container (which, e.g., would allocate its
elements in pages).

Such a container exists in the standard library: it's called std::deque,
and offers pretty much the same interface as std::vector.
std::deque<Subdata> is likely to be a good alternative here.


Thanks for noting that. My only gripe is that the standard deque doesn't
provide you any way of controlling the "page" (or "node") size, but it might
be in vain (for many cases including this, it's probably not at all critical,
and the default may be just good enough).

Denis
Jul 22 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

10
by: Stefan Höhne | last post by:
Hi, as I recon, std::vector::clear()'s semantics changed from MS VC++ 6.0 to MS' DOT.NET - compiler. In the 6.0 version the capacity() of the vector did not change with the call to...
9
by: luigi | last post by:
Hi, I am trying to speed up the perfomance of stl vector by allocating/deallocating blocks of memory manually. one version of the code crashes when I try to free the memory. The other version...
9
by: fudmore | last post by:
Hello Everybody. I have a Segmentation fault problem. The code section at the bottom keeps throwing a Segmentation fault when it enters the IF block for the second time. const int...
3
by: kayjean | last post by:
Hi. I have a small test.cc >> g++ -o test test.cc (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) >> phase 1 (VSZ RSS 50844 47748 ) >> phase 2 (VSZ RSS 50844 47748 ) >> phase 3 (VSZ RSS...
9
by: kathy | last post by:
I am using std::vector in my program: func() { std::vector <CMyClass *> vpMyClass; vpMyClass.push_back(new CMyClass()); vpMyClass.push_back(new CMyClass()); vpMyClass.push_back(new...
3
by: Jeff | last post by:
I have a question about potential memory leakage in a 2 dim array I have. If I say: vector< vector<int> > vec; vec.push_back( vector<int>() ); // initialize the row vec.push_back(1);...
6
by: zl2k | last post by:
hi, there I am using a big, sparse binary array (size of 256^3). The size may be changed in run time. I first thought about using the bitset but found its size is unchangeable. If I use the...
6
by: madhu | last post by:
vector<vector<vector<long Vector3D; // 3dvector. for (long k = 0; j < Depth; j++ ) { Vector3D.push_back ( vector<vector<A_Type() ); for (long j = 0; j < Height; j++ ) { Vector3D.push_back (...
9
by: Jess | last post by:
Hello, I tried to clear a vector "v" using "v.clear()". If "v" contains those objects that are non-built-in (e.g. string), then "clear()" can indeed remove all contents. However, if "v"...
2
by: zl2k | last post by:
hi, all I need to use gsl_vector pointer with std::vector but not sure how to free the memory when I don't need it. Here is a piece of the code. =================== std::vector<gsl_vector *...
0
by: Faith0G | last post by:
I am starting a new it consulting business and it's been a while since I setup a new website. Is wordpress still the best web based software for hosting a 5 page website? The webpages will be...
0
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 3 Apr 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome former...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.