Illogical std::vector size?

Hi,

First some background.

I have a structure,

struct sFileData
{
  char* sSomeString1;
  char* sSomeString2;
  int iSomeNum1;
  int iSomeNum2;
  sFileData(){...};
  ~sFileData(){...};
  sFileData(const sFileData&){...};
  const sFileData operator=( const sFileData &s ){...}
};

std::vector< sFileData, std::allocator< sFileData > > address_;

For the sake of simplicity I have removed the bodies of the 'tors.
I have no memory leaks as far as I can tell.

Then I read a file, (each line is 190 chars mostly blank spaces).
In each line I 'read' info to fill in the structure.

Because there are so many blank spaces in the line I make sure that my
data is 'trimmed'.

So in effect sSomeString1 and sSomeString2 are never more than 10 chars,
(although in the file they could be up to 40 chars).

I chose vectors because after reading the file I need to do searches of
sSomeString1 and sSomeString2, (no other reasons really).

But my problem is the size of address_ is not consistent with the size of
the file.

The file is around 13Mb with around 100000 'lines' of 190 chars each.
Because I remove blank spaces and convert 2 numbers to int (from char), I
guess I should not use more than half, 5Mb.

But after loading I see that I used around 40Mb, (3 times more than the
original size).

As far as I can tell you cannot really tell the size of a vector, but I use
Windows and the Task Manager and I can see the size of my app before and
after reading the file (I do nothing else).

So what could be the reason for those inconsistencies ?
How could I optimize my code to compress those 40mb even more?

Many thanks

Simon

Jul 23 '05 #1
simon wrote:
<snip original post>

1) Windows Task Manager is not suited for this
2) vector only stores sFileData objects, not the strings themselves
3) Even when vector has excess size (which is common, you don't want to
   reallocate after each push_back) it won't include the strings
4) Many implementations of new[] allocate at least 16 bytes, plus
   the overhead needed for delete[]
5) So what? 40MB is not a lot. Worry when it exceeds 1.5Gb. Memory
   is cheap. Writing a custom string class is not. BTDT.
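
Point 4 can be made concrete with a small model. The sketch below is not from the thread: `heap_string_bytes`, `granule` and `header` are hypothetical names, and the rounding rule is only an assumption about how a typical allocator behaves, not a description of any particular runtime.

```cpp
#include <cassert>
#include <cstddef>

// Rough cost of one heap-allocated C string of `length` characters,
// ASSUMING the allocator rounds each request up to a multiple of `granule`
// bytes and adds `header` bytes of bookkeeping per block.
std::size_t heap_string_bytes(std::size_t length,
                              std::size_t granule,
                              std::size_t header) {
    assert(granule > 0);
    std::size_t request = length + 1;  // characters plus terminating '\0'
    std::size_t rounded = (request + granule - 1) / granule * granule;
    return rounded + header;
}
```

With an assumed 16-byte granule and 8-byte header, `heap_string_bytes(10, 16, 8)` returns 24. Under those assumptions a trimmed 10-character string costs more than twice its length, but still only tens of bytes.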

HTH,
Michiel Salters

Jul 23 '05 #2
> 1) Windows Task Manager is not suited for this

Yeah, but it was what raised suspicion in the first place.
What might be better?

> 2) vector only stores sFileData objects, not the strings themselves
> 3) Even when vector has excess size (which is common, you don't want to
>    reallocate after each push_back) it won't include the strings
> 4) Many implementations of new[] allocate at least 16 bytes, plus
>    the overhead needed for delete[]

Are you saying that std::string might actually be better in that case?
What might be a better way?

> 5) So what? 40MB is not a lot. Worry when it exceeds 1.5Gb. Memory
>    is cheap. Writing a custom string class is not. BTDT.

40Mb or 4Gb, there is still something not quite right, and I would prefer
to know what it is rather than brushing it under the rug.

> HTH,
> Michiel Salters

Jul 23 '05 #3
> > 1) Windows Task Manager is not suited for this
>
> yea, but it was what raised suspicion in the first place.
> What might be better?

What suspicion? It only tells you that the total program is now using 40 MB,
not that this particular part of your program is using 40 MB.
Or, if it is a change in the total amount of memory used, there could be
a near infinite number of other reasons that the program is now using
40 MB instead of the expected 5 MB increase.
You'd need a profiler to check if it is indeed the vector of structs
that is the problem.
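
Short of a real profiler, one way to see how much a section of code requests from the heap is to count inside replacement global `operator new` and `operator delete`. What follows is a minimal sketch of that idea, not something posted in the thread. The counter names and `bytes_since` are made up for illustration. It only tallies bytes requested through `new` and `new[]`, so allocator overhead and direct `malloc` calls are not included.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Running totals of what the program has asked the heap for via new/new[].
static std::size_t g_bytes_requested = 0;
static std::size_t g_allocations = 0;

// Replacement global allocation function. The default operator new[]
// forwards to this one, so array allocations are counted as well.
void* operator new(std::size_t n) {
    g_bytes_requested += n;
    ++g_allocations;
    if (void* p = std::malloc(n ? n : 1)) return p;
    throw std::bad_alloc();
}

// Matching deallocation functions, since the memory came from malloc.
void operator delete(void* p) noexcept { std::free(p); }
void operator delete(void* p, std::size_t) noexcept { std::free(p); }

// Bytes requested since an earlier snapshot of g_bytes_requested.
std::size_t bytes_since(std::size_t snapshot) {
    assert(snapshot <= g_bytes_requested);
    return g_bytes_requested - snapshot;
}
```

Taking a snapshot of `g_bytes_requested` just before the file-reading loop and calling `bytes_since` right after it gives a number that belongs to that loop alone, which Task Manager cannot provide.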

Jul 23 '05 #4

> > > 1) Windows Task Manager is not suited for this
> >
> > yea, but it was what raised suspicion in the first place.
> > What might be better?
>
> What suspicion?

That I was doing something wrong or that I did not understand something
else.

> It only tells that the total program is now using 40 MB
> not that this particular part of your program is using 40 MB.
> Or if it is a change in the total amount of memory used there could be
> a near infinite number of other reasons that the program is now using
> 40 MB instead of the expected 5 MB increase.
> You'd need a profiler to check if it is indeed the vector of structs
> that is the problem.

I placed a breakpoint before reading the file, checked the memory, and placed
another one after reading the file.
I then compared the before and after.
The odds of it being something else are fairly small, IMO.

Simon
Jul 23 '05 #5
20 MB of file in memory is a good starter to explain away the
discrepancy.
You need something better than two breakpoints and the Task Manager to
make statements about what uses the memory.

Jul 23 '05 #6
"simon" <sp********@myoddweb.com> wrote in message
news:3h************@individual.net
<snip original post>

I think your problem has nothing to do with the vector. As has already been
pointed out, the vector doesn't store the characters, only the pointers. With
VC++, sizeof(sFileData) is 16. The memory used by the vector should be
16*address_.capacity() plus a small amount of overhead, which we can
approximate with sizeof(address_). Try this:

#include <vector>
#include <iostream>
using namespace std;

struct sFileData
{
  char* sSomeString1;
  char* sSomeString2;
  int iSomeNum1;
  int iSomeNum2;
  // The bodies are stubs: only the size of the object matters for this test.
  sFileData() : sSomeString1(0), sSomeString2(0), iSomeNum1(0), iSomeNum2(0) {}
  ~sFileData(){}
  // Shallow copy is safe here because the destructor frees nothing.
  sFileData(const sFileData& s)
    : sSomeString1(s.sSomeString1), sSomeString2(s.sSomeString2),
      iSomeNum1(s.iSomeNum1), iSomeNum2(s.iSomeNum2) {}
  sFileData& operator=( const sFileData& ){ return *this; }
};

std::vector< sFileData, std::allocator< sFileData > > address_;

int main()
{
  sFileData sfd;
  for(int i=0; i<100000; ++i)
    address_.push_back(sfd);
  // capacity(), not size(), is what the vector has actually allocated
  cout << "storage size of vector is approx ";
  cout << sizeof(address_)+sizeof(sFileData)*address_.capacity() << endl;
  return 0;
}

When I run this, I get

storage size of vector is approx 2212100

and Task Manager similarly shows about a 2Mb increase in memory usage.
Accordingly, it seems that the other 38Mb is due to whatever else you are
doing to allocate memory for the characters, unless, as someone else
suggested, you are reading the whole file into memory and not taking that
into account.
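
The figure of 2212100 bytes is well above 100000 * 16 = 1600000, which fits the earlier remark that a vector usually holds spare capacity after repeated push_back calls. The sketch below shows one way to observe that effect. `Record` and `capacity_after_fill` are hypothetical names, and the exact capacities depend on the standard library in use, so none are quoted here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the struct in the thread: two pointers, two ints.
struct Record {
    char* s1;
    char* s2;
    int n1;
    int n2;
};

// Fill a vector with `count` records, optionally calling reserve() first,
// and report the capacity the vector ends up with.
std::size_t capacity_after_fill(std::size_t count, bool reserve_first) {
    std::vector<Record> v;
    if (reserve_first) v.reserve(count);  // one allocation up front
    Record r = {0, 0, 0, 0};
    for (std::size_t i = 0; i < count; ++i) v.push_back(r);
    assert(v.size() == count);
    return v.capacity();
}
```

Comparing `capacity_after_fill(100000, false)` with `capacity_after_fill(100000, true)` shows how much of the vector's buffer is slack. Either way the difference is on the order of a megabyte here, nowhere near 38Mb.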
--
John Carson

Jul 23 '05 #7
simon wrote:
> > 1) Windows Task Manager is not suited for this
>
> yea, but it was what raised suspicion in the first place.
> What might be better?
>
> > 2) vector only stores sFileData objects, not the strings themselves
> > 3) Even when vector has excess size (which is common, don't want to
> >    reallocate after each push_back) it won't include the strings
> > 4) Many implementations of new[] allocate at least 16 bytes, plus
> >    the overhead needed for delete[]
>
> Are you saying that std::string might actually be better in that case?
> What might be a better way?

No, std::string has the same problem. That isn't the cause.

> > 5) So what? 40MB is not a lot. Worry when it exceeds 1.5Gb. Memory
> >    is cheap. Writing a custom string class is not. BTDT.
>
> 40Mb or 4Gb, there is still something not quite right, and i would prefer
> to know what it is rather than brushing it under the rug.

40 Mb is about 42000000 bytes.
42000000 bytes / 100000 records = 420 bytes/record.

Even if each new char* allocation uses at least 16 bytes, one record uses
about 40 bytes.
I definitely think you have a problem elsewhere, probably in the loop.
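
The arithmetic above can be written out as a quick check. This is only a sketch: the function names are hypothetical, and the inputs (a 16-byte struct, a 16-byte minimum allocation) are the figures quoted in the thread for VC++, not measurements.

```cpp
#include <cassert>
#include <cstddef>

// Observed memory growth divided by the number of records loaded.
std::size_t bytes_per_record(std::size_t total_bytes, std::size_t records) {
    assert(records > 0);
    return total_bytes / records;
}

// Expected cost of one record: the struct itself plus two small heap
// blocks, each ASSUMED to occupy at least `min_block` bytes.
std::size_t expected_record_bytes(std::size_t struct_bytes,
                                  std::size_t min_block) {
    return struct_bytes + 2 * min_block;
}
```

`bytes_per_record(42000000, 100000)` is 420, while `expected_record_bytes(16, 16)` is 48, the same order as the "about 40 bytes" estimate. The observed figure is roughly nine times the expected one, which is why the vector alone does not explain it.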
Jul 23 '05 #8
> 20 MB of file in memory is a good starter to explain away the
> discrepancy.

I don't read the whole file into memory.

I use fopen(...),
read each chunk of data using fread(...) and then close the file using
fclose(...).
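
For readers following along, a record-at-a-time read with fopen/fread/fclose might look roughly like the sketch below. It is not the poster's code, which was never shown. The names `trimmed_field` and `read_first_fields` are hypothetical, and the layout is assumed: fixed-size records with one text field at the start, padded with trailing blanks.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>
#include <vector>

// Copy a fixed-width field and drop its trailing blanks.
std::string trimmed_field(const char* field, std::size_t width) {
    assert(field != 0);
    while (width > 0 && field[width - 1] == ' ') --width;
    return std::string(field, width);
}

// Read a file one fixed-size record at a time and keep the trimmed
// leading field of each record. Only one record is buffered at a time.
std::vector<std::string> read_first_fields(const char* path,
                                           std::size_t record_size,
                                           std::size_t field_width) {
    std::vector<std::string> out;
    if (record_size == 0 || field_width > record_size) return out;
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return out;  // a missing file yields an empty result
    std::vector<char> buf(record_size);
    while (std::fread(&buf[0], 1, record_size, f) == record_size)
        out.push_back(trimmed_field(&buf[0], field_width));
    std::fclose(f);
    return out;
}
```

A loop shaped like this holds one record buffer plus the trimmed strings, so if memory grows by hundreds of bytes per record, the per-record allocations inside the loop are the first place to look.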

> You need something better than two breakpoints and the Task Manager to
> make statements about what uses the memory.

Well, I am sorry, but I can see that before I read the file I use x amount of
memory and that just after I finish reading the file (after the fclose), I
have used x+40mb.

Simon

Jul 23 '05 #9
>> Hi,
>>
>> First some background.
>>
>> I have a structure,
>
> I think your problem has nothing to do with the vector. As has already
> been pointed out, the vector doesn't store the characters, only the
> pointer. With VC++, sizeof(sFileData) is 16. The memory used by the vector
> should be 16*address_.capacity() plus a small amount of overhead, which we
> can approximate with sizeof(address_). Try this:
> <snip code>
>
> When I run this, I get
>
> storage size of vector is approx 2212100

I will try that, but I should get the same thing myself.

> and task manager similarly shows about a 2Mb increase in memory usage.
> Accordingly, it seems that the other 38Mb is due to whatever else you are
> doing to allocate memory for the characters, unless, as someone else
> suggested, you are reading the whole file into memory and not taking that
> into account.

I don't read the whole file into memory.

I use fopen(...),
read each chunk of data using fread(...) and then close the file using
fclose(...).

Simon
Jul 23 '05 #10
