473,699 Members | 2,628 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

How to find out memory occupied by vector, set, or map?

I'm trying to develop several interesting components
of a simple search engine follow some text books.

A book introduces some ways to compress dictionary,
allow it to stay on main memory.
In the dictionary, is a list of millions of words
with related data of several bytes.

Word / Term Freq(int) / Pointer to disk file
Hello 100

One method is:
concatenate all the words into one long contiguous string
and an array of 4-byte character pointers is used to access.

But I think STL map is a suitable data structure for this problem.
I don't know the detailed internal implementation of map.
I think it's implemented as a binary search tree.
Every node occupy just enough memory. (I'm not sure)
I also need the search feature of BST.

So I'd like to know if it is a need to build the data structure
myself.
How can I know the whole memory occupied by a map,
as well as other data structures such as vector, set.
(Or memory occupied by a map element).

BTW, should I use std::string or char[] to store the millions of
words?
How much is the string overhead?
Jun 27 '08 #1
1 2243
On May 20, 4:58 am, Lambda <stephenh...@gm ail.comwrote:
I'm trying to develop several interesting components
of a simple search engine follow some text books.
A book introduces some ways to compress dictionary,
allow it to stay on main memory.
In the dictionary, is a list of millions of words
with related data of several bytes.
Word / Term Freq(int) / Pointer to disk file
Hello 100
One method is:
concatenate all the words into one long contiguous string
and an array of 4-byte character pointers is used to access.
A classical string pool. It's not a solution to your problem,
just a very special way to represent strings.
But I think STL map is a suitable data structure for this
problem. I don't know the detailed internal implementation of
map. I think it's implemented as a binary search tree. Every
node occupy just enough memory. (I'm not sure) I also need the
search feature of BST.
Every node is dynamically allocated, and contains several
pointers in addition to the raw data, so it can be fairly
expensive in terms of memory use. On the other hand, for
something small, like a couple of million elements, why not.

Note that access time is O(lg n), and not constant. For more
than a couple of thousand elements, a hash table will be faster
(provided you use a good hash function).
So I'd like to know if it is a need to build the data
structure myself. How can I know the whole memory occupied by
a map, as well as other data structures such as vector, set.
(Or memory occupied by a map element).
You can't really, since there are so many hidden overheads. I'd
just go ahead and implement using std::map, but carefully
encapsulating it so that I could replace it if it did cause a
problem later.
BTW, should I use std::string or char[] to store the millions
of words?
std::string, definitely.
How much is the string overhead?
Again, it depends on the implementation. A lot of
implementations today use a small string optimization which
avoids dynamic allocation for strings shorter than some cut-off
size; if your words are English, there's a good chance that most
of them will fall under this limit.

All sorts of optimizations are possible, once you have the
application working, and can see if they're necessary.

--
James Kanze (GABI Software) email:ja******* **@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientier ter Datenverarbeitu ng
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jun 27 '08 #2

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

9
2615
by: Birgit Rahm | last post by:
Hello newsgroup, I am a beginner, so I am asking maybe immoderate questions. I want to delete a variable, after filling it with complex objects. How should I do it? e.g. AAA = AAA = Can I 1) AAA =
2
1491
by: anand | last post by:
hi, i am fetching data from database using dataadaptor and filling up a dataset. i checked memory usage of the app, and it consumes 50 Mb of RAM for the given amount of data. that is, i checked the memory occupied by the app before and after fetching data. i saved the same amount of data in plain comma separated text file, which occupied 4Mb! i saved the same data in
1
2798
by: trialproduct2004 | last post by:
Hi all, I am having slight confusion regarding memory management in .net. Say suppose i have two application one is in C# and other is in MFC(VC++). Both of this application are using lots of memory. Suppose i run first C# application which has occupied all memory and
18
21535
by: Jan | last post by:
Hi there, i've got an STL map with something like this ( map<string, Object*> xyz; ) What happens when I call xyz.clear()? Is only the map cleared or the map and the Objects, so that the memory is free again? Have I to delete every single Object? Thanks for replying, Jan
18
7221
by: ma740988 | last post by:
Trying to get more acclimated with the use of function objects. As part of my test, consider: # include <vector> # include <iostream> # include <algorithm> #include <stdexcept> #include <bitset> using std::vector;
3
3096
by: somaskarthic | last post by:
Hi How to release the memory occupied by string vector in c++? Its very urgent . Pls post your replies? Thanks in advance somaskarthic
17
3068
by: christophe.chazeau | last post by:
Hi, I have a problem with a really simple chunk of code which should work but does obviously does not. This chunk of code is just a POC aimed at finding a bug in a larger project in which the same problem seems to occur. Here the deal : when I run this piece of code, I expect all the memory allocated by the "Test" object to be freed but what I observe is that after the second sleep (after all the additions to the vector), the memory...
39
3186
by: Ravi | last post by:
Can you all please suggest a program which tell us the range of memry addresses occupied by the given c program?
4
4249
by: alien.0101 | last post by:
Hi I want to know the memory occupied by STL Map, Is there a way to do that. I have written a sample application and adding value to map doesnot change size map<int,intmapInt2Int; mapInt2Int=2; mapInt2Int=2;
0
8685
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However, people are often confused as to whether an ONU can Work As a Router. In this blog post, we’ll explore What is ONU, What Is Router, ONU & Router’s main usage, and What is the difference between ONU and Router. Let’s take a closer look ! Part I. Meaning of...
0
8613
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language synchronization. With a Microsoft account, language settings sync across devices. To prevent any complications,...
0
8880
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
7745
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
0
5869
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4374
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4626
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
2344
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2008
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.