471,595 Members | 1,579 Online
Bytes | Software Development & Data Engineering Community
Post +

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 471,595 software developers and data experts.

how long should it take to delete 34 million __int64 from a map?

I have this

typedef map<__int64, int> Results_Map;

__int64 is defined as 8 bytes ranging from -9,223,372,036.854,775,808 to
9,223,372,036.854,775,807

I have loaded it with approx 34 million hash keys and when the program is
shutting down it takes forever (10+ mins) while its deleting the objects.
The memory it uses according to task manager is varies from 500,000 k then
it dropped down to 9 k then it went back up to 400,000k and I just killed
the process.

I'm using stl port btw. I'm looking to ask in the stl port group once I find
one but is 34 million alot and it should take this amount of time?

Is there any way to speed this up?

thanks

Feb 16 '06 #1
10 3062

Whybother wrote:
I have this

typedef map<__int64, int> Results_Map;

__int64 is defined as 8 bytes ranging from -9,223,372,036.854,775,808 to
9,223,372,036.854,775,807

I have loaded it with approx 34 million hash keys and when the program is
shutting down it takes forever (10+ mins) while its deleting the objects.
The memory it uses according to task manager is varies from 500,000 k then
it dropped down to 9 k then it went back up to 400,000k and I just killed
the process.

I'm using stl port btw. I'm looking to ask in the stl port group once I find
one but is 34 million alot and it should take this amount of time?

Is there any way to speed this up?


Yes, I would expect it to take quite a while. The map has to traverse
34 million objects and delete them.

You probably want some completely different, not in the standard,
container. You can, with some work, implement a map inside of a vector
and such a structure would be quicker to delete by quite a bit.

Feb 16 '06 #2
Whybother wrote:
I have this

typedef map<__int64, int> Results_Map;

__int64 is defined as 8 bytes ranging from -9,223,372,036.854,775,808
to 9,223,372,036.854,775,807

I have loaded it with approx 34 million hash keys and when the
program is shutting down it takes forever (10+ mins) while its
deleting the objects. The memory it uses according to task manager is
varies from 500,000 k then it dropped down to 9 k then it went back
up to 400,000k and I just killed the process.

I'm using stl port btw. I'm looking to ask in the stl port group once
I find one but is 34 million alot and it should take this amount of
time?
Is there any way to speed this up?


Call 'abort'.

V
--
Please remove capital As from my address when replying by mail
Feb 16 '06 #3

<ro**********@gmail.com> wrote in message
news:11**********************@g43g2000cwa.googlegr oups.com...

Whybother wrote:
I have this

typedef map<__int64, int> Results_Map;

__int64 is defined as 8 bytes ranging from -9,223,372,036.854,775,808 to
9,223,372,036.854,775,807

I have loaded it with approx 34 million hash keys and when the program is
shutting down it takes forever (10+ mins) while its deleting the objects.
The memory it uses according to task manager is varies from 500,000 k
then
it dropped down to 9 k then it went back up to 400,000k and I just killed
the process.

I'm using stl port btw. I'm looking to ask in the stl port group once I
find
one but is 34 million alot and it should take this amount of time?

Is there any way to speed this up?


Yes, I would expect it to take quite a while. The map has to traverse
34 million objects and delete them.

You probably want some completely different, not in the standard,
container. You can, with some work, implement a map inside of a vector
and such a structure would be quicker to delete by quite a bit.

are you talking a vector consisting of map elements? vector<map<int,int>> or
something? can you expand upon that?

thanks
Feb 16 '06 #4
Whybother wrote:
I have this

typedef map<__int64, int> Results_Map;

__int64 is defined as 8 bytes ranging from -9,223,372,036.854,775,808 to
9,223,372,036.854,775,807

I have loaded it with approx 34 million hash keys and when the program is
shutting down it takes forever (10+ mins) while its deleting the objects.
The memory it uses according to task manager is varies from 500,000 k then
it dropped down to 9 k then it went back up to 400,000k and I just killed
the process.

I'm using stl port btw. I'm looking to ask in the stl port group once I find
one but is 34 million alot and it should take this amount of time?

Is there any way to speed this up?

thanks


You might find a hashmap (std::tr1::unordered_map or something similar)
quicker since the order of your elements is not relevant (or is it? you
need to supply some code). The overhead of a balanced rb tree (map)
would be a lot with 34M elements. Another suggestion would be to write
your own allocator which uses large pre-allocated blocks of memory.

--
--dakka

Dykstra's Observation:
If debugging is the process of removing bugs, then programming must be
the process of putting them in.
Feb 16 '06 #5
Whybother wrote:
I have this

typedef map<__int64, int> Results_Map;

__int64 is defined as 8 bytes ranging from -9,223,372,036.854,775,808 to
9,223,372,036.854,775,807

I have loaded it with approx 34 million hash keys and when the program is
shutting down it takes forever (10+ mins) while its deleting the objects.
The memory it uses according to task manager is varies from 500,000 k then
it dropped down to 9 k then it went back up to 400,000k and I just killed
the process.


This would be typical if you hit disk swap. std::map is not optimized
for
low-memory situations. Basically, even if but 10% of the entries are
swapped to disk, removing any one of them has a
(1-0.9^(2log34.000.000))
chance of touching swap. That's 93%! Per node! I wouldn't be surprised
if in that case you had a million I/Os. That's bound to be painful.

Is it a lot faster with half the size?

HTH,
Michiel Salters

Feb 16 '06 #6
On 2006-02-16, Whybother <fu**@fuck.net> wrote:

<ro**********@gmail.com> wrote in message
news:11**********************@g43g2000cwa.googlegr oups.com...

Whybother wrote:
I'm using stl port btw. I'm looking to ask in the stl port group
once I find one but is 34 million alot and it should take this
amount of time?

Is there any way to speed this up?


Yes, I would expect it to take quite a while. The map has to
traverse 34 million objects and delete them.

You probably want some completely different, not in the standard,
container. You can, with some work, implement a map inside of a
vector and such a structure would be quicker to delete by quite a
bit.


are you talking a vector consisting of map elements?
vector<map<int,int>> or something? can you expand upon that?


He probably meant a vector of pair<key, value>, kept sorted by key.
You can find elements using binary search, and destructing the thing
can be done in one quick step.

If your 34-million elements aren't already sorted when you put them in
the data structure it might end up taking much longer to build up the
vector, though, since potentially NumElements pairs have to moved to
keep it sorted. The usage pattern of the data is critical to deciding
how to solve this problem.

--
Neil Cerutti
Feb 16 '06 #7
In article <8e********************@comcast.com>,
"Whybother" <fu**@fuck.net> wrote:
<ro**********@gmail.com> wrote in message
news:11**********************@g43g2000cwa.googlegr oups.com...

Whybother wrote:
I have this

typedef map<__int64, int> Results_Map;

__int64 is defined as 8 bytes ranging from -9,223,372,036.854,775,808 to
9,223,372,036.854,775,807

I have loaded it with approx 34 million hash keys and when the program is
shutting down it takes forever (10+ mins) while its deleting the objects.
The memory it uses according to task manager is varies from 500,000 k
then
it dropped down to 9 k then it went back up to 400,000k and I just killed
the process.

I'm using stl port btw. I'm looking to ask in the stl port group once I
find
one but is 34 million alot and it should take this amount of time?

Is there any way to speed this up?


Yes, I would expect it to take quite a while. The map has to traverse
34 million objects and delete them.

You probably want some completely different, not in the standard,
container. You can, with some work, implement a map inside of a vector
and such a structure would be quicker to delete by quite a bit.

are you talking a vector consisting of map elements? vector<map<int,int>> or
something? can you expand upon that?


He's talking about a vector< pair< key, value > > that stays sorted
after inserts. The loki library (in sourceforge) has one already made,
here are the notes from it:

class template AssocVector
An associative vector built as a syntactic drop-in replacement for
std::map
BEWARE: AssocVector doesn't respect all map's guarantees, the most
important being:
* iterators are invalidated by insert and erase operations
* the complexity of insert/erase is O(N) not O(log N)
* value_type is std::pair<K, V> not std::pair<const K, V>
* iterators are random

However, would it be a good idea to a 34 million element contiguous
array?

--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.
Feb 16 '06 #8

Daniel T. wrote:
In article <8e********************@comcast.com>,
"Whybother" <fu**@fuck.net> wrote:
<ro**********@gmail.com> wrote in message
news:11**********************@g43g2000cwa.googlegr oups.com...

Whybother wrote: You probably want some completely different, not in the standard,
container. You can, with some work, implement a map inside of a vector
and such a structure would be quicker to delete by quite a bit.

are you talking a vector consisting of map elements? vector<map<int,int>> or
something? can you expand upon that?


He's talking about a vector< pair< key, value > > that stays sorted
after inserts.


That is one way. Another way would be to implement the binary tree in
that structure. A binary tree node has three parts: data; right; and
left. Create a vector of such structures and when you put one in you
also navigate from the root (v[0]) and set up index points (in case you
have to reallocate you don't want to rebuild those pointers) for right
or left of the leaf you are attaching to. Then you can still use the
same map access and you don't have to sort an array in place as you
always just put the new node on the end no matter what your insert op
is. The insert operation is not constant time but it is O(lg N) so
with large datasets like this it would be a lot better than linear.

Finding an element might look something like:

size_t find(size_t index, X value)
{
if (v[index].value == value) return index;
if (v[index].value < value) return find(v[index].left);
return find(v[index].right);
}

I believe Knuth talks about such an implementation of a tree in his
books. It isn't the easiest way to do it, and I've never seen it in a
data structures class unfortunately, but it might be what this guy
needs. It's really a totally new structure as far as the standard is
concerned (of course nothing says std::map can't be done this way
AFAIK). You might use std::vector or you might be better off with a
simple block of memory that your new map class manages itself.

There are other ways involving two arrays, one for your tree and the
other for the data; you'll have to see which, if any, will serve your
needs best.

Since the vector is allocated as a block instead of as a series of
pointers the deallocation is constant. One thing though, you won't
want any pointers in this thing at all or you are back to linear
deletion of several million units.

Feb 16 '06 #9
In article <sl**********************@FIAD06.norwich.edu>,
Neil Cerutti <le*******@email.com> wrote:
On 2006-02-16, Whybother <fu**@fuck.net> wrote:

<ro**********@gmail.com> wrote in message
news:11**********************@g43g2000cwa.googlegr oups.com...

Whybother wrote: I'm using stl port btw. I'm looking to ask in the stl port group
once I find one but is 34 million alot and it should take this
amount of time?

Is there any way to speed this up?

Yes, I would expect it to take quite a while. The map has to
traverse 34 million objects and delete them.

You probably want some completely different, not in the standard,
container. You can, with some work, implement a map inside of a
vector and such a structure would be quicker to delete by quite a
bit.


are you talking a vector consisting of map elements?
vector<map<int,int>> or something? can you expand upon that?


He probably meant a vector of pair<key, value>, kept sorted by key.
You can find elements using binary search, and destructing the thing
can be done in one quick step.

If your 34-million elements aren't already sorted when you put them in
the data structure it might end up taking much longer to build up the
vector, though, since potentially NumElements pairs have to moved to
keep it sorted. The usage pattern of the data is critical to deciding
how to solve this problem.


A deque< pair< key, value > > might be a better choice...

--
Magic depends on tradition and belief. It does not welcome observation,
nor does it profit by experiment. On the other hand, science is based
on experience; it is open to correction by observation and experiment.
Feb 16 '06 #10
On 2006-02-16, Daniel T. <po********@earthlink.net> wrote:
In article <sl**********************@FIAD06.norwich.edu>,
Neil Cerutti <le*******@email.com> wrote:
He probably meant a vector of pair<key, value>, kept sorted by key.
You can find elements using binary search, and destructing the
thing can be done in one quick step.

If your 34-million elements aren't already sorted when you put them
in the data structure it might end up taking much longer to build
up the vector, though, since potentially NumElements pairs have to
moved to keep it sorted. The usage pattern of the data is critical
to deciding how to solve this problem.


A deque< pair< key, value > > might be a better choice...


Depending on the implementation that might be true (I've read Herb
Sutter's article suggesting deque over vector as a default container),
but I don't see how the specific guarantees made in the standard
promote deque in this case.

--
Neil Cerutti
We shall reach greater and greater platitudes of achievement.
--Richard J. Daley
Feb 16 '06 #11

This discussion thread is closed

Replies have been disabled for this discussion.

Similar topics

8 posts views Thread by Tim Clacy | last post: by
2 posts views Thread by Bryan Olson | last post: by
8 posts views Thread by Keoncheol Shin | last post: by
12 posts views Thread by __frank__ | last post: by
3 posts views Thread by tutush | last post: by
6 posts views Thread by valerij | last post: by
73 posts views Thread by Yevgen Muntyan | last post: by
21 posts views Thread by Charles Sullivan | last post: by

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.