473,385 Members | 1,863 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,385 software developers and data experts.

How to deal with a platform-dependent issue.

I wrote some code which makes heavy use of std::map. In fact, for a
typical instance, I may have on the order of 100K maps, each with only a
small number of elements.

(In case you're curious, this is for a computational geometry
application-- each map represents an edge of a polygon and its elements
are intermediate markers located along the edge, mapping relative
position on the edge to some auxiliary data.)

This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance. Memory
usage jumped by a factor of 10 or more which is unacceptable for my
application.

After spending the better part of a day trying to figure out the
problem, it seems that the RogueWave map performs bulk allocation so
that as soon as one item is inserted into the map, it allocates space
for 32. (I found this by writing an allocator to forward memory ops to
malloc/free and log each such call.) This is much more than the space
needed for a typical case (where I might have say 3 or 4 entries only),
hence the tremendous increase in memory usage compared to the gcc
version (which allocates space one entry at a time).

OK, so I realize this is fundamentally a platform issue and not entirely
on-topic, but I'd welcome any advice. For example, is there a language
feature I don't know about that could circumvent this allocation
strategy? Is there another approach that achieves the functionality of
a map without making explicit use thereof? Any other recommendations?

Thanks for your help,
Mark
Jul 23 '05 #1
12 1850
Mark P wrote:
I wrote some code which makes heavy use of std::map. In fact, for a
typical instance, I may have on the order of 100K maps, each with only a
small number of elements.

(In case you're curious, this is for a computational geometry
application-- each map represents an edge of a polygon and its elements
are intermediate markers located along the edge, mapping relative
position on the edge to some auxiliary data.)


[...]

I don't know if this will help or not, but you might want to take a look at:

http://www.boost.org/libs/property_m...perty_map.html

Jul 23 '05 #2
* Mark P:
I wrote some code which makes heavy use of std::map. In fact, for a
typical instance, I may have on the order of 100K maps, each with only a
small number of elements.

(In case you're curious, this is for a computational geometry
application-- each map represents an edge of a polygon and its elements
are intermediate markers located along the edge, mapping relative
position on the edge to some auxiliary data.)
I don't see how that would work, but, accepting that it does work:

This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance. Memory
usage jumped by a factor of 10 or more which is unacceptable for my
application.

After spending the better part of a day trying to figure out the
problem, it seems that the RogueWave map performs bulk allocation so
that as soon as one item is inserted into the map, it allocates space
for 32. (I found this by writing an allocator to forward memory ops to
malloc/free and log each such call.) This is much more than the space
needed for a typical case (where I might have say 3 or 4 entries only),
hence the tremendous increase in memory usage compared to the gcc
version (which allocates space one entry at a time).
Just out of curiosity, by 'allocator' do you mean replacing global 'new'
or did you write an allocator to pass as template argument to 'map'?

If the latter then the bulk allocation is part of the 'map'
implementation (e.g. a B-tree like thing).

And that would preclude a solution based on supplying your own allocator
template argument.

OK, so I realize this is fundamentally a platform issue and not entirely
on-topic, but I'd welcome any advice. For example, is there a language
feature I don't know about that could circumvent this allocation
strategy? Is there another approach that achieves the functionality of
a map without making explicit use thereof? Any other recommendations?


It seems you need a map-like class optimized for 3 to 4 elements, but
also able to handle larger sizes.

I'd try out an implementation based on std::vector first: linear search
is almost always the fastestest ( :-) ) for a very small number of
elements; wrap it in a class providing the 'map' operations you use, and
set the capacity to e.g. 2 or 4 on construction; measure what's best.

If the number of maps with larger number of elements is sufficiently
large that the simple approach isn't efficient enough then perhaps
consider more complicated things, like, you already have an example of
an implementation that is efficient enough, and another approach might
be to dynamically switch internal implementation of your wrappers
(what's the pattern name for that, something like handle/envelope?),
although that would introduce some overhead by itself.

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 23 '05 #3
Alf P. Steinbach wrote:
* Mark P:
I wrote some code which makes heavy use of std::map. In fact, for a
typical instance, I may have on the order of 100K maps, each with only a
small number of elements.

(In case you're curious, this is for a computational geometry
application-- each map represents an edge of a polygon and its elements
are intermediate markers located along the edge, mapping relative
position on the edge to some auxiliary data.)

I don't see how that would work, but, accepting that it does work:

Well, I haven't told you what the application is :) It actually works
quite nicely for its intended purpose (memory issues below notwithstanding).
This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance. Memory
usage jumped by a factor of 10 or more which is unacceptable for my
application.

After spending the better part of a day trying to figure out the
problem, it seems that the RogueWave map performs bulk allocation so
that as soon as one item is inserted into the map, it allocates space
for 32. (I found this by writing an allocator to forward memory ops to
malloc/free and log each such call.) This is much more than the space
needed for a typical case (where I might have say 3 or 4 entries only),
hence the tremendous increase in memory usage compared to the gcc
version (which allocates space one entry at a time).

Just out of curiosity, by 'allocator' do you mean replacing global 'new'
or did you write an allocator to pass as template argument to 'map'?


The latter.
If the latter then the bulk allocation is part of the 'map'
implementation (e.g. a B-tree like thing).
Understood. I believe it's a red-black tree. STL implementations are a
bit inscrutable in my experience, but somewhere buried in the map
implementation is a parameter __buffer_size which at first glance seems
to be responsible for the bulk allocation.

And that would preclude a solution based on supplying your own allocator
template argument.
Also understood.
OK, so I realize this is fundamentally a platform issue and not entirely
on-topic, but I'd welcome any advice. For example, is there a language
feature I don't know about that could circumvent this allocation
strategy? Is there another approach that achieves the functionality of
a map without making explicit use thereof? Any other recommendations?

It seems you need a map-like class optimized for 3 to 4 elements, but
also able to handle larger sizes.


Yep, that's be great. Got one?

I'd try out an implementation based on std::vector first: linear search
is almost always the fastestest ( :-) ) for a very small number of
elements; wrap it in a class providing the 'map' operations you use, and
set the capacity to e.g. 2 or 4 on construction; measure what's best.

That's certainly an idea to consider. The problem is that this
structure is dynamically updated, requiring insertions in the middle
(w.r.t. the sort), range operations, and so on. Still it's worth
considering so thanks for the suggestion.
If the number of maps with larger number of elements is sufficiently
large that the simple approach isn't efficient enough then perhaps
consider more complicated things, like, you already have an example of
an implementation that is efficient enough, and another approach might
be to dynamically switch internal implementation of your wrappers
(what's the pattern name for that, something like handle/envelope?),
although that would introduce some overhead by itself.


I've come up with a couple other ideas too:

1. Right now the map holds objects which are about 12 words each. I
could store pointers instead so that the extra unused nodes are smaller.
Unfortunately there's still the size of the node separate from its
data which adds a few words at least, and managing raw pointers can be a
pain.

2. Rather than have one map per edge I could have, say, one map per
polygon and use a more complicated sort to keep all the intermediate
points straight. This also seems like a pain, it breaks the edge
abstraction, and there's an efficiency penalty from searching in a
larger structure.

I think I'd be satisfied, at least provisionally, with a structure that
supports linear time searches and constant time inserts. This suggests
a list except that the list implementation also allocates in blocks of
32 after the first element. On the plus side there's less overhead
associated with the nodes of a list compared to those of a map.

All of which makes wonder, as I pause to wax philosophical, why the
S(tandard)TL isn't more standardized.
Jul 23 '05 #4
Ian
Mark P wrote:

This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance. Memory
usage jumped by a factor of 10 or more which is unacceptable for my
application.

A bit OT, but have you tried the stlport supplied with CC?

Ian
Jul 23 '05 #5
Mark P wrote:
Alf P. Steinbach wrote:
I'd try out an implementation based on std::vector first: linear search
is almost always the fastestest ( :-) ) for a very small number of
elements; wrap it in a class providing the 'map' operations you use, and
set the capacity to e.g. 2 or 4 on construction; measure what's best.


That's certainly an idea to consider. The problem is that this
structure is dynamically updated, requiring insertions in the middle
(w.r.t. the sort), range operations, and so on. Still it's worth
considering so thanks for the suggestion.


If the insertions are expensive (I'm not clear on how big the items
you're storing are) then maybe some sort of list structure would be
appropriate?

HTH,
Chris
Jul 23 '05 #6
* Chris Newton:
Mark P wrote:
Alf P. Steinbach wrote:
I'd try out an implementation based on std::vector first: linear search
is almost always the fastestest ( :-) ) for a very small number of
elements; wrap it in a class providing the 'map' operations you use, and
set the capacity to e.g. 2 or 4 on construction; measure what's best.


That's certainly an idea to consider. The problem is that this
structure is dynamically updated, requiring insertions in the middle
(w.r.t. the sort), range operations, and so on. Still it's worth
considering so thanks for the suggestion.


If the insertions are expensive (I'm not clear on how big the items
you're storing are) then maybe some sort of list structure would be
appropriate?


Actually the vector wins handily over a list in that respect, because
per-element dynamic allocations are avoided (a dynamic allocation is
very expensive compared to moving ten or twenty or forty words/bytes).

Anyway, with just a little added support a vector can do list'ing too,
providing at the same time (1) constant time random read/write access,
(2) constant time insert/remove at an iterator position, where (3) the
iterator can be moved backward and forward one element in constant time.
The cost is that moving the cursor involves copying an element, and that
there can be only one insert/remove iterator at a time, and the
gain is the constant time random read/write access, as well as avoidance
of dynamic allocations and deallocations (or alternatively, avoidance of
having to use a free-list) by using a reasonable buffer size.

This little data structure is called a cursor gap array, and the idea is
simply to collect all the free space in the vector at the cursor (aka
iterator) position, which means the iterator is moved one step by
copying an element from one side of the cursor gap to the other side.

But hey, I'm off-topic, and besides, measuring the actual performance
would be appropriate before even considering adding such support.

This should probably be in [comp.programming]... ;-)

--
A: Because it messes up the order in which people normally read text.
Q: Why is it such a bad thing?
A: Top-posting.
Q: What is the most annoying thing on usenet and in e-mail?
Jul 23 '05 #7
Alf P. Steinbach wrote:
Actually the vector wins handily over a list in that respect, because
per-element dynamic allocations are avoided (a dynamic allocation is
very expensive compared to moving ten or twenty or forty words/bytes).


In isolation yes, but since the OP is apparently happy to play with
allocators I don't see why that would necessarily be a problem...

Cheers,
Chris
Jul 23 '05 #8
Ian wrote:
Mark P wrote:

This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance.
Memory usage jumped by a factor of 10 or more which is unacceptable
for my application.

A bit OT, but have you tried the stlport supplied with CC?

Ian


Actually yes, though after my original post. (I'm new to CC and didn't
know there was an alternative until later browsing the CC
documentation.) stlport *does* do the desired thing, namely allocate
one tree node at a time.

The problem is that this is part of a large piece of software and I'm
not sure everyone else developing it will feel comfortable with such a
large change (i.e., using a completely different library
implementation). What I'm wondering now is whether there's some way for
me to copy the relevant files of the stlport source to some side
directory, rename map to, say, portmap, and use it like that without
interfering with the default library implementation. Is such a thing
feasible or would it be a hopeless task?
Jul 23 '05 #9
Mark P wrote:
Ian wrote:
Mark P wrote:

This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance.
Memory usage jumped by a factor of 10 or more which is unacceptable
for my application.

A bit OT, but have you tried the stlport supplied with CC?

Ian


Actually yes, though after my original post. (I'm new to CC and didn't
know there was an alternative until later browsing the CC
documentation.) stlport *does* do the desired thing, namely allocate
one tree node at a time.

The problem is that this is part of a large piece of software and I'm
not sure everyone else developing it will feel comfortable with such a
large change (i.e., using a completely different library
implementation). What I'm wondering now is whether there's some way for
me to copy the relevant files of the stlport source to some side
directory, rename map to, say, portmap, and use it like that without
interfering with the default library implementation. Is such a thing
feasible or would it be a hopeless task?


I'm not that familiar with stlport, but in many stl implementations
the files are highly co-dependent. 'map' for example might
include other headers, which might include other headers, etc;
any of these could define/use objects specific to the implementation.
So, taking a few files from one implementation and trying to use
them with another would probably be very problematic.

Larry
Jul 23 '05 #10
Ian
Mark P wrote:
Ian wrote:
Mark P wrote:

This worked just fine when compiled with gcc on Linux, but after
recently running on Sun with the CC compiler (using the RogueWave STL
implementation, I believe) I found horrible memory performance.
Memory usage jumped by a factor of 10 or more which is unacceptable
for my application.

A bit OT, but have you tried the stlport supplied with CC?

Ian

Actually yes, though after my original post. (I'm new to CC and didn't
know there was an alternative until later browsing the CC
documentation.) stlport *does* do the desired thing, namely allocate
one tree node at a time.

The problem is that this is part of a large piece of software and I'm
not sure everyone else developing it will feel comfortable with such a
large change (i.e., using a completely different library
implementation). What I'm wondering now is whether there's some way for
me to copy the relevant files of the stlport source to some side
directory, rename map to, say, portmap, and use it like that without
interfering with the default library implementation. Is such a thing
feasible or would it be a hopeless task?


Hopeless.

I thought from your original post this was a fresh port to Solaris. The
transition is fairly painless, I've just done it with a large application.

Ian
Jul 23 '05 #11
Chris Newton wrote:
Mark P wrote:
Alf P. Steinbach wrote:
I'd try out an implementation based on std::vector first: linear search
is almost always the fastestest ( :-) ) for a very small number of
elements; wrap it in a class providing the 'map' operations you use, and
set the capacity to e.g. 2 or 4 on construction; measure what's best.

That's certainly an idea to consider. The problem is that this
structure is dynamically updated, requiring insertions in the middle
(w.r.t. the sort), range operations, and so on. Still it's worth
considering so thanks for the suggestion.

If the insertions are expensive (I'm not clear on how big the items
you're storing are) then maybe some sort of list structure would be
appropriate?

HTH,
Chris


Sadly the implementation also allocates list nodes in blocks of 32 after
the first insertion. So inserting 1 item is memory cheap, but if you
insert 2 you may as well insert 33, since 2-33 are allocated together.
On the plus side, the memory overhead per list node is less than that
per map (i.e., tree) node.

I broached this in a previous post but no one commented on it-- are
there any efforts to further standardize STL containers with respect to
these sorts of implementation details?
Jul 23 '05 #12
Mark P wrote:
Chris Newton wrote:
Mark P wrote:
Alf P. Steinbach wrote:

I'd try out an implementation based on std::vector first: linear search
is almost always the fastestest ( :-) ) for a very small number of
elements; wrap it in a class providing the 'map' operations you use,
and
set the capacity to e.g. 2 or 4 on construction; measure what's best.

That's certainly an idea to consider. The problem is that this
structure is dynamically updated, requiring insertions in the middle
(w.r.t. the sort), range operations, and so on. Still it's worth
considering so thanks for the suggestion.


If the insertions are expensive (I'm not clear on how big the items
you're storing are) then maybe some sort of list structure would be
appropriate?

HTH,
Chris

Sadly the implementation also allocates list nodes in blocks of 32 after
the first insertion. So inserting 1 item is memory cheap, but if you
insert 2 you may as well insert 33, since 2-33 are allocated together.
On the plus side, the memory overhead per list node is less than that
per map (i.e., tree) node.

I broached this in a previous post but no one commented on it-- are
there any efforts to further standardize STL containers with respect to
these sorts of implementation details?


Amounts of memory used are not likely to be standardized; they
are viewed as quality of implementation (QoI) issues. The ideal
trade-offs vary depending on the environment -- embedded systems
would tend often to favour space over speed, whereas for larger
systems it's often (though not always) beneficial to use more
space to improve performance.

Certain QoI issues tend to be common enough that they become
quasi-standard; an empty vector that used dynamic allocation
would be considered pretty poor, for example.

I don't expect to see support from standards group for mandating
things like amount of memory used or constants of performance.

-- James
Jul 23 '05 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

2
by: Min | last post by:
I have seen so many pointing out "main" should explicitly return "int". Beside, the language spec or committee, or some guru said so, what is a BIG deal with it ? What difference does it make if...
21
by: Franco Gustavo | last post by:
Hi, Please help me to understand this, because I don't see what I'm missing. I was reading a lot of examples on Internet that explain that C# doesn't implement multiple inheritance it...
7
by: Abby | last post by:
Hi, I've come to the part that I need to have MD5 in my program. The only problem is that ... I don't have any clue how to do it!! The algorithm I need to include in my program is: Msg =...
21
by: Paul Tremblay | last post by:
Hi All, I am a veteran C/C++ programmer (Unix background) and I want to get to speed with Visual Studio .Net I have legacy C/C++ code that I want to use in my application. However, I'm not...
48
by: Frederick Gotham | last post by:
The "toupper" function takes an int as an argument. That's not too irrational given that a character literal is of type "int" in C. (Although why it isn't of type "char" escapes me... ) The...
8
by: Michiel Sikma | last post by:
Hello everybody, I was thinking about making a really insignificant addition to an online system that I'm making using Python: namely, I would like it to print the platform that it is running on...
4
by: dolphin | last post by:
I want to write a program to captrue voice from mike and send it to anthoer host.Are there some functions that do these things in C++?
0
by: marathoner | last post by:
I am currently migrating my Visual C++ 6.0 applications to Visual Studio 2005. I am getting compiler errors involving the VS2005's platform SDK. When I removed directory references to that SDK,...
5
by: mohamed azaz | last post by:
hi I want to know can c++ deal with hardware?? like fax modem or Lan or Audio device
1
by: Vinod Sadanandan | last post by:
Cross Platform Migration An Unproblematic Approach (Windows-UNIX ) Oracle 10\11g The principal restriction on cross-platform transportable database is that the source and destination platform...
0
by: taylorcarr | last post by:
A Canon printer is a smart device known for being advanced, efficient, and reliable. It is designed for home, office, and hybrid workspace use and can also be used for a variety of purposes. However,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.