Hi people,
I have some piece of code that takes way too much time to do what it
does (which is irrelevant here) and while searching for the cause I
stumbled upon the following surprising results:
The code uses a priority queue and so have lines of the form:
PQ.push(item);
and
Item item = PQ.top() ; PQ.pop();
So I added some manual profiling code around those two operations:
Timer t ; t.start();
PQ.push(item)
t.stop(); total_insert_time += t.time();
Timer t ; t.start();
Item item = PQ.top() ; PQ.pop();
t.stop(); total_remove_time += t.time();
During the particularly long run, about 150000 items are inserted in
total and each and every one of them removed. (the main loop finishes
when the PQ is empty)
But here is the puzzing (to me at least) result:
total_insert_time : 34 seconds
total_remove_time: 240 seconds
And this ratio is more or less consistent, not only among several runs
of the same algorithm with the same input in my mahcine but among
several runs on different platforms with different STL implementations
and machines.
FWIW, in all the different paltforms the code is compiled with the
default options, which in many cases means a range checked STL.
What I found very odd here is that pop_heap takes 8-10 times longer
than push_heap in an escenario where total the number of items are
balanced (the PQ is consumed until it's empty)
And there's more... it turns out that 95% of the items are insterted in
a initialization phase before the first is ever removed..... after the
first remove, a run of items may be inserted for each remove, but only
5% of the total is inserted in this phase.
I tried putting all the items from the initialization phase (95% of the
total) in a vector, with storage reserved in advance for 200000 items
to avoid additional allocations, then constructing the PQ with that
sequence (which calls make_heap up front), but the results are roughly
the same... removing all of the items takes 8-10 times longer than
inserting them.
Is this expected?
What can I do?
Given that 95% of the items are inserted up front, could using a set<>
and implementing the pop_heap() as a linear search for the first
unflagged item be better? (flagging items instead of removing them to
avoid tree rebalancing)... or maybe using a manually sorted list so
that removal becomes O(1) even if inserting is O(n.log n) instead of
O(log n) as in the heap?
TIA
Fernando Cacciola
SciSoft
http://fcacciola.50webs.com