Connecting Tech Pros Worldwide Forums | Help | Site Map

set, insert with hint

Mark P
Guest
 
Posts: n/a
#1: Jul 23 '05
Some time ago I posted here about inserting into a set with a hint:

http://groups-beta.google.com/group/...8b8d0eadb38dbf

I quoted the SGI STL docs describing a.insert(p, t), where p is the hint
iterator and t is the inserted object:

"Insert with hint is logarithmic in general, but it is amortized
constant time if t is inserted immediately before p."

Several knowledgeable people asserted that this is contrary to the
standard which says that t should be inserted immediately after p.

I wrote a little program to test this out (see below). I then obtained
the following results, which suggest that the SGI docs are in fact
correct and that the insert should occur immediately before the hint.

Using gcc 3.2.2 with -O3:

start - no hint
finish: 14 seconds
start - insert after hint
finish: 14 seconds
start - insert before hint
finish: 3 seconds

Using Sun CC 7.1 ith -O5:

start - no hint
finish: 54 seconds
start - insert after hint
finish: 43 seconds
start - insert before hint
finish: 17 seconds

Was I just confused before and the standard does say that the hint
should point after the location of the insert? How else can these
results be explained?

Thanks,
Mark


TEST CODE BELOW:

#include <set>
#include <iostream>
#include <ctime>

int main() {

const int MaxVal = 30000000;
time_t startTime, endTime;
std::set<int> is;
std::set<int>::iterator isIter;

// no hint

std::cout << "start - no hint" << std::endl;
time(&startTime);

for (int i = 0; i < MaxVal; ++i)
is.insert(i);

time(&endTime);
std::cout << "finish: " << difftime(endTime,startTime) << " seconds"
<< std::endl;

// insert after hint

is.clear();
std::cout << "start - insert after hint" << std::endl;
time(&startTime);

isIter = (is.insert(0)).first;
for (int i = 1; i < MaxVal; ++i)
isIter = is.insert(isIter,i);

time(&endTime);
std::cout << "finish: " << difftime(endTime,startTime) << " seconds"
<< std::endl;

// insert before hint

is.clear();
std::cout << "start - insert before hint" << std::endl;
time(&startTime);

isIter = (is.insert(MaxVal)).first;
for (int i = MaxVal-1; i >= 0; --i)
isIter = is.insert(isIter,i);

time(&endTime);
std::cout << "finish: " << difftime(endTime,startTime) << " seconds"
<< std::endl;

}

Pete Becker
Guest
 
Posts: n/a
#2: Jul 23 '05

re: set, insert with hint


Mark P wrote:[color=blue]
>
> I quoted the SGI STL docs describing a.insert(p, t), where p is the hint
> iterator and t is the inserted object:
>
> "Insert with hint is logarithmic in general, but it is amortized
> constant time if t is inserted immediately before p."
>
> Several knowledgeable people asserted that this is contrary to the
> standard which says that t should be inserted immediately after p.
>
> I wrote a little program to test this out (see below). I then obtained
> the following results, which suggest that the SGI docs are in fact
> correct and that the insert should occur immediately before the hint.
>[/color]

You would do better to consult the language definition. The 2003
standard gives the complexity of insert with hint as: "logarithmic in
general, but amortized cosntant if t is inserted right after p."
[color=blue]
> How else can these results be explained?[/color]

Confusion.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Mark P
Guest
 
Posts: n/a
#3: Jul 23 '05

re: set, insert with hint


Pete Becker wrote:[color=blue]
> Mark P wrote:
>[color=green]
>>
>> I quoted the SGI STL docs describing a.insert(p, t), where p is the
>> hint iterator and t is the inserted object:
>>
>> "Insert with hint is logarithmic in general, but it is amortized
>> constant time if t is inserted immediately before p."
>>
>> Several knowledgeable people asserted that this is contrary to the
>> standard which says that t should be inserted immediately after p.
>>
>> I wrote a little program to test this out (see below). I then
>> obtained the following results, which suggest that the SGI docs are in
>> fact correct and that the insert should occur immediately before the
>> hint.
>>[/color]
>
> You would do better to consult the language definition. The 2003
> standard gives the complexity of insert with hint as: "logarithmic in
> general, but amortized cosntant if t is inserted right after p."
>[color=green]
>> How else can these results be explained?[/color]
>
>
> Confusion.
>[/color]

I'll concede that I may be confused, but why does my test code run
fastest when t is inserted right before p?
Pete Becker
Guest
 
Posts: n/a
#4: Jul 23 '05

re: set, insert with hint


Mark P wrote:[color=blue][color=green]
>>[color=darkred]
>>> How else can these results be explained?[/color]
>>
>>
>>
>> Confusion.
>>[/color]
>
> I'll concede that I may be confused, but why does my test code run
> fastest when t is inserted right before p?[/color]

I meant confusion on the part of implementors. It could be that the
libraries you're using don't conform to the standard.

--

Pete Becker
Dinkumware, Ltd. (http://www.dinkumware.com)
Mark P
Guest
 
Posts: n/a
#5: Jul 23 '05

re: set, insert with hint


Pete Becker wrote:[color=blue]
> Mark P wrote:
>[color=green][color=darkred]
>>>
>>>> How else can these results be explained?
>>>
>>>
>>>
>>>
>>> Confusion.
>>>[/color]
>>
>> I'll concede that I may be confused, but why does my test code run
>> fastest when t is inserted right before p?[/color]
>
>
> I meant confusion on the part of implementors. It could be that the
> libraries you're using don't conform to the standard.
>[/color]

Ah, I see. I find this surprising (which is not to say doubtful)
because my Sun compiler uses (I think) a RogueWave STL implementation
and my gcc uses an SGI STL implementation. Then again, they may derive
from common ancestry.

If anyone has a chance to try my test code on other compilers, I'd be
interested to see the results.
Larry I Smith
Guest
 
Posts: n/a
#6: Jul 23 '05

re: set, insert with hint


Mark P wrote:[color=blue]
> Some time ago I posted here about inserting into a set with a hint:
>
> http://groups-beta.google.com/group/...8b8d0eadb38dbf
>
>
> I quoted the SGI STL docs describing a.insert(p, t), where p is the hint
> iterator and t is the inserted object:
>
> "Insert with hint is logarithmic in general, but it is amortized
> constant time if t is inserted immediately before p."
>
> Several knowledgeable people asserted that this is contrary to the
> standard which says that t should be inserted immediately after p.
>
> I wrote a little program to test this out (see below). I then obtained
> the following results, which suggest that the SGI docs are in fact
> correct and that the insert should occur immediately before the hint.
>
> Using gcc 3.2.2 with -O3:
>
> start - no hint
> finish: 14 seconds
> start - insert after hint
> finish: 14 seconds
> start - insert before hint
> finish: 3 seconds
>
> Using Sun CC 7.1 ith -O5:
>
> start - no hint
> finish: 54 seconds
> start - insert after hint
> finish: 43 seconds
> start - insert before hint
> finish: 17 seconds
>
> Was I just confused before and the standard does say that the hint
> should point after the location of the insert? How else can these
> results be explained?
>
> Thanks,
> Mark
>
>
> TEST CODE BELOW:
>
> #include <set>
> #include <iostream>
> #include <ctime>
>
> int main() {
>
> const int MaxVal = 30000000;
> time_t startTime, endTime;
> std::set<int> is;
> std::set<int>::iterator isIter;
>
> // no hint
>
> std::cout << "start - no hint" << std::endl;
> time(&startTime);
>
> for (int i = 0; i < MaxVal; ++i)
> is.insert(i);
>
> time(&endTime);
> std::cout << "finish: " << difftime(endTime,startTime) << " seconds"
> << std::endl;
>
> // insert after hint
>
> is.clear();
> std::cout << "start - insert after hint" << std::endl;
> time(&startTime);
>
> isIter = (is.insert(0)).first;
> for (int i = 1; i < MaxVal; ++i)
> isIter = is.insert(isIter,i);
>
> time(&endTime);
> std::cout << "finish: " << difftime(endTime,startTime) << " seconds"
> << std::endl;
>
> // insert before hint
>
> is.clear();
> std::cout << "start - insert before hint" << std::endl;
> time(&startTime);
>
> isIter = (is.insert(MaxVal)).first;
> for (int i = MaxVal-1; i >= 0; --i)
> isIter = is.insert(isIter,i);
>
> time(&endTime);
> std::cout << "finish: " << difftime(endTime,startTime) << " seconds"
> << std::endl;
>
> }[/color]

Many 'set' implementations are backed by a red-black tree.
The performance of these trees degrades quickly when inserting
pre-sorted data (as you are doing here) due to all of the
tree re-balancing that must occur.

Also, many implementations have special handling for the
'hint' boundary conditions ('end()' and 'begin()').
In your last example (where you inserted data in
high-to-low order) the iterator returned by the 'insert()'
call is actually 'begin()'. Since 'insert()' has special
handling for this 'hint' the performance of the NEXT 'insert()'
call is high.

If you pass an iterator that is neither 'begin()' nor 'end()',
when 'insert()' increments that iterator and it does not
point to a valid item, insert() starts its search with
'begin()' (i.e. your 'hint' iterator is ignored).

If you are inserting data in low-to-high order this
approach will speed things up greatly:

isIter = (is.insert(0)).first;
for (int i = 1; i < MaxVal; ++i)
is.insert(is.end(), i);

Regards,
Larry

--
Anti-spam address, change each 'X' to '.' to reply directly.
Mark P
Guest
 
Posts: n/a
#7: Jul 23 '05

re: set, insert with hint


Larry I Smith wrote:[color=blue]
> Many 'set' implementations are backed by a red-black tree.
> The performance of these trees degrades quickly when inserting
> pre-sorted data (as you are doing here) due to all of the
> tree re-balancing that must occur.[/color]

An excellent point which I overlooked (based on the frequent use of "Rb"
in the internal names of the STL souce, I strongly suspect it is a
red-black tree underlying the set)
[color=blue]
>
> Also, many implementations have special handling for the
> 'hint' boundary conditions ('end()' and 'begin()').
> In your last example (where you inserted data in
> high-to-low order) the iterator returned by the 'insert()'
> call is actually 'begin()'. Since 'insert()' has special
> handling for this 'hint' the performance of the NEXT 'insert()'
> call is high.
>
> If you pass an iterator that is neither 'begin()' nor 'end()',
> when 'insert()' increments that iterator and it does not
> point to a valid item, insert() starts its search with
> 'begin()' (i.e. your 'hint' iterator is ignored).
>
> If you are inserting data in low-to-high order this
> approach will speed things up greatly:
>
> isIter = (is.insert(0)).first;
> for (int i = 1; i < MaxVal; ++i)
> is.insert(is.end(), i);
>[/color]

Thanks for the good advice. I'll give this a try.

-Mark
Closed Thread


Similar C / C++ bytes