422,949 Members | 1,034 Online
Bytes IT Community
+ Ask a Question
Need help? Post your question and get tips & solutions from a community of 422,949 IT Pros & Developers. It's quick & easy.

questions about the STL sort()

P: n/a
STL sort() as an implementation of introsort, below is the code
snippet from the gnu stl.

template<typename _RandomAccessIterator, typename _Size>
void
__introsort_loop(_RandomAccessIterator __first,
_RandomAccessIterator __last,
_Size __depth_limit)
{
typedef typename iterator_traits<_RandomAccessIterator>::value_type
_ValueType;

while (__last - __first _S_threshold)
{
if (__depth_limit == 0)
{
std::partial_sort(__first, __last, __last);
return;
}
--__depth_limit;
_RandomAccessIterator __cut =
std::__unguarded_partition(__first, __last,
_ValueType(std::__median(*__first,
*(__first
+ (__last
- __first)
/ 2),
*(__last
- 1))));
std::__introsort_loop(__cut, __last, __depth_limit);
__last = __cut;
}
}

my question is : why not use two recursive calls to __introsort_loop,
it will have a clear view to present a divide-and-conquer algorithm.
Yes, it is a optimization,but tail recursion elimination is now a
common technique of today's cpp compiler.
another question is about the __final_insertion_sort() placed after
the call to __introsort_loop()
below is the code snippet.

template<typename _RandomAccessIterator>
void
__final_insertion_sort(_RandomAccessIterator __first,
_RandomAccessIterator __last)
{
if (__last - __first _S_threshold)
{
std::__insertion_sort(__first, __first + _S_threshold);
std::__unguarded_insertion_sort(__first + _S_threshold, __last);
}
else
std::__insertion_sort(__first, __last);
}

why not just call one __insertion_sort in the if-condition?
Jul 8 '08 #1
Share this Question
Share on Google+
5 Replies


P: n/a
Hello,

feverzsj wrote:
STL sort() as an implementation of introsort, below is the code
snippet from the gnu stl.
my question is : why not use two recursive calls to __introsort_loop,
it will have a clear view to present a divide-and-conquer algorithm.
Yes, it is a optimization,but tail recursion elimination is now a
common technique of today's cpp compiler.
I'm not sure tail recursion does apply here. Since the code is there,
you could try out what the compilers do. I'm pretty sure that saving
one recursive call saves enough, e.g. initializing the iterators for
every recursive instance, to justify doing it that way. Automatically
transforming non-tail recursion away, even partially, is a pretty
complicated operation with too many problematic cases. I would not
expect any compiler sacrificing lots of time in fruitless analysis in
that area.
>

another question is about the __final_insertion_sort() placed after
the call to __introsort_loop()
below is the code snippet.

template<typename _RandomAccessIterator>
void
__final_insertion_sort(_RandomAccessIterator __first,
_RandomAccessIterator __last)
{
if (__last - __first _S_threshold)
{
std::__insertion_sort(__first, __first + _S_threshold);
std::__unguarded_insertion_sort(__first + _S_threshold, __last);
}
else
std::__insertion_sort(__first, __last);
}

why not just call one __insertion_sort in the if-condition?
__unguarded_insertion_sort saves bounds-checking, therefore it needs
some elements less or equal in the area before first + _S_threshold. It
will always read at least at position __first + _S_threshold - 1, and
it will sort arbitrarily far before __first + _S_threshold. The big
question is, why there must be an element less then all in range
[__first + _S_threshold, __last) in the range [__first, __first +
_S_threshold). An that can only be found when looking at the places
calling this function.

I think this splitting should make an easily measurable difference when
sorting containers of small elements like integer.

Bernd Strieder

Jul 8 '08 #2

P: n/a
feverzsj wrote:
my question is : why not use two recursive calls to __introsort_loop,
it will have a clear view to present a divide-and-conquer algorithm.
Yes, it is a optimization,but tail recursion elimination is now a
common technique of today's cpp compiler.
What would be the advantage of not removing the second recursive call?
The only effect is that it will start relying on the tail recursion
optimization to exist and succeed, for no good reason.

The code itself is not even intended to be looked at.
Jul 8 '08 #3

P: n/a
On Jul 8, 10:32 am, feverzsj <fever...@hotmail.comwrote:
STL sort() as an implementation of introsort, below is the
code snippet from the gnu stl.
template<typename _RandomAccessIterator, typename _Size>
void
__introsort_loop(_RandomAccessIterator __first,
_RandomAccessIterator __last,
_Size __depth_limit)
{
typedef typename iterator_traits<_RandomAccessIterator>::value_type
_ValueType;
while (__last - __first _S_threshold)
{
if (__depth_limit == 0)
{
std::partial_sort(__first, __last, __last);
return;
}
--__depth_limit;
_RandomAccessIterator __cut =
std::__unguarded_partition(__first, __last,
_ValueType(std::__median(*__first,
*(__first
+ (__last
- __first)
/ 2),
*(__last
- 1))));
std::__introsort_loop(__cut, __last, __depth_limit);
__last = __cut;
}
}
my question is : why not use two recursive calls to __introsort_loop,
Probably because the code was originally a quick sort (without
the __depth_limit), and they didn't want it to core dump due to
stack overflow.
it will have a clear view to present a divide-and-conquer
algorithm. Yes, it is a optimization, but tail recursion
elimination is now a common technique of today's cpp compiler.
First, I'm not sure that it's that common. Compilers optimize
for real code, and tail recursion isn't that frequent in C++.
And second, I'm not sure that the compiler can do the
optimization; if _RandomAccessIterator has a non-trivial
destructor, it certainly can't.
another question is about the __final_insertion_sort() placed after
the call to __introsort_loop()
below is the code snippet.
template<typename _RandomAccessIterator>
void
__final_insertion_sort(_RandomAccessIterator __first,
_RandomAccessIterator __last)
{
if (__last - __first _S_threshold)
{
std::__insertion_sort(__first, __first + _S_threshold);
std::__unguarded_insertion_sort(__first + _S_threshold, __last);
}
else
std::__insertion_sort(__first, __last);
}
why not just call one __insertion_sort in the if-condition?
Probably optimization. Without seeing where this function is
called, and what the differences are between the two insertion
sorts, it's hard to say, but just guessing... this function is
called when the quick sort has partitionned the total range into
blocks of _S_threshold size, and all of the values in one block
are greater than the values in all previous blocks, and less
then the values in all following blocks. The standard insertion
sort will have to check for the bottom of the array in the inner
loop, something like (i and j are iterators):

for ( i = first + 1 ; i != last ; ++ i ) {
T tmp = *i ;
j = i ;
while ( j != first && *(j-1) tmp ) {
*j = *(j-1) ;
-- j ;
}
*j = tmp ;
}

If you're sorting anything but the first block after the
partitionning, you don't need the j != first test in the inner
loop.

--
James Kanze (GABI Software) email:ja*********@gmail.com
Conseils en informatique orientée objet/
Beratung in objektorientierter Datenverarbeitung
9 place Sémard, 78210 St.-Cyr-l'École, France, +33 (0)1 30 23 00 34
Jul 9 '08 #4

P: n/a
Hello,
Bernd Strieder wrote:
I'm not sure tail recursion does apply here. Since the code is there,
you could try out what the compilers do. I'm pretty sure that saving
one recursive call saves enough, e.g. initializing the iterators for
every recursive instance, to justify doing it that way. Automatically
transforming non-tail recursion away, even partially, is a pretty
complicated operation with too many problematic cases. I would not
expect any compiler sacrificing lots of time in fruitless analysis in
that area.
i see what u mean. STL is designed for both good portability and high
performance.
We can study a lot from it by comparing and rewriting.
If the __introsort_loop() was written like this:

__introsort_loop(/*...*/)
{
//...

if(__last - __first _S_threshold)
{
//....
}
std::__introsort_loop(__cut, __last, __depth_limit);
std::__introsort_loop(__first, __cut, __depth_limit);
}
Then, a tail recursion elimination could just simply replace the
params by the new one ,and goto the beginning of procedure.
The compiler may also identify only __last has changed , and just
replace the _last by __cut.
Finally ,we get the same effect as the GNU STL __introsort_loop does.

__introsort_loop(/*...*/)
{
//...

recu: if(__last - __first _S_threshold)
{
//....
}

std::__introsort_loop(__cut, __last, __depth_limit);
// tail recursion elimination could be just simply like this
__first=__first;
__last=__cut;
__depth_limit=__depth_limit;
goto recu;
}

__unguarded_insertion_sort saves bounds-checking, therefore it needs
some elements less or equal in the area before first + _S_threshold. It
will always read at least at position __first + _S_threshold - 1, and
it will sort arbitrarily far before __first + _S_threshold. The big
question is, why there must be an element less then all in range
[__first + _S_threshold, __last) in the range [__first, __first +
_S_threshold). An that can only be found when looking at the places
calling this function.

I think this splitting should make an easily measurable difference when
sorting containers of small elements like integer.
yeah ,that's a speed-up trick, but I have the same question as you
mentioned.
The place calling this function makes me doubt ,how could there must
be an element less then all in range [__first + _S_threshold, __last)
in the range [__first, __first + _S_threshold]

sort(_RandomAccessIterator __first, _RandomAccessIterator __last)
{

if (__first != __last)
{
std::__introsort_loop(__first, __last, __lg(__last - __first) * 2);
std::__final_insertion_sort(__first, __last); // the only palce it
calls
}
}

or the __introsort_loop() has assured such thing?

Furthermore ,the __unguarded_partition() uses a simple 2-way
partitioning with no bound check, becuz the median-of-three method has
assured so. I think a 3-way partitioning could be a better partition
routine for common use with a slight overhead
Jul 9 '08 #5

P: n/a
hello:

James Kanze wrote:
Probably because the code was originally a quick sort (without
the __depth_limit), and they didn't want it to core dump due to
stack overflow.
First, I'm not sure that it's that common. *Compilers optimize
for real code, and tail recursion isn't that frequent in C++.
And second, I'm not sure that the compiler can do the
optimization; if _RandomAccessIterator has a non-trivial
destructor, it certainly can't.
thx,you remind me of cpp's RAII feature, and i'm still on the road
from "c+" to "c++", a long way to pass through.

Probably optimization. *Without seeing where this function is
called, and what the differences are between the two insertion
sorts, it's hard to say, but just guessing... this function is
called when the quick sort has partitionned the total range into
blocks of _S_threshold size, and all of the values in one block
are greater than the values in all previous blocks, and less
then the values in all following blocks. *The standard insertion
sort will have to check for the bottom of the array in the inner
loop, something like (i and j are iterators):

* * for ( i = first + 1 ; i != last ; ++ i ) {
* * * * T tmp = *i ;
* * * * j = i ;
* * * * while ( j != first && *(j-1) tmp ) {
* * * * * * *j = *(j-1) ;
* * * * * * -- j ;
* * * * }
* * * * *j = tmp ;
* * }

If you're sorting anything but the first block after the
partitionning, you don't need the j != first test in the inner
loop.
below r the routines of __insertion_sort() and the
__unguarded_insertion_sort().
looks like __insertion_sort() is just a common implementation of
insertion sort,
and __unguarded_insertion_sort() is some kind insertion sort without
bound check
they'll be called right after the __introsort_loop(/*...*/).
__insertion_sort(_RandomAccessIterator __first,
_RandomAccessIterator __last)
{
if (__first == __last)
return;

for (_RandomAccessIterator __i = __first + 1; __i != __last; ++__i)
{
typename iterator_traits<_RandomAccessIterator>::value_type
__val = *__i;
if (__val < *__first)
{
std::copy_backward(__first, __i, __i + 1);
*__first = __val;
}
else
std::__unguarded_linear_insert(__i, __val);
}
}

__unguarded_insertion_sort(_RandomAccessIterator __first,
_RandomAccessIterator __last)
{
typedef typename iterator_traits<_RandomAccessIterator>::value_type
_ValueType;

for (_RandomAccessIterator __i = __first; __i != __last; ++__i)
std::__unguarded_linear_insert(__i, _ValueType(*__i));
}

as you mentioned:
>...the total range into blocks of _S_threshold size, and all of the valuesin one block are greater than the values in all previous blocks, and less then the values in all following blocks
it could be possibile, but i think the __unguarded_partition() may
fail to partition the array with reduplicate keys.

__unguarded_partition(_RandomAccessIterator __first,
_RandomAccessIterator __last, _Tp __pivot)
{
while (true)
{
while (*__first < __pivot)
++__first;
--__last;
while (__pivot < *__last)
--__last;
if (!(__first < __last))
return __first;
std::iter_swap(__first, __last);
++__first;
}
}
Jul 10 '08 #6

This discussion thread is closed

Replies have been disabled for this discussion.