473,748 Members | 3,585 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

overhead of using std::sort

I want to sort a vector with the std::sort function. There are two
functions: one with two arguments, the other with three arguments. I
am using the one with three arguments. I noticed that there is a huge
overhead of using this function, which comes from copying the function
object, i.e., the object that compares the elements of the vector,
which is passed to the function not by reference but by value.

Now, at the end of this mail there is a source code. It's not the
actual code of the problem I have, but only a distilled version of it.
Here the SO class has no fields, so copying an object of this class
is not a big deal. However, in my actual problem the function object
has a very large vector inside, and this function object should not be
copied even once. If you run the program, you'll see the function
object is copied seven times for sorting a vector with three
elements!

So, the problem is that when I use the std::sort function the function
object is copied many times! The larger the sequence to sort, the
greater the number of copies of the function object. It's very bad
for me because I sort small number of elements (up to four), but I do
the sorting many times (millions).

I want to use the sort function in such a way that the function object
is not coppied. I tried to call the sort function this way:

sort<vector<ire k *>::iterator, SO &>(v.begin() , v.end(), so);

but it reduced the number of copies by one only.

I am convinced that I am doing something wrong. I would appreciate
your advice on how to solve my problem.
Thanks & best,
Irek
*************** *************** **********
Ireneusz SZCZESNIAK - research assistant
http://www.iitis.gliwice.pl/~iszczesniak
*************** *************** **********

*************** *************** *************** *************** *********

#include <algorithm>
#include <vector>
#include <iostream>
#include <functional>

using namespace std;

class irek
{
int i;

public:
irek(int _i) {i = _i;}
int get_i() const {return i;}
};

struct SO : binary_function <irek, irek, bool>
{
public:
SO() {}
SO(const SO&)
{
cout << "coppied!" << endl;
}

result_type operator()(cons t first_argument_ type *a,
const second_argument _type *b)
{
return a->get_i() < b->get_i();
}
};
main()
{
// this is our function object
SO so;

irek a(2), b(3), c(1);

vector<irek *> v;
v.push_back(&b) ;
v.push_back(&c) ;
v.push_back(&a) ;

sort(v.begin(), v.end(), so);

for(int i = 0; i < v.size(); i++)
cout << v[i]->get_i() << endl;
}
Jul 22 '05 #1
7 3544
Ireneusz SZCZESNIAK, UNEXPECTED_DATA _AFTER_ADDRESS@ .SYNTAX-ERROR. wrote:

Now, at the end of this mail there is a source code. It's not the
actual code of the problem I have, but only a distilled version of it.
Here the SO class has no fields, so copying an object of this class
is not a big deal. However, in my actual problem the function object
has a very large vector inside,
What for?
The purpose of the functor is to compare 2 objects. Nothing more,
nothing less.
and this function object should not be
copied even once. If you run the program, you'll see the function
object is copied seven times for sorting a vector with three
elements!


If you really need that vector inside the functor, well, you can
always have the vector outside the class and just give a pointer
to it to the functor object. This way copying the functor object
stays cheap.

--
Karl Heinz Buchegger
kb******@gascad .at
Jul 22 '05 #2
On Thu, 29 Jul 2004 13:42:24 +0200, Ireneusz SZCZESNIAK
<no****@hotmail .com>, UNEXPECTED_DATA _AFTER_ADDRESS@ .SYNTAX-ERROR.
wrote:
I want to sort a vector with the std::sort function. There are two
functions: one with two arguments, the other with three arguments. I
am using the one with three arguments. I noticed that there is a huge
overhead of using this function, which comes from copying the function
object, i.e., the object that compares the elements of the vector,
which is passed to the function not by reference but by value.
"huge overhead"!? Most function objects are stateless, and the
overhead of copying them is precisely zero.

Obviously taking the function object by reference is not a good
option:

sort(v.begin(), v.end(), functor()); //error!

since you can't bind a temporary to a non-const refence. Equally,
passing by const reference requires your operator() to be a const
member function, which might be annoying (requiring mutable for any
members you want to change).

So passing by value is the best option.

However, I don't see any reason why the internals of std::sort need to
copy the functor - they could certainly pass it around by non-const
reference; that's just a quality of implementation issue though - the
standard allows the function object to be copied an arbitrary number
of times by any algorithm.
Now, at the end of this mail there is a source code. It's not the
actual code of the problem I have, but only a distilled version of it.
Here the SO class has no fields, so copying an object of this class
is not a big deal. However, in my actual problem the function object
has a very large vector inside, and this function object should not be
copied even once. If you run the program, you'll see the function
object is copied seven times for sorting a vector with three
elements!
The solution is to use reference semantics. For example, your function
object should contain a reference to the vector, not the vector
itself. Then the compiler generated copy constructor will be efficient
and fast, and with inlining, the overhead may well be zero.
So, the problem is that when I use the std::sort function the function
object is copied many times! The larger the sequence to sort, the
greater the number of copies of the function object. It's very bad
for me because I sort small number of elements (up to four), but I do
the sorting many times (millions).
If you are only sorting up to 4 elements, you could come up with a
faster algorithm than std::sort in any case. An explicit if ladder
with no loop will be fast - in effect unrolling the entire sorting
operation.
I want to use the sort function in such a way that the function object
is not coppied. I tried to call the sort function this way:

sort<vector<ir ek *>::iterator, SO &>(v.begin() , v.end(), so);

but it reduced the number of copies by one only.
Yes, that is not a solution.
I am convinced that I am doing something wrong. I would appreciate
your advice on how to solve my problem.


If performance is important, make sure your function objects are cheap
to copy.

Tom
Jul 22 '05 #3
On Thu, 29 Jul 2004, John Harrison wrote:
Recode your function object so that it holds a pointer to the vector
data instead of holding the vector itself.
This is the way I have it now implemented.
This will make it cheap to copy.
But it still costs to copy the function object.
You could use an ordinary pointer for this, but a better choice
might be the shared array class from boost. See
http://www.boost.org/libs/smart_ptr/smart_ptr.htm


Thanks, I will have a look.
Irek

*************** *************** **********
http://www.iitis.gliwice.pl/~iszczesniak
*************** *************** **********
Jul 22 '05 #4

"Ireneusz SZCZESNIAK" <no****@hotmail .com> wrote in message
news:Pi******** *************** ********@irek.i itis.gliwice.pl ...
On Thu, 29 Jul 2004, John Harrison wrote:
Recode your function object so that it holds a pointer to the vector
data instead of holding the vector itself.


This is the way I have it now implemented.
This will make it cheap to copy.


But it still costs to copy the function object.


IIRC the standard allows the std::sort to make copies of the supplied
function object. Function objects must be effectively stateless, so that if
the sort algorithm uses one copy half the time and another copy the other
half of the time it should not matter.

I would imagine that your function object is being copied to other helper
algorithms internal to std::sort. You are just going to have to live with
this behaviour.

john
Jul 22 '05 #5
> "huge overhead"!? Most function objects are stateless, and the
overhead of copying them is precisely zero.
My functor is not stateless, because it needs some extra data. It
used to have a vector inside, but now it has just a pointer to this
vector. Coping my functor requires at least coping a pointer.

I believe it's huge. For sorting 3 elements the functor was copied 7
times, for 1000 elements - 1328 times, and for 1000000 elements -
1312948 times. These numbers slightly change with different element
arrangements in the vector.
However, I don't see any reason why the internals of std::sort need
to copy the functor - they could certainly pass it around by
non-const reference; (...)
When you implement templated std::sort this way, then you can pass
both a functor and a pointer to a function. If std::sort took a
functor by reference, then you would not be able to pass a pointer to
a function. All the std::sort function want to do with this that you
pass is apply to it the "()" operator.
If you are only sorting up to 4 elements, you could come up with a
faster algorithm than std::sort in any case. An explicit if ladder
with no loop will be fast - in effect unrolling the entire sorting
operation.


Now it's up to four elements, but soon I will sort 16, 64, 128 and
more elements. I need to stick to std::sort.

I didn't realize that the C++ standard imposes no restrictions on
copying functors. This is why implementations are free to make those
copies of my functor.
Thanks,
Irek
Jul 22 '05 #6
On Thu, 29 Jul 2004, John Harrison wrote:
Function objects must be effectively stateless, so that if the sort
algorithm uses one copy half the time and another copy the other
half of the time it should not matter.
When I create my functor I give it some state, and this state will not
change later. But different functors have different states. Well, a
state for me is just the value of a pointer field in my functor. When
I create a functor I pass a some pointer to its constructor.
I would imagine that your function object is being copied to other
helper algorithms internal to std::sort.
Yes, this is what happens.
You are just going to have to live with this behaviour.


I guess so.
Thanks,
Irek

*************** *************** **********
http://www.iitis.gliwice.pl/~iszczesniak
*************** *************** **********
Jul 22 '05 #7
On Thu, 29 Jul 2004 15:54:39 +0200, Ireneusz SZCZESNIAK
<no****@hotmail .com> wrote:
"huge overhead"!? Most function objects are stateless, and the
overhead of copying them is precisely zero.


My functor is not stateless, because it needs some extra data. It
used to have a vector inside, but now it has just a pointer to this
vector. Coping my functor requires at least coping a pointer.

I believe it's huge. For sorting 3 elements the functor was copied 7
times, for 1000 elements - 1328 times, and for 1000000 elements -
1312948 times. These numbers slightly change with different element
arrangements in the vector.


That is a surprisingly large number of copies (O(n) by the look of
things). Clearly, if copying the functor is significantly more
expensive than, say, calling the comparator, it will become the
performance limiting factor until n is really huge.
However, I don't see any reason why the internals of std::sort need
to copy the functor - they could certainly pass it around by
non-const reference; (...)


When you implement templated std::sort this way, then you can pass
both a functor and a pointer to a function. If std::sort took a
functor by reference, then you would not be able to pass a pointer to
a function.


You would, just not an rvalue one. If you pass just the function name,
it would work fine, since you can form a reference to a function.

But I wasn't suggesting that sort took its parameter by reference (I
posted details of why not), but that the internals might. e.g.

template...
void sort_aux(It begin, It end, Comp& comp);

template...
void sort(It begin, It end, Comp comp)
{
sort_aux(begin, end, comp);
}

That would work fine for function pointers, since inside sort, comp is
an lvalue. However, it would break if you explicitly made Comp a
reference type (due to trying to form a reference to reference).
Obviously an implementation could get around this, using
boost::add_refe rence or similar.
If you are only sorting up to 4 elements, you could come up with a
faster algorithm than std::sort in any case. An explicit if ladder
with no loop will be fast - in effect unrolling the entire sorting
operation.


Now it's up to four elements, but soon I will sort 16, 64, 128 and
more elements. I need to stick to std::sort.

I didn't realize that the C++ standard imposes no restrictions on
copying functors. This is why implementations are free to make those
copies of my functor.


Personally I'm quite surprised at the huge number of copies - I
imagined there would be O(log n) copies at most (one for each divide
and conquer, if you see what I mean), but I'm sure if I think about it
the reason that it's O(n) copies may become clear. But, as you say
(and I said), it is conforming in any case.

Tom
Jul 22 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

8
3502
by: lok | last post by:
i have a class: template <class T1, class T2> class CPairMapping { public: typedef std::pair<T1, T2> ValuePair_t; typedef std::vector<ValuePair_t> ValueList_t; typedef std::binary_function< ValuePair_t, ValuePair_t, bool> ValuePair_IsLess; void SortAscend(const ValuePair_IsLess& isLess_) {
40
4317
by: Elijah Bailey | last post by:
I want to sort a set of records using STL's sort() function, but dont see an easy way to do it. I have a char *data; which has size mn bytes where m is size of the record and n is the number of records. Both these numbers are known
6
2339
by: alexhong2001 | last post by:
Does "std::sort" work only with sequence containers, not associative containers at all? Among sequential containers, can it be used with "list", "queue" and other sequence containers besides "vector"? Are "istringstream" and "ostringstream" covered in the book, "STL Tutorial and Reference" (second edition) by D. Musser, et al.? Seems like I could not find any related topic in the book.
9
13105
by: JasBascom | last post by:
say i had 97456 and 23456 is there already a sort function to check which one begins with the smaller number rearrange it. in this case the bottom number should clearly be rearranged to the top as it lower in numerical order.
4
3349
by: hall | last post by:
I accidently overloaded a static member function that I use as predicate in the std::sort() for a vector and ended up with a compiler error. Is this kind of overload not allowed for predicates and if so, why not? Shouldn the compiler be able to tell which of he overloaded functions to use? The second A::comp() is the one I accidently added and gives the error message (in Borland C++Builder 6) Unit1.cpp E2285 Could not find a match for
8
4222
by: Manfred | last post by:
Hello I am new to template programming, so i tried the 'example' from http://www.sgi.com/tech/stl/functors.html. I can compile the code but when i want to run the program I get a segmentation fault when the part of std::sort(...) is reached, but I don't understand why. ------------ Code start ---------------------
5
3905
by: fade | last post by:
Good afternoon, I need some advice on the following: I've got a class that has a member std::vector<CStringm_vFileName and a member CString m_path; The vector contains a bunch of filenames with no path included (no C:\...) eg: my_file2.jpg, my_file1.bmp, etc... and m_path stores the path, eg: C:\folder1 I want to sort this vector according to different criterion, such as
1
5169
by: mscava | last post by:
How can I make this construction valid? It still gives me error about no matching function std::sort(...). I made a little search and wrong thing is probably the predicate... template <typename TT Polygon<T:: Height() const { class BinPred : public std::binary_function< Vector2D<T>, Vector2D<T>, bool > {
1
1806
by: panzhiyong | last post by:
Hi there! I wonder why there is no such a overload version of sort function in STL: template<class RandomAccessIterator, class Pr, class IterSwap> void sort( RandomAccessIterator _First, RandomAccessIterator _Last, BinaryPredicate _Comp, IterSwap _Swap );
0
9363
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven tapestry of website design and digital marketing. It's not merely about having a website; it's about crafting an immersive digital experience that captivates audiences and drives business growth. The Art of Business Website Design Your website is...
0
9238
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each protocol has its own unique characteristics and advantages, but as a user who is planning to build a smart home system, I am a bit confused by the choice of these technologies. I'm particularly interested in Zigbee because I've heard it does some...
0
8237
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then launch it, all on its own.... Now, this would greatly impact the work of software developers. The idea...
1
6793
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new presenter, Adolph Dupré who will be discussing some powerful techniques for using class modules. He will explain when you may want to use classes instead of User Defined Types (UDT). For example, to manage the data in unbound forms. Adolph will...
0
6073
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert into image. Globals.ThisAddIn.Application.ActiveDocument.Select();...
0
4593
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in the same network. But I'm wondering if it's possible to do the same thing, with 2 Pfsense firewalls...
0
4864
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
2
2775
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
3
2206
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating effective websites that not only look great but also perform exceptionally well. In this comprehensive...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.