473,387 Members | 1,464 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

std::for_each

Sometimes I write code using std::for_each. Sometimes I use it on
several container classes that are the same size.

for_each(v.begin(), v.end(), f1());
for_each(u.begin(), u.end(), f2());

This seems inefficient so I rewrite it as:

int n = v.size();
for (int i = 0; i < n; ++i)
{
f1(v[i]);
f2(u[i]);
}

Is there some mechanism that lets me say the following:

for_each((v.begin(), u.begin()), (v.end(), u.end()), (f1(), f2()));

Thanks

--
sashan
http://www.cs.auckland.ac.nz/~sgov008/


Nov 16 '05 #1
12 2586
"sashan" <ma***@operamail.com> wrote in message
news:bl**********@lust.ihug.co.nz...
Sometimes I write code using std::for_each. Sometimes I use it on
several container classes that are the same size.

for_each(v.begin(), v.end(), f1());
for_each(u.begin(), u.end(), f2());

This seems inefficient so I rewrite it as:

int n = v.size();
for (int i = 0; i < n; ++i)
{
f1(v[i]);
f2(u[i]);
}


What makes you think this is any more efficient? If I had to guess, I would
bet that your second version runs *slower* due to the fact you are reducing
memory locality jumping back and forth between the different arrays and
functions. Also, by indexing into the array each time instead of using the
iterators which can simply be incremented, you are making it more difficult
for the compiler to optimize your sequential access (if u and v are less
trivial than a vector, like perhaps a deque, you are making it significantly
more difficult). Current compilers should be able to optimize away the
indexing, but you certainly aren't doing the compiler any favors.

Ken
Nov 16 '05 #2
"seems inefficient"...do you have proof that it actually *is* inefficient
for your particular case?

Not only have you turned two lines of code into 4+, but you've replaced the
well-known std::for_each idiom with a custom for() loop that everybody
looking at this code will have to take the time to inspect.

Dan

"sashan" <ma***@operamail.com> wrote in message
news:bl**********@lust.ihug.co.nz...
Sometimes I write code using std::for_each. Sometimes I use it on
several container classes that are the same size.

for_each(v.begin(), v.end(), f1());
for_each(u.begin(), u.end(), f2());

This seems inefficient so I rewrite it as:

int n = v.size();
for (int i = 0; i < n; ++i)
{
f1(v[i]);
f2(u[i]);
}

Is there some mechanism that lets me say the following:

for_each((v.begin(), u.begin()), (v.end(), u.end()), (f1(), f2()));

Thanks

--
sashan
http://www.cs.auckland.ac.nz/~sgov008/

Nov 16 '05 #3
Ken Alverson wrote:
"sashan" <ma***@operamail.com> wrote in message
news:bl**********@lust.ihug.co.nz...

Sometimes I write code using std::for_each. Sometimes I use it on
several container classes that are the same size.

for_each(v.begin(), v.end(), f1());
for_each(u.begin(), u.end(), f2());

This seems inefficient so I rewrite it as:

int n = v.size();
for (int i = 0; i < n; ++i)
{
f1(v[i]);
f2(u[i]);
}


What makes you think this is any more efficient?

Less comparisons and less increments, I guess.

I've attached a program that times the operations using the above
methods. In the program method 1 uses the for_each algorithm, method 2
is a for loop like above and method 3 is a for loop using iterators.
Method 3 is the sort of thing I was aiming for with my initial question.
The program tests method 1 and 2 and 3 on std::vector and method 1 and 3
on std::list (std::list doesn't support random access so method 2
doesn't work).

Please check it and tell me if anything's wrong (maybe I'm using
QueryPerformance counter wrong or something like that)

In summary, when in release build, method 1 is the slowest, method 2 and
3 are the fastest for std::vector. For std::list method 3 is the
fastest. Compiled using vc7.1.
#include <iostream>
//#include <iomanip.h>
#include <math.h>
#include <windows.h>
#include <vector>
#include <functional>
#include <algorithm>

using namespace std;

struct Sqr :
unary_function<int,int>
{
void operator()(argument_type& i){i *= i;}
};

int vector_timing()
{
cout << "vector timing" << endl;
LARGE_INTEGER Freq, Start, End;
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
BOOL Valid = QueryPerformanceFrequency(&Freq);
if (!Valid)
return -1;

//method 1
vector<int> u0(1000,3),v0(1000,3),w0(1000,3),x0(1000,3);
QueryPerformanceCounter(&Start);
for_each(u0.begin(), u0.end(), Sqr());
for_each(v0.begin(), v0.end(), Sqr());
for_each(w0.begin(), w0.end(), Sqr());
for_each(x0.begin(), x0.end(), Sqr());
QueryPerformanceCounter(&End);
std::cout << "Method 1 " << End.QuadPart - Start.QuadPart << endl;

//method 2
vector<int> u1(1000,3),v1(1000,3),w1(1000,3),x1(1000,3);
QueryPerformanceCounter(&Start);
vector<int>::size_type n = u1.size();
for (int i = 0; i < n; ++i)
{
Sqr()(u1[i]);
Sqr()(v1[i]);
Sqr()(w1[i]);
Sqr()(x1[i]);
}
QueryPerformanceCounter(&End);
std::cout << "Method 2 " << End.QuadPart - Start.QuadPart << endl;

//method 3
vector<int> u2(1000,3),v2(1000,3),w2(1000,3),x2(1000,3);
vector<int>::iterator itu2, itv2, itw2, itx2;
QueryPerformanceCounter(&Start);
for (itu2 = u2.begin(), itv2 = v2.begin(), itw2 = w2.begin(), itx2 = x2.begin();
itu2 != u2.end();
++itu2, ++itv2, ++itw2, ++itx2)
{
Sqr()(*itu2);
Sqr()(*itv2);
Sqr()(*itw2);
Sqr()(*itx2);
}
QueryPerformanceCounter(&End);
std::cout << "Method 3 " << End.QuadPart - Start.QuadPart << endl;
}

int list_timing()
{
LARGE_INTEGER Freq, Start, End;
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
BOOL Valid = QueryPerformanceFrequency(&Freq);
if (!Valid)
return -1;

cout << "list timing" << endl;
//method 1
vector<int> u0(1000,3),v0(1000,3),w0(1000,3),x0(1000,3);
QueryPerformanceCounter(&Start);
for_each(u0.begin(), u0.end(), Sqr());
for_each(v0.begin(), v0.end(), Sqr());
for_each(w0.begin(), w0.end(), Sqr());
for_each(x0.begin(), x0.end(), Sqr());
QueryPerformanceCounter(&End);
std::cout << "Method 1 " << End.QuadPart - Start.QuadPart << endl;

//method 3
vector<int> u2(1000,3),v2(1000,3),w2(1000,3),x2(1000,3);
vector<int>::iterator itu2, itv2, itw2, itx2;
QueryPerformanceCounter(&Start);
for (itu2 = u2.begin(), itv2 = v2.begin(), itw2 = w2.begin(), itx2 = x2.begin();
itu2 != u2.end();
++itu2, ++itv2, ++itw2, ++itx2)
{
Sqr()(*itu2);
Sqr()(*itv2);
Sqr()(*itw2);
Sqr()(*itx2);
}
QueryPerformanceCounter(&End);
std::cout << "Method 3 " << End.QuadPart - Start.QuadPart << endl;
}

int main()
{
vector_timing();
list_timing();
return 0;
}
Nov 16 '05 #4

"sashan" <ma***@operamail.com> skrev i meddelandet
news:bl*********@lust.ihug.co.nz...
Ken Alverson wrote:
"sashan" <ma***@operamail.com> wrote in message
news:bl**********@lust.ihug.co.nz...

Sometimes I write code using std::for_each. Sometimes I use it on
several container classes that are the same size.

for_each(v.begin(), v.end(), f1());
for_each(u.begin(), u.end(), f2());

This seems inefficient so I rewrite it as:

int n = v.size();
for (int i = 0; i < n; ++i)
{
f1(v[i]);
f2(u[i]);
}
What makes you think this is any more efficient?

Less comparisons and less increments, I guess.

I've attached a program that times the operations using the above
methods. In the program method 1 uses the for_each algorithm, method

2 is a for loop like above and method 3 is a for loop using iterators.
Method 3 is the sort of thing I was aiming for with my initial question. The program tests method 1 and 2 and 3 on std::vector and method 1 and 3 on std::list (std::list doesn't support random access so method 2
doesn't work).

Please check it and tell me if anything's wrong (maybe I'm using
QueryPerformance counter wrong or something like that)
What's wrong is that with only 1000 iterations nothing really matters,
because just about anything is fast enough.

In summary, when in release build, method 1 is the slowest, method 2 and 3 are the fastest for std::vector. For std::list method 3 is the
fastest. Compiled using vc7.1.


When I increase it to a million iterations, I get this result:
vector timing
Method 1 70041
Method 2 382012
Method 3 450109
list timing
Method 1 69126
Method 3 399203

Here for_each is *much* faster.

Bo Persson

Nov 16 '05 #5
There was a silly mistake in the previous post - the list_timing function used vector as it's container class. I've posted an ammended version that corrects the mistake. It also allows you to change the size of the container and the number of times the various methods are performed. I also change the function object to add one to the given argument instead of squaring.

Here are some results (these are all based on release builds):
For a container size of 1000000 repeating each method 1 time:
vector timing
Method 1 166452
Method 2 431866
Method 3 372398
list timing
Method 1 756946
Method 3 674451

For a container size of 1000000 repeating each method 100 time:
vector timing
Method 1 12679666
Method 2 36049916
Method 3 36056914
list timing
Method 1 81043026
Method 3 70404243

For a container size of 1000 repeating each method 100 time:
vector timing
Method 1 2284
Method 2 1095
Method 3 1308
list timing
Method 1 9081
Method 3 7118

For a container size of 1000 repeating each method 1 time:
vector timing
Method 1 25
Method 2 17
Method 3 18
list timing
Method 1 96
Method 3 62

For the case of the std::list Method 3 is fastest. In the case of std::vector method 1 (the for_each method) is faster when the container size is sufficiently large (in this case 1 000 000). How does one know when a container is sufficiently large? Dunno.


#include <iostream>
//#include <iomanip.h>
#include <math.h>
#include <windows.h>
#include <vector>
#include <functional>
#include <algorithm>
#include <list>

using namespace std;

struct AddOne :
unary_function<int,void>
{
void operator()(argument_type& i);
};

void AddOne::operator() (argument_type& i)
{
i += 1;
}

int vector_timing(long Size, long RepCount)
{
cout << "vector timing" << endl;
LARGE_INTEGER Freq, Start, End;
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
BOOL Valid = QueryPerformanceFrequency(&Freq);
if (!Valid)
return -1;

//method 1
vector<int> u0(Size,3),v0(Size,3),w0(Size,3),x0(Size,3);
QueryPerformanceCounter(&Start);
for (int i = 0; i < RepCount; ++i)
{
for_each(u0.begin(), u0.end(), AddOne());
for_each(v0.begin(), v0.end(), AddOne());
for_each(w0.begin(), w0.end(), AddOne());
for_each(x0.begin(), x0.end(), AddOne());
}
QueryPerformanceCounter(&End);
std::cout << "Method 1 " << End.QuadPart - Start.QuadPart << endl;

//method 2
vector<int> u1(Size,3),v1(Size,3),w1(Size,3),x1(Size,3);
QueryPerformanceCounter(&Start);
vector<int>::size_type n = u1.size();
for (int i = 0; i < RepCount; ++i)
{
for (int i = 0; i < n; ++i)
{
AddOne()(u1[i]);
AddOne()(v1[i]);
AddOne()(w1[i]);
AddOne()(x1[i]);
}
}
QueryPerformanceCounter(&End);
std::cout << "Method 2 " << End.QuadPart - Start.QuadPart << endl;

//method 3
vector<int> u2(Size,3),v2(Size,3),w2(Size,3),x2(Size,3);
vector<int>::iterator itu2, itv2, itw2, itx2;
QueryPerformanceCounter(&Start);
for (int i = 0; i < RepCount; ++i)
{
for (itu2 = u2.begin(), itv2 = v2.begin(), itw2 = w2.begin(), itx2 = x2.begin();
itu2 != u2.end();
++itu2, ++itv2, ++itw2, ++itx2)
{
AddOne()(*itu2);
AddOne()(*itv2);
AddOne()(*itw2);
AddOne()(*itx2);
}
}
QueryPerformanceCounter(&End);
std::cout << "Method 3 " << End.QuadPart - Start.QuadPart << endl;
return 0;
}

int list_timing(long Size, long RepCount)
{
LARGE_INTEGER Freq, Start, End;
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
ZeroMemory(&Freq, sizeof(LARGE_INTEGER));
BOOL Valid = QueryPerformanceFrequency(&Freq);
if (!Valid)
return -1;

cout << "list timing" << endl;
//method 1
list<int> u0(Size,3),v0(Size,3),w0(Size,3),x0(Size,3);
QueryPerformanceCounter(&Start);
for (int i = 0; i < RepCount; ++i)
{
for_each(u0.begin(), u0.end(), AddOne());
for_each(v0.begin(), v0.end(), AddOne());
for_each(w0.begin(), w0.end(), AddOne());
for_each(x0.begin(), x0.end(), AddOne());
}
QueryPerformanceCounter(&End);
std::cout << "Method 1 " << End.QuadPart - Start.QuadPart << endl;

//method 3
list<int> u2(Size,3),v2(Size,3),w2(Size,3),x2(Size,3);
list<int>::iterator itu2, itv2, itw2, itx2;
QueryPerformanceCounter(&Start);
for (int i = 0; i < RepCount; ++i)
{
for (itu2 = u2.begin(), itv2 = v2.begin(), itw2 = w2.begin(), itx2 = x2.begin();
itu2 != u2.end();
++itu2, ++itv2, ++itw2, ++itx2)
{
AddOne()(*itu2);
AddOne()(*itv2);
AddOne()(*itw2);
AddOne()(*itx2);
}
}
QueryPerformanceCounter(&End);
std::cout << "Method 3 " << End.QuadPart - Start.QuadPart << endl;
return 0;
}

int main()
{
vector_timing(1000,1);
list_timing(1000,1);
return 0;
}
Nov 16 '05 #6
>


In summary, when in release build, method 1 is the slowest, method 2

and

3 are the fastest for std::vector. For std::list method 3 is the
fastest. Compiled using vc7.1.


When I increase it to a million iterations, I get this result:
vector timing
Method 1 70041
Method 2 382012
Method 3 450109
list timing
Method 1 69126
Method 3 399203

Here for_each is *much* faster.

Sorry there was a mistake in the code I originally posted. The function
list_timing used std::vector as its container when it was meant to use
std::list. I've corrected the mistake and posted it as a reply to my
original code post.
Nov 16 '05 #7
Dan Smith wrote:
"seems inefficient"...do you have proof that it actually *is* inefficient
for your particular case?


It seemed inefficient because more comparisons and increments happen in
this case

for_each(v.begin(), v.end(), f1());
for_each(u.begin(), u.end(), f2());
Nov 16 '05 #8
"sashan" <ma***@operamail.com> wrote in message
news:bl**********@lust.ihug.co.nz...

For the case of the std::list Method 3 is fastest. In
the case of std::vector method 1 (the for_each method)
is faster when the container size is sufficiently large (in
this case 1 000 000). How does one know when a
container is sufficiently large? Dunno.


I suspect that the reason method 3 is faster than method 1 for list containers
is because you are only checking for the end of one container, not all four.
This is a valid optimization if you know that the lists are all the same
length. The difference is pretty minimal though, especially if the operation
is less trivial than adding one, so unless it was highly speed critical I
would still favor the for_each method.

I suspect the reason method 1 is slower with small containers is because it is
method 1. That is, method 1 is suffering all the cache misses, while the
other two methods benefit from the data being already cached from method 1's
operation. With larger data sets, the early part of the data set is pushed
out of the cache by the later data before the second method runs, so it is has
to deal with the same cache misses that the first one did.

I'll bet if you run "method 1" second, it will perform better with the small
containers.

Ken
Nov 16 '05 #9
>

I suspect that the reason method 3 is faster than method 1 for list containers
is because you are only checking for the end of one container, not all four.
This is a valid optimization if you know that the lists are all the same
length. The difference is pretty minimal though, especially if the operation
is less trivial than adding one, so unless it was highly speed critical I
would still favor the for_each method.

Would you use the following if it were available?
for_each((v.begin(), u.begin()), (v.end(), u.end()), (f1(), f2()));

I suspect the reason method 1 is slower with small containers is because it is
method 1. That is, method 1 is suffering all the cache misses, while the
other two methods benefit from the data being already cached from method 1's
operation. With larger data sets, the early part of the data set is pushed
out of the cache by the later data before the second method runs, so it is has
to deal with the same cache misses that the first one did.

I'll bet if you run "method 1" second, it will perform better with the small
containers.


That sounded like a reasonable sort of argument so I tried it and got
these results for 1000 vectors

vector timing
Method 2 16
Method 3 17
Method 1 28
list timing
Method 3 67
Method 1 96

Method 1 (the for_each method) is still slower.

--
sashan
http://www.cs.auckland.ac.nz/~sgov008/


Nov 16 '05 #10
"sashan" <ma***@operamail.com> wrote in message
news:bm**********@lust.ihug.co.nz...

The difference is pretty minimal though, especially if the operation
is less trivial than adding one, so unless it was highly speed critical I
would still favor the for_each method.


Would you use the following if it were available?
for_each((v.begin(), u.begin()), (v.end(), u.end()), (f1(), f2()));


I wouldn't be averse to it, though the syntax as you write it couldn't work
without changes to the language. Perhaps if there was a form of for_each that
took pair<Iterator,Iterator>'s.

You can sort of fake it out with:

mismatch(v.begin(),v.end(),u.begin(),function_runn er());

where function_runner does:

bool operator()(LHS_t lhs, RHS_t rhs) {
f1(lhs);
f2(rhs);
return true;
}

Of course, I'm not sure that's any clearer than writing it out or creating
your own for_each2 function.
I'll bet if you run "method 1" second, it will perform better with the smallcontainers.


That sounded like a reasonable sort of argument so I tried it and got
these results for 1000 vectors

vector timing
Method 2 16
Method 3 17
Method 1 28


Very odd. If it wasn't cache misses, I'd be very curious to know why the
smaller arrays perform so differently than the larger ones.

Ken
Nov 16 '05 #11
"sashan" <ma***@operamail.com> wrote in message
news:bm**********@lust.ihug.co.nz...

I suspect that the reason method 3 is faster than method 1 for list containersis because you are only checking for the end of one container, not all four.This is a valid optimization if you know that the lists are all the same
length. The difference is pretty minimal though, especially if the operationis less trivial than adding one, so unless it was highly speed critical I
would still favor the for_each method.


Would you use the following if it were available?
for_each((v.begin(), u.begin()), (v.end(), u.end()), (f1(), f2()));


BTW, if you want to suggest language/library features like this, a good place
to bring it up would be comp.std.c++.

Ken
Nov 16 '05 #12
I think might be possible to get pretty close to this already, say something along the lines of:

typedef std::vector<int> IntVector_t;
#include <iterator>
struct TwoIntVectorsAsOne
{
struct function_wrapper
{
function_wrapper(Sqr& uf) : m_uf(uf) {}
void operator()(const TwoIntVectorsAsOne& tao);
private:
Sqr& m_uf;
};

struct iterator : public std::iterator<std::forward_iterator_tag, TwoIntVectorsAsOne>
{
iterator() {}

TwoIntVectorsAsOne& operator*() const;
iterator& operator++();
friend bool operator==(const iterator& x, const iterator& y);
friend bool operator!=(const iterator& x, const iterator& y) {
return (!(x == y));
}
};

iterator begin();
iterator end();

TwoIntVectorsAsOne(const IntVector_t& iv1, const IntVector_t& iv2) : m_iv1(iv1), m_iv2(iv2) {}

private:
const IntVector_t& m_iv1;
const IntVector_t& m_iv2;
};

vector<int> u3(10,3),v3(10,3);
TwoIntVectorsAsOne two_as_one(u3, v3);
Sqr sqr;
TwoIntVectorsAsOne::function_wrapper fw(sqr);
for_each(two_as_one.begin(), two_as_one.end(), fw);

Obviously, this is FAR from complete, but it does illustrate one possible approach using the existing language.

Dan
"Ken Alverson" <Ke*@Alverson.net> wrote in message news:uP**************@TK2MSFTNGP10.phx.gbl...
"sashan" <ma***@operamail.com> wrote in message
news:bm**********@lust.ihug.co.nz...

I suspect that the reason method 3 is faster than method 1 for list containersis because you are only checking for the end of one container, not all four.This is a valid optimization if you know that the lists are all the same
length. The difference is pretty minimal though, especially if the operationis less trivial than adding one, so unless it was highly speed critical I
would still favor the for_each method.


Would you use the following if it were available?
for_each((v.begin(), u.begin()), (v.end(), u.end()), (f1(), f2()));


BTW, if you want to suggest language/library features like this, a good place
to bring it up would be comp.std.c++.

Ken

Nov 16 '05 #13

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

5
by: Dylan | last post by:
At the moment I often find myself doing something like the following std::list<MyClass*> mcl; /.../fill mcl with some elements //now call aMethod() for each element; for...
6
by: Philip Potter | last post by:
Hello there, I'm reading about the std::for_each() function in TC++PL, 3rd Ed. It seems like a good idea, but in practice I can never see a way to bend it to my wishes without writing huge...
2
by: psujkov | last post by:
Hi everybody, does anyone knows is there any way to have an implicit iteration counter in std::for_each ? e.g. std::for_each(c.begin(), c.end(), boost::bind(&C::set_id, _1,...
22
by: yurec | last post by:
Hi I wanna give template method of library class with some predicate, which I put to std::for_each in that method.How to break the std::for_each with predicate?(exceptions are used olny in...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.