Bytes IT Community

What should I use?

Hi all,

Imagine I have an array of int:

int[] anArray = new int[..];

I want to extract all the integers that are greater than 500.

I can do:

    var result = new List<int>();
    foreach (int i in anArray)
        if (i > 500)
            result.Add(i);

or

    var resultFind = Array.FindAll(anArray, elt => elt > 500);

or

    var resultWhere = anArray.Where(elt => elt > 500);

or

    var resultLinq = from i in anArray where i > 500 select i;

What is the best way?

Thanks for your answer
Jul 9 '08 #1
10 Replies


ti*********@gmail.com wrote:
> Imagine I have an array of int:
>
> int[] anArray = new int[..];
>
> I want to extract all the integers that are greater than 500.
>
> I can do:
>
>     var result = new List<int>();
>     foreach (int i in anArray)
>         if (i > 500)
>             result.Add(i);
>
> or
>
>     var resultFind = Array.FindAll(anArray, elt => elt > 500);
>
> or
>
>     var resultWhere = anArray.Where(elt => elt > 500);
>
> or
>
>     var resultLinq = from i in anArray where i > 500 select i;
>
> What is the best way?
Methods #1 and #2 create new collections, #3 and #4 do not (in fact they do
not create anything; they will yield a new sequence only when evaluated).
Methods #3 and #4 use LINQ (and thus require .NET 3.5), methods #1 and #2 do
not. Readability is subjective, but #1 is obviously less direct than the
other three methods.

The answer is therefore predictably "it depends". In general, I'd pick
whichever one was most convenient in the context (do I need a List, an
Array, an IEnumerable?) and switch if profiling results or other
circumstances dictated otherwise. In isolation, it's too small a problem to
pick the "best way".

--
J.
Jul 9 '08 #2

My suggestion would be to write a small Windows Forms app that makes a LARGE
array of numbers like the one you have to search, then add 4 buttons and,
behind each one, put the code to do one of the methods. Time the execution of
the search using a Stopwatch and see how long each takes.
Stopwatch is a high-accuracy timer for doing things like this.
Try to make the array large so the differences between the techniques are
exposed.

Post the results and let us know.
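
A rough console harness along those lines might look like this (a console app rather than Windows Forms; the array size, seed, and labels are arbitrary, and timings will vary by machine and build settings):

    using System;
    using System.Collections.Generic;
    using System.Diagnostics;
    using System.Linq;

    class FilterTimings
    {
        static void Main()
        {
            // Arbitrary test data: large enough for differences to show up.
            var rng = new Random(12345);
            int[] anArray = Enumerable.Range(0, 5000000)
                                      .Select(i => rng.Next(0, 1000))
                                      .ToArray();

            Time("Manual loop", () =>
            {
                var result = new List<int>();
                foreach (int i in anArray)
                {
                    if (i > 500)
                    {
                        result.Add(i);
                    }
                }
            });

            Time("Array.FindAll", () => Array.FindAll(anArray, elt => elt > 500));

            // ToList() forces the lazy query to run, so the timing is comparable.
            Time("Where + ToList", () => anArray.Where(elt => elt > 500).ToList());
        }

        static void Time(string label, Action action)
        {
            Stopwatch sw = Stopwatch.StartNew();
            action();
            sw.Stop();
            Console.WriteLine("{0}: {1} ms", label, sw.ElapsedMilliseconds);
        }
    }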

--
Ciaran O'Donnell
http://wannabedeveloper.spaces.live.com
"ti*********@gmail.com" wrote:
<snip>
Jul 9 '08 #3

Ciaran O'Donnell <Ci************@discussions.microsoft.com> wrote:
> My suggestion would be to write a small Windows Forms app that makes a LARGE
> array of numbers like the one you have to search, then add 4 buttons and,
> behind each one, put the code to do one of the methods. Time the execution of
> the search using a Stopwatch and see how long each takes.
> Stopwatch is a high-accuracy timer for doing things like this.
> Try to make the array large so the differences between the techniques are
> exposed.
>
> Post the results and let us know.
Note that that suggests that "best" == "fastest". Personally I'd pick
the means which is the most readable. For me, that's:

anArray.Where(elt => elt > 500);

I'd only start worrying about performance when I had some evidence that
it was really significant.

--
Jon Skeet - <sk***@pobox.com>
Web site: http://www.pobox.com/~skeet
Blog: http://www.msmvps.com/jon_skeet
C# in Depth: http://csharpindepth.com
Jul 9 '08 #4

On Jul 9, 7:53 am, timor.su...@gmail.com wrote:
<snip>
What do you understand by "BEST": fastest, less memory use, more
readable, more portable?

Depending on that, you should select one method or another.

If you have a small list, no method will be too different, I think.

The first could be used in 2.0; if you use LINQ, you are forced to use
3.5.

Also, if the code will be seen by somebody else, then the first option
is trivial: even if you do not know C#, you could understand what it
does. The others are different, though; each needs you to know a little
bit about C#, from 2.0 features to LINQ.
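
For reference, a sketch of what the FindAll approach looks like on .NET 2.0, using the C# 2.0 anonymous-method syntax instead of a lambda (class name and data invented):

    using System;

    class FindAllOnNet20
    {
        static void Main()
        {
            int[] anArray = new int[] { 100, 600, 700, 200 };

            // Same predicate as the lambda versions, but written as a
            // C# 2.0 anonymous method: no LINQ, no lambdas, no var.
            int[] resultFind = Array.FindAll(anArray, delegate(int elt)
            {
                return elt > 500;
            });

            Console.WriteLine(resultFind.Length); // prints 2
        }
    }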

Jul 9 '08 #5

I understand your point, but I like to know which options perform best in
which situations, as I think it is better to code with performance in mind. If
you don't, you risk getting to the end of a development and realising you have
to go back through it and find optimisations to overcome a performance issue.
Some people call it premature optimizing and say it wastes time. If you
already know certain things are faster or more memory efficient then it's good
to use them. It stops you having to come back to it later and it also helps
people reading your code pick up good ways to do things.
But each to their own, and I fully understand in this case, for a small array,
it won't make a difference at all. But if I or someone reading my code got to
a similar situation in the future with a more complex problem, having done
the research at this point might forearm us with the knowledge of which one
performs better.

--
Ciaran O'Donnell
http://wannabedeveloper.spaces.live.com
"Jon Skeet [C# MVP]" wrote:
<snip>
Jul 9 '08 #6

On Jul 9, 11:58 pm, Ciaran O'Donnell
<CiaranODonn...@discussions.microsoft.com> wrote:
> I understand your point, but I like to know which options perform best in
> which situations, as I think it is better to code with performance in mind.
Eek. I think readability is far more important. Only a tiny proportion
of the implementation is likely to have a really significant impact on
performance, whereas *all* of the code benefits from being as readable
as possible.
> If you don't, you risk getting to the end of a development and
> realising you have to go back through it and find optimisations to
> overcome a performance issue.
I'd say you need to be aware of *architectural* performance at all
times, but not small pieces of *implementation* performance. Using a
profiler later on will usually find any bottlenecks pretty quickly -
so you can target those bottlenecks and use a more efficient solution
instead of the most readable one for that particular area.
> Some people call it premature optimizing and say it wastes time.
It's not the waste of time that I primarily object to, it's the
emphasis on performance over readability.
> If you already know certain things are faster or more memory
> efficient then it's good to use them.
So long as you know the situation well enough to be able to predict
that, of course. See later for my analysis of this particular
performance dilemma.
> It stops you having to come back to it later and it also helps
> people reading your code pick up good ways to do things.
No, it helps people reading your code pick up *fast* ways of doing
things, which:
a) may not be the most readable ways or have the clearest design
b) may slow them down when reading the code - chances are they want to
understand what's going on with the code, not learn performance tricks
c) may well not apply in situations which appear to be similar
> But each to their own, and I fully understand in this case, for a small array,
> it won't make a difference at all. But if I or someone reading my code got to
> a similar situation in the future with a more complex problem, having done
> the research at this point might forearm us with the knowledge of which one
> performs better.
As a similar "each to their own" I'll readily acknowledge that there
*are* situations where performance matters, and I'm more likely to go
out of my way to use an efficient solution over a readable one if I
know in advance that the code is going to be heavily used, and/or used
on a low-power device. (Library code is a separate issue, as you often
don't know what the callers will be doing.)

Also, there are some "gotchas" which are just worth knowing about to
start with: don't use exceptions for flow control, don't concatenate
strings in a loop where you can use StringBuilder to great effect,
etc. I'm all for being *aware* of performance characteristics,
particularly where the differences are between O(n) and O(n^2)
complexity, for instance. I just believe that readability is king.
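
A small illustration of the StringBuilder point (the JoinNumbers helper is invented for the example):

    using System.Text;

    class Concatenation
    {
        // Concatenating with += in a loop copies the whole string on every
        // iteration (roughly O(n^2) work overall); StringBuilder appends
        // into a growable buffer instead.
        static string JoinNumbers(int count)
        {
            StringBuilder builder = new StringBuilder();
            for (int i = 0; i < count; i++)
            {
                builder.Append(i).Append(',');
            }
            return builder.ToString();
        }
    }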

Now, let's look at the 4 proposed pieces of code. We can ignore the
last one (using a query expression) immediately, as the previous one
(with the Where call) results in exactly the same code. So, we're down
to 3. Here they are, reproduced and labeled (somewhat arbitrarily)
for future reference:

Option 1: "Manual"
var result = new List<int>();
foreach (int i in anArray)
if (i > 500)
result.Add(i);

(Eek about the lack of braces, by the way...)

Option 2: "FindAll"

var resultFind = Array.FindAll(anArray, elt => elt > 500);

Option 3: "LINQ"

var resultWhere = anArray.Where(elt => elt > 500);

The first thing to consider is whether or not a list is actually
required. If it's not, then both Manual and FindAll will be wasting
time and memory building the list. This will have low impact when
there aren't many items found, but if the array becomes very large and
almost all elements are over 500, a lot of time may be wasted copying
the internal list buffer. Using a LinkedList instead in Manual would
put the algorithm back to O(n) but with a higher constant factor -
creating a node for the list is more expensive in both time and memory
than just setting an element in a backing array. If there is a
reasonable idea beforehand how many elements will be over 500, the
list's initial size could be specified to help alleviate things - at
the cost of potentially wasting memory if the guess is significantly
higher than reality.
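
A sketch of that initial-capacity idea (the 90% estimate is an arbitrary illustration, not a recommendation):

    using System.Collections.Generic;

    class PresizedList
    {
        static List<int> FilterWithGuess(int[] anArray)
        {
            // Invented assumption: roughly 90% of elements are expected to
            // match, so size the backing array once instead of letting it
            // double repeatedly as items are added.
            List<int> result = new List<int>(anArray.Length * 9 / 10);
            foreach (int i in anArray)
            {
                if (i > 500)
                {
                    result.Add(i);
                }
            }
            return result;
        }
    }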

The next thing to consider is the cost of iterating over the array.
Using "foreach" on an array compiles to different code than using it
on another IEnumerable<T>. The compiler knows how to access array
elements in an efficient way, without needing any method calls.
FindAll will likewise be able to iterate very quickly, as it knows the
backing store internally. Where operates on an arbitrary
IEnumerable<T>, which forces it to generate an IEnumerator<T> and call
MoveNext() and the Current property, which can't be inlined as the JIT
won't know what implementations it will be used with.
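
Roughly, the two iteration shapes look like this at the source level (a sketch of what the foreach expansions amount to, not the exact generated code):

    using System.Collections.Generic;

    class IterationShapes
    {
        static int CountOverArray(int[] anArray)
        {
            int count = 0;
            // foreach over an array compiles down to roughly this: plain
            // index access, no enumerator object, no interface calls.
            for (int index = 0; index < anArray.Length; index++)
            {
                if (anArray[index] > 500) count++;
            }
            return count;
        }

        static int CountOverSequence(IEnumerable<int> source)
        {
            int count = 0;
            // foreach over an IEnumerable<T> goes through GetEnumerator(),
            // MoveNext() and Current - interface calls the JIT can't
            // easily inline.
            using (IEnumerator<int> e = source.GetEnumerator())
            {
                while (e.MoveNext())
                {
                    if (e.Current > 500) count++;
                }
            }
            return count;
        }
    }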

Next comes the comparison. In Manual this is "inline" and probably
about as efficient as you'll get. The other two both use delegates in
the same way. I'd expect this to be less efficient than the inline
test, as the delegate call can't really be inlined as far as I can
see. One thing to watch out for is that this version doesn't capture
any variables. If the limit weren't set to 500, but the value of a
local variable, that would involve a delegate being created in a
different class, with an instance variable accessed for the limit. I
don't know whether or not that would have any significant effect, but
it might.
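
A sketch of the captured-variable case (method and parameter names invented); here the compiler hoists 'limit' into a generated closure class, and the delegate reads a field rather than an embedded constant like 500:

    using System.Collections.Generic;
    using System.Linq;

    class CapturedLimit
    {
        static IEnumerable<int> OverLimit(int[] anArray, int limit)
        {
            // 'limit' is captured by the lambda, so it lives on a
            // compiler-generated class instance rather than being a
            // constant baked into the predicate.
            return anArray.Where(elt => elt > limit);
        }
    }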

Finally there's laziness. The Where call won't actually cost anything
(of any significance) immediately. The only cost will be when the
result is iterated over. If the iteration actually stops after the
first few elements for whatever reason, then both Manual and FindAll
have done a lot of work for no benefit, whereas Where won't have done
any more work than is necessary.
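
A small example of that early-exit benefit (data values invented):

    using System;
    using System.Linq;

    class EarlyExit
    {
        static void Main()
        {
            int[] anArray = { 900, 800, 700, 100, 600, 200 };

            // First() stops enumerating as soon as one match is produced,
            // so Where never looks past the first element of this data.
            int firstOverLimit = anArray.Where(elt => elt > 500).First();

            // FindAll, by contrast, scans the whole array and builds a new
            // int[] before we take just one element from it.
            int firstFromFindAll = Array.FindAll(anArray, elt => elt > 500)[0];

            Console.WriteLine("{0} {1}", firstOverLimit, firstFromFindAll);
        }
    }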

All of these considerations will contribute to the performance based
on the actual situation, including the intended use of the result.
Whichever option ends up being the fastest for the OP's case may well
not be the fastest in a different context. That's one of the dangers
of the approach of "look at some code, learn the fastest way of doing
things." So often the answer to a performance question is "it
depends."

Jon
Jul 10 '08 #7

Thanks for that response, it's pretty impressive.
I totally agree with what you're saying, and I also think readability is
important, and I also agree about the "it depends" answer to performance.
However, I think you have taken my comments as being my life view and
guiding principles, where it is more like something I like to keep in mind.
I wouldn't go out of my way to introduce complex and hard-to-follow code on
the off chance it saves a couple of milliseconds on the one time it's executed.
But I think it's important to keep in mind things like looking to see if you
can predetermine approximate sizes for array-based lists or a StringBuilder.
Often you can't, but sometimes you can.
In an app I wrote recently, I was writing out a file full of string
representations of certain objects. I didn't know the exact size, but I knew it
was normally under 1 KB per object and didn't vary that much, so I decided to
initialise the StringBuilder with a length of myObjectCollection.Count * 1024.
This would have caused me to use a little more memory than I needed, but I
would only have to allocate it once, which made it faster and didn't change the
readability much, as a comment explained the logic.
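
A sketch of that pre-sizing idea (the collection type and the 1 KB-per-object figure are stand-ins, not the original code):

    using System.Collections.Generic;
    using System.Text;

    class PresizedBuilder
    {
        static string Serialize(ICollection<object> myObjectCollection)
        {
            // Rough upper bound: about 1 KB of text per object, so allocate
            // the buffer once up front instead of letting it grow and copy
            // repeatedly as the output is appended.
            StringBuilder builder = new StringBuilder(myObjectCollection.Count * 1024);
            foreach (object item in myObjectCollection)
            {
                builder.AppendLine(item.ToString());
            }
            return builder.ToString();
        }
    }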
So really I do agree with you in terms of readability, and that you need to
analyse each situation individually to determine the best way of doing
something. But in a simple case like this, where readability isn't such a big
issue (they aren't that different), I think it would be worth testing this to
find out which implementation does better for speed of the search. I, however,
am interested in things like that.

I really appreciate you taking the time to respond, though.
--
Ciaran O'Donnell
http://wannabedeveloper.spaces.live.com
"Jon Skeet [C# MVP]" wrote:
<snip>
Jul 10 '08 #8

I wish you had added the hash list to the discussion; it would be better to
understand how a hash list functions, and we might have a chance to read the
explanations from great personalities like Jon Skeet and Ciaran O'Donnell.

Better late than never, pals. I recommend a hash list. What do you guys
say?
<ti*********@gmail.com> wrote in message
news:3e**********************************@k37g2000hsf.googlegroups.com...
<snip>
Jul 10 '08 #9

On Jul 10, 1:20 pm, "Chakravarthy" <dskch...@msn.com> wrote:
> I wish you had added the hash list to the discussion; it would be better to
> understand how a hash list functions, and we might have a chance to read the
> explanations from great personalities like Jon Skeet and Ciaran O'Donnell.
>
> Better late than never, pals. I recommend a hash list. What do you guys
> say?
I don't see what the value would be. Could you give some example code
to say what you mean, and why you think it would help?

Jon
Jul 10 '08 #10

Thanks Jon for your answer.
I didn't know much about laziness. For sure, it's better if I don't
need to iterate through the whole list.

This is now much clearer ;-)

On 10 Jul, 14:57, "Jon Skeet [C# MVP]" <sk...@pobox.com> wrote:
On Jul 10, 1:20 pm, "Chakravarthy" <dskch...@msn.comwrote:
I wish if you have added the hashlist into the discussion, would be better
to understand how HASHLIST functions. and might have a chance to read the
explanations from great personalities like John Skeet, Ciaran O' Donnell..
better late than never, pals .. I recommend Hash List.. what do you guys
say?

I don't see what the value would be. Could you give some example code
to say what you mean, and why you think it would help?

Jon
Jul 10 '08 #11

This discussion thread is closed

Replies have been disabled for this discussion.