
How to demonstrate big-O cost of algorithms?

Hi -

I'm studying algorithms and I want to write a python program that
calculates the actual runtimes.

I want to increment up some global variable called n in my program so
that I can see the relationship between the size of the problem and the
number of steps.

Here's my clumsy attempt to implement this:

def qsort(ll):
    try:
        global n
        n += len(ll)
        print "adding %d to %d." % (len(ll), n)
        print "incrementing up n to %d." % n
    except:
        pass
    if len(ll) == 0:
        return []
    p = ll[0]
    left = [x for x in ll if x < p]
    mid = [x for x in ll if x == p]
    right = [x for x in ll if x > p]
    return qsort(left) + mid + qsort(right)

I'm not getting the results I hoped for. I expected that after the
program is finished, I would get results near

len(ll) * math.log(len(ll), 2)

since the big O for quicksort is n log n for all but the worst case.

I'd like to write similar functions for mergesort, bubblesort, and all
the other basic algorithms covered in data structures classes.

All help is welcome.

--
Give and take free stuff: http://freecycle.org
Jul 18 '05 #1
Rusty Shackleford wrote:
Hi -

I'm studying algorithms and I want to write a python program that
calculates the actual runtimes.

I want to increment up some global variable called n in my program so
that I can see the relationship between the size of the problem and the
number of steps.

Here's my clumsy attempt to implement this:

[snipped]


What is that ugly try-except block for? All it seems to do is hide
potential errors.

Anyway, if you are only testing comparison-based sorting algorithms, then
why not create a new class (maybe by subclassing a builtin) and
implement a __cmp__ method that increments a global whenever called?
That way you will get an exact count of the number of comparisons.
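
For example, a minimal sketch of that idea (the CountedInt name and the
module-level counter are illustrative, not from the original code):

comparisons = 0

class CountedInt(int):
    # with no rich-comparison methods defined, Python 2 routes
    # <, == and > on these objects through __cmp__
    def __cmp__(self, other):
        global comparisons
        comparisons += 1
        return int.__cmp__(self, other)

# usage: wrap the inputs, sort, then read off the counter
# data = [CountedInt(x) for x in raw_data]
# qsort(data)
# print "comparisons:", comparisons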

Cheers,
Brian

Jul 18 '05 #2
On Wed, 02 Jun 2004 15:40:12 GMT, Rusty Shackleford
<rs@overlook.homelinux.net> declaimed the following in comp.lang.python:
Hi -

I'm studying algorithms and I want to write a python program that
calculates the actual runtimes.
Take a look at section 10.10 of the Python Library Manual -- the
timeit module (uh, requires Python 2.3+). That should get you the
run-times directly.
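
For example, a sketch (assuming qsort and a list named data are defined
in the script you run):

import timeit

# time 10 runs of qsort, handing it a fresh copy of data each time;
# the setup string imports the names from the running script
t = timeit.Timer("qsort(data[:])", "from __main__ import qsort, data")
print t.timeit(number=10)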

[rest of original post snipped]
--
wl*****@ix.netcom.com | Wulfraed Dennis Lee Bieber KD6MOG
Home Page: http://www.dm.net/~wulfraed/

Jul 18 '05 #3
Rusty Shackleford <rs@overlook.homelinux.net> wrote in
news:sl***************@frank.overlook.homelinux.net:
Here's my clumsy attempt to implement this:

def qsort(ll):
    try:
        global n
        n += len(ll)
        print "adding %d to %d." % (len(ll), n)
        print "incrementing up n to %d." % n
    except:
        pass
    if len(ll) == 0:
        return []
    p = ll[0]
    left = [x for x in ll if x < p]
    mid = [x for x in ll if x == p]
    right = [x for x in ll if x > p]
    return qsort(left) + mid + qsort(right)

I'm not getting the results I hoped for. I expected that after the
program is finished, I would get results near

len(ll) * math.log(len(ll), 2)

since the big O for quicksort is n log n for all but the worst case.


You don't say what sort of objects you are passing in the list ll. Nor do
you say how large a value of 'n' you tried. Nor have you said how you
ensured that you weren't hitting the worst case performance.

Python may not be the best language to demonstrate this sort of effect. Two
things affect sorting speed in most languages: the number of comparisons,
and the number of times you copy data items; other factors come into play
only when those factors are slow compared to the comparisons and moves. It
may well be that here your comparisons are comparatively fast (especially if
you just used integers as the input data), and the time is being swamped by
the function call overhead and by creating all those intermediate lists. In
that case you may just need to increase the length of the list. A lot.

Most quicksort implementations sort the data in-place. You are creating new
lists for left, mid, right, and then again creating new lists by
concatenating the results of your recursive calls. Both of these involve
memory allocation, and the lists may need to be moved around in memory as
they grow. All of this complicates the order calculation.

Also, remember that if your lists are too long you are likely to start
swapping virtual memory, so all your times go out the window at that point.
Not to mention garbage collection.

If you want to get a result that matches your expectations, then I think
you would be best to operate on the data in place, and probably also get
rid of the recursion.
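
A sketch of what that might look like (my own illustration; the pivot
choice is still naive, so already-sorted input remains the worst case):

def qsort_inplace(a):
    # iterative quicksort: an explicit stack of (lo, hi) ranges
    # replaces the recursion, and partitioning swaps elements in
    # place instead of building new lists
    stack = [(0, len(a) - 1)]
    while stack:
        lo, hi = stack.pop()
        if lo >= hi:
            continue
        pivot = a[hi]               # naive pivot: last element
        i = lo
        for j in xrange(lo, hi):
            if a[j] < pivot:
                a[i], a[j] = a[j], a[i]
                i += 1
        a[i], a[hi] = a[hi], a[i]   # move pivot into final position
        stack.append((lo, i - 1))
        stack.append((i + 1, hi))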
Jul 18 '05 #4
The problem is that BigO cost describes how the computation time scales
with the size of the set, but nothing about how long that really is. For
example, let us implement a trivial algorithm that increments each
number in a list:

def f(list):
    for key in xrange(len(list)):
        list[key] += 1

this is O(n), because as the size of the set increases, the computation
time increases linearly. Now say we change the implementation to this:

def f(list):
    for key in xrange(len(list)):
        list[key] += 1
        i = 2**32
        while i:
            print 'wasting time'
            i -= 1

This functionally equivalent algorithm takes much longer to run, but its
complexity is still O(n).

Because O(n) only tells us how the computation time scales, to get the
real time, multiply n by the time it takes to run the algorithm on a set
of size one.

In your case of qsort, which is O(n*log(n)), the log is not specifically a
base-2 log; it's just a log, and the base doesn't really matter. Changing
the base of the log only multiplies the result by some constant: by simple
algebra, log2(x) = log10(x) / log10(2). What the overall constant is
depends on the implementation of the algorithm, and to translate BigO
notation into real times you will have to find it by performing a
regression on your data.

That said, BigO notation only gives the complexity in theory, and not
always in practice. Other factors will cause the real run time to
deviate from the BigO model, so you will likely need a large set of data
to clearly see the relation.
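
For example, something along these lines (a sketch; the sizes are
arbitrary and any sort function can be substituted):

import math, random, time

def runtime(sort, size):
    data = [random.random() for i in xrange(size)]
    start = time.clock()
    sort(data)
    return time.clock() - start

for size in (1000, 10000, 100000):
    t = runtime(lambda d: d.sort(), size)  # swap in the sort under test
    # the last column should level off near a constant for O(n*log(n))
    print size, t, t / (size * math.log(size))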

On Wed, Jun 02, 2004 at 03:40:12PM +0000, Rusty Shackleford wrote:
[original post snipped]


Jul 18 '05 #5
Thanks for that explanation -- it really helped. I forgot that O(n) can
translate into dn + c where d and c are constants.

I think I'm going to keep trying to figure out ways to demonstrate big O
stuff, because I just don't understand it until I see it in a
non-abstract sense.

Thanks again.

--
Give and take free stuff: http://freecycle.org
Jul 18 '05 #6
In article <sl***************@frank.overlook.homelinux.net>,
Rusty Shackleford <rs@overlook.homelinux.net> wrote:
Here's my clumsy attempt to implement this:

def qsort(ll):
    try:
        global n
        n += len(ll)
        print "adding %d to %d." % (len(ll), n)
        print "incrementing up n to %d." % n
    except:
        pass
    if len(ll) == 0:
        return []
    p = ll[0]
    left = [x for x in ll if x < p]
    mid = [x for x in ll if x == p]
    right = [x for x in ll if x > p]
    return qsort(left) + mid + qsort(right)

I'm not getting the results I hoped for. I expected that after the
program is finished, I would get results near

len(ll) * math.log(len(ll), 2)

since the big O for quicksort is n log n for all but the worst case.


Are you sure you're not in the worst case?
Is ll random, or is it already sorted?
If already sorted, you should be seeing quadratic behavior.
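
A quick way to check (a sketch reusing qsort and the global counter n
from the original post; you may want to silence the prints inside qsort
first):

import random

for size in (100, 200, 400, 800):
    shuffled = [random.randint(0, 10000) for i in xrange(size)]
    presorted = range(size)

    n = 0
    qsort(shuffled)
    print "random input: size", size, "steps", n

    n = 0
    qsort(presorted)
    print "sorted input: size", size, "steps", n

# per doubling of size, the sorted-input count should grow roughly
# four-fold (quadratic), the shuffled one only a bit more than two-fold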

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science
Jul 18 '05 #7
Rusty Shackleford wrote:
Hi -

I'm studying algorithms and I want to write a python program that
calculates the actual runtimes.

I want to increment up some global variable called n in my program so
that I can see the relationship between the size of the problem and the
number of steps.

Here's my clumsy attempt to implement this:
[snipped]


One Thought:

Bear in mind that O(n) is the _asymptotic_ behavior for an algorithm. It
does not consider the "constant time" factors in an algorithm like the initial
instantiation of variables, loops, objects and so forth. It only
considers the complexity of the algorithm itself.

For example, suppose you have two algorithms with an asymptotic
complexity of O(n), but the first takes 5 seconds to set up its outer
loops and control variables. Say the second takes 5 milliseconds to do
the same thing. From an algorithmic complexity POV these are both O(n).

The point is that you have to use a sufficiently large 'n' to
make this constant overhead irrelevant to the test. If you are looking
for n*log(n) growth, you're probably not going to see it for
n=1, n=2, n=3 ... You're going to need n=100, n=1000, n=10000.
IOW, choose a large enough iteration count to dwarf the constant time
overhead of the program...
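
A contrived sketch of that point (the sleeps stand in for fixed setup
cost; both functions are O(n)):

import time

def with_slow_setup(data):
    time.sleep(5.0)               # constant 5 s of setup
    return [x + 1 for x in data]  # the O(n) part

def with_fast_setup(data):
    time.sleep(0.005)             # constant 5 ms of setup
    return [x + 1 for x in data]  # identical O(n) part

# for small n the 5-second setup dominates and the two look wildly
# different; as n grows, the linear term dwarfs both constants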

----------------------------------------------------------------------------
Tim Daneliuk tu****@tundraware.com
PGP Key: http://www.tundraware.com/PGP/
Jul 18 '05 #8
Rusty Shackleford <rs@overlook.homelinux.net>
(news:sl***************@frank.overlook.homelinux.net) wrote:
Thanks for that explanation -- it really helped. I forgot that O(n)
can translate into dn + c where d and c are constants.
???
You'd NEVER write, e.g., that something is O(3n+7); you'd simply say it is
O(n). As already pointed out, this means that complexity (and, with that,
execution time) is PROPORTIONAL to n. It doesn't give you the number of
seconds or anything like that.

Consider the following:

def foo(n):
    for i in range(n):
        for j in range(i):
            print i, j

The "print" statement is executed n*(n-1)/2 times, and the complexity is
O(n^2): if you increase n seven-fold, the running time grows roughly
49-fold.

I think I'm going to keep trying to figure out ways to demonstrate
big O stuff, because I just don't understand it until I see it in a
non-abstract sense.

Thanks again.

Jul 18 '05 #9
Mitja wrote:
Rusty Shackleford <rs@overlook.homelinux.net>
(news:sl***************@frank.overlook.homelinux.net) wrote:
Thanks for that explanation -- it really helped. I forgot that O(n)
can translate into dn + c where d and c are constants.


???
You'd NEVER write, e.g., that something is O(3n+7); you'd simply say it is
O(n). As already pointed out, this means that complexity (and, with that,
execution time) is PROPORTIONAL to n. It doesn't give you the number of
seconds or anything like that.


If, on the other hand, you knew that one program took 3n+7 steps and
another took (n^2) / 10^10, you WOULD say that one's O(n) and the other's
O(n^2). This implies that the first one is faster for large data sets.

And if you don't care how long it takes for your one-off code to sort a
billion elements, insertion sort might be the fastest algorithm you have,
especially if reordering elements is expensive, at which point you'd use a
Schwartzian transform.

I think the best way to see this stuff is to look at toy examples: an
exponential algorithm with small constants versus a linear-time algorithm
with a miserable initial cost.

Josh Gilbert.
--
http://www.uvm.edu/~jgilbert/public_key.gpg
Jul 18 '05 #10
Josh Gilbert wrote:
If, on the other hand, you knew that one program took 3n+7 steps and
another took (n^2) / 10^10, you WOULD say that one's O(n) and the other's
O(n^2). This implies that the first one is faster for large data sets.


For very, very large data sets. Remember that big-O notation is
designed to describe the behavior as n tends toward infinity (which is
why we drop the constant coefficients and the lesser terms). It's
important to remember that big-O notation is not intended for a direct
performance comparison of two algorithms in the real world; it's
intended to examine scalability issues in the big picture. Those
constant coefficients and lesser terms can make a big difference in the
real world, even for really quite large n.

--
Erik Max Francis && ma*@alcyone.com && http://www.alcyone.com/max/
San Jose, CA, USA
Jul 18 '05 #11
Rusty Shackleford wrote:
I'm not getting the results I hoped for. I expected that after the
program is finished, I would get results near

len(ll) * math.log(len(ll), 2)

since the big O for quicksort is n log n for all but the worst case.


Just what results are you getting?

Jul 18 '05 #12
Rusty Shackleford wrote:
All help is welcome.


For quicksort, you should see ((n log(n))/runtime(n)) approach some
constant as you increase n.
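
For instance (a sketch; timings is assumed to be a dict mapping list
size to a measured runtime in seconds):

import math

sizes = timings.keys()
sizes.sort()
for size in sizes:
    # this ratio should level off at a constant for an O(n*log(n)) sort
    print size, (size * math.log(size)) / timings[size]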

Jul 18 '05 #13
In article <WW****************@news02.roc.ny>,
Matt <ma**@themattfella.zzzz.com> wrote:
Rusty Shackleford wrote:
All help is welcome.


For quicksort, you should see ((n log(n))/runtime(n)) approach some
constant as you increase n.


Well, except that he's not using random pivots. For the quicksort code
he listed, if the input is already sorted, the behavior should be
Theta(n^2).

--
David Eppstein http://www.ics.uci.edu/~eppstein/
Univ. of California, Irvine, School of Information & Computer Science
Jul 18 '05 #14
