473,387 Members | 1,517 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,387 software developers and data experts.

Where's the bug?

As a programming exercise, I was playing with the quicksort algorithm.
The commented line in the "partition" function speeds things up a
bit. However, when used on (my machine) arrays larger than about
5,000,000 elements, the sort appears to freeze. I don't understand
this behavior. With this line commented as shown, the sort works fine
up to available memory (circa 50,000,000 elements, (200MB) on my 256MB
machine). I'm guessing the error has to do with "val" taking a
minimum or maximum value, but I can't fugure it out. If this Q is
off-topic to c++ and more appropriate on an "algorithm-oriented"
newsgroup, I apologize in advance.

Thanks,

Joe

#include <iostream>
const int kSMALL_ENOUGH = 16; //when to call selection-sort

using namespace std;

void fillRND(unsigned int* a, int size);
void showit(unsigned int* a, int size);
void quicksort(unsigned int* a, int left, int right);
int partition(unsigned int* a, int left, int right);
void selectionsort(unsigned int* a, int left, int right);
int verify(unsigned int* a, int size);

int main(){
int clo;
cout << "Enter the number of elements to be sorted. ";
int sz;
cin >> sz;
unsigned int* a = new unsigned int[sz];
cout << "generating data...";
fillRND(a, sz);
cout << "done\nchecking data...";
verify(a, sz);
//showit(a, sz);

clo=clock();
cout << "sorting data...";
quicksort(a, 0, sz-1);
cout << "done in " << clock() - clo << " (ms)\nchecking data...";
verify(a, sz);
//showit(a, sz);

delete [] a;
system("pause");
return 0;
}

void quicksort(unsigned int* a, int left, int right){
if(left + kSMALL_ENOUGH < right) {
int split_pt = partition(a,left, right);
quicksort(a, left, split_pt);
quicksort(a, split_pt+1, right);
}else selectionsort(a, left, right);
}

int partition(unsigned int* a, int left, int right){
unsigned int val = a[left]///2 + a[right]/2
;//<---note end of line here!
int lm = left-1;
int rm = right+1;
for(;;){
while(a[--rm] > val);
while(a[++lm] < val);
if(lm < rm){
unsigned int temp = a[rm];
a[rm] = a[lm];
a[lm] = temp;
}else return rm;
}
}

void selectionsort(unsigned int* data, int left, int right){
for(int i = left; i < right; ++i) {
int min = i;
for(int j=i+1; j <= right; ++j)
if(data[j] < data[min]) min = j;
unsigned int temp = data[min];
data[min] = data[i];
data[i] = temp;
}
}

int verify(unsigned int* a, int size){
for(int i = 0; i<size-1; ++i){
if(a[i] > a[i+1]){
cout << "numbers not sorted\n";
return 1;
}
}
cout << "numbers are sorted\n";
return 0;
}

void fillRND(unsigned int* a, int size){
srand(time(0));
int bits = (CHAR_BIT * sizeof(int))/2;
for(int i = 0; i < size; ++i)
a[i] = rand() + (rand() << bits); //to fill more than 15 bits...
}

void showit(unsigned int* a, int size){
for(int i = 0; i < size; ++i)
cout << i << ' ' << a[i] << endl;
cout << endl;
}
Jul 22 '05 #1
7 1183
"J. Campbell" wrote:
int partition(unsigned int* a, int left, int right){
unsigned int val = a[left]///2 + a[right]/2
;//<---note end of line here!
int lm = left-1;
int rm = right+1;
for(;;){
while(a[--rm] > val);
while(a[++lm] < val);
if(lm < rm){
unsigned int temp = a[rm];
a[rm] = a[lm];
a[lm] = temp;
}else return rm;
}
}


consider your array to sort consists of just 2 numbers

5 5

what will be the value of val?

a[left] / 2 -> 5 / 2 -> 2
a[right] / 2 -> 5 / 2 -> 2

val therefore will get a value of 4

now look at the while loops
while( a[--rm] > val);
that loop will sure run out of bounds
while( a[++lm] < val);
again: out of bounds

Out of bounds may mean in this case: outside the
range left-right. The partition function then swaps
some array elements it shouldn't, which will influence
the whole algorithm.

The problem is, that with the calculation you do, you may end up
with a number for val which is not even in the boundaries of the
numbers you try to sort (Sorry: I am not a native english speaker
so the above may sound confusing, but I don't know how to express
it better).

One technique I learned for comming up with a value for that trashold
value is something like this:
take a[left], a[right], a[(left+right)/2] and discard the smallest
and highest value (sort them and take the second in sorting order)

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #2
"J. Campbell" <ma**********@yahoo.com> wrote...
As a programming exercise, I was playing with the quicksort algorithm.
The commented line in the "partition" function speeds things up a
bit. However, when used on (my machine) arrays larger than about
5,000,000 elements, the sort appears to freeze. I don't understand
this behavior. With this line commented as shown, the sort works fine
up to available memory (circa 50,000,000 elements, (200MB) on my 256MB
machine). I'm guessing the error has to do with "val" taking a
minimum or maximum value, but I can't fugure it out. If this Q is
off-topic to c++ and more appropriate on an "algorithm-oriented"
newsgroup, I apologize in advance.


It's not related to language, nor it is related to algorithms, I am
afraid. It's completely compiler- and system-specific. The secret
is that your program is likely causing stack overflow or some such.

Quicksort is recursive. The depth of its recursion depends on data
(quantity and contents) and difficult to predict. Since it uses
[stack] memory for recursion, you can expect it to behave differently
if you allocate more for stack, but that's beyond the scope of the
language. See your compiler options.

Victor
Jul 22 '05 #3

"Karl Heinz Buchegger" <kb******@gascad.at> wrote in message
news:3F***************@gascad.at...
"J. Campbell" wrote:
int partition(unsigned int* a, int left, int right){
unsigned int val = a[left]///2 + a[right]/2
;//<---note end of line here!
int lm = left-1;
int rm = right+1;
for(;;){
while(a[--rm] > val);
while(a[++lm] < val);
if(lm < rm){
unsigned int temp = a[rm];
a[rm] = a[lm];
a[lm] = temp;
}else return rm;
}
}


consider your array to sort consists of just 2 numbers


Karl, Thanks for the reply. I need to consider your suggestions more
closely. However, please note that in my implementation, the array never
has fewer than 16 elements (else the selection sort is used). I need to
examine if it has to do with overrunning the array, as you suggest, or with
overrunning the memory, as suggested by victor. (Or another problem
altogether). Thanks again.

5 5

what will be the value of val?

a[left] / 2 -> 5 / 2 -> 2
a[right] / 2 -> 5 / 2 -> 2

val therefore will get a value of 4

now look at the while loops
while( a[--rm] > val);
that loop will sure run out of bounds
while( a[++lm] < val);
again: out of bounds

Out of bounds may mean in this case: outside the
range left-right. The partition function then swaps
some array elements it shouldn't, which will influence
the whole algorithm.

The problem is, that with the calculation you do, you may end up
with a number for val which is not even in the boundaries of the
numbers you try to sort (Sorry: I am not a native english speaker
so the above may sound confusing, but I don't know how to express
it better).

One technique I learned for comming up with a value for that trashold
value is something like this:
take a[left], a[right], a[(left+right)/2] and discard the smallest
and highest value (sort them and take the second in sorting order)

--
Karl Heinz Buchegger
kb******@gascad.at

Jul 22 '05 #4

"Karl Heinz Buchegger" <kb******@gascad.at> wrote in message
news:3F***************@gascad.at...

consider your array to sort consists of just 2 numbers

5 5

what will be the value of val?

a[left] / 2 -> 5 / 2 -> 2
a[right] / 2 -> 5 / 2 -> 2

val therefore will get a value of 4

now look at the while loops
while( a[--rm] > val);
that loop will sure run out of bounds
while( a[++lm] < val);
again: out of bounds

Out of bounds may mean in this case: outside the
range left-right. The partition function then swaps
some array elements it shouldn't, which will influence
the whole algorithm.

The problem is, that with the calculation you do, you may end up
with a number for val which is not even in the boundaries of the
numbers you try to sort (Sorry: I am not a native english speaker
so the above may sound confusing, but I don't know how to express
it better).

One technique I learned for comming up with a value for that trashold
value is something like this:
take a[left], a[right], a[(left+right)/2] and discard the smallest
and highest value (sort them and take the second in sorting order)

--
Karl Heinz Buchegger
kb******@gascad.at


Karl...I understand your point, and I believe you are correct...thanks for
the help. Joe
Jul 22 '05 #5

"Karl Heinz Buchegger" <kb******@gascad.at> wrote in message
news:3F***************@gascad.at...

consider your array to sort consists of just 2 numbers

5 5

what will be the value of val?

a[left] / 2 -> 5 / 2 -> 2
a[right] / 2 -> 5 / 2 -> 2

val therefore will get a value of 4

now look at the while loops
while( a[--rm] > val);
that loop will sure run out of bounds
while( a[++lm] < val);
again: out of bounds

Out of bounds may mean in this case: outside the
range left-right. The partition function then swaps
some array elements it shouldn't, which will influence
the whole algorithm.

The problem is, that with the calculation you do, you may end up
with a number for val which is not even in the boundaries of the
numbers you try to sort (Sorry: I am not a native english speaker
so the above may sound confusing, but I don't know how to express
it better).

One technique I learned for comming up with a value for that trashold
value is something like this:
take a[left], a[right], a[(left+right)/2] and discard the smallest
and highest value (sort them and take the second in sorting order)

--
Karl Heinz Buchegger
kb******@gascad.at


Thanks Karl...this modification fixes the bug....and maintains the speed
boost ;-) good job to us both!!!!

#include <iostream>
const int kSMALL_ENOUGH = 16; //when to call selection-sort

using namespace std;

void fillRND(unsigned int* a, int size);
void showit(unsigned int* a, int size);
void quicksort(unsigned int* a, int left, int right);
int partition(unsigned int* a, int left, int right);
void selectionsort(unsigned int* a, int left, int right);
int verify(unsigned int* a, int size);
//unsigned int triplet(unsigned int* a, int left, int right);

int main(){
int clo;
cout << "Enter the number of elements to be sorted. ";
int sz;
cin >> sz;
unsigned int* a = new unsigned int[sz];
cout << "generating data...";
fillRND(a, sz);
cout << "done\nchecking data...";
verify(a, sz);
//showit(a, sz);

clo=clock();
cout << "sorting data...";
quicksort(a, 0, sz-1);
cout << "done in " << clock() - clo << " (ms)\nchecking data...";
verify(a, sz);
//showit(a, sz);

delete [] a;
system("pause");
return 0;
}

void quicksort(unsigned int* a, int left, int right){
if(left + kSMALL_ENOUGH < right) {
int split_pt = partition(a,left, right);
quicksort(a, left, split_pt);
quicksort(a, split_pt+1, right);
}else selectionsort(a, left, right);
}

int partition(unsigned int* a, int left, int right){
unsigned int val = a[left]/2 + a[right]/2;// = triplet(unsigned int* a,
int left, int right);
unsigned int x = a[left];
unsigned int z = a[right];

if(x < val)
if(val < z); //val = y; //x < y < z
else if(x < z) val = z; //x < z < y
else val = x; //z < x < y
else if(z < val); //val = y; //z < y < x
else if(z < x) val = z; //y < z < x
else val = x; //y < x < z

int lm = left-1;
int rm = right+1;
for(;;){
while(a[--rm] > val);
while(a[++lm] < val);
if(lm < rm){
unsigned int temp = a[rm];
a[rm] = a[lm];
a[lm] = temp;
}else return rm;
}
}

void selectionsort(unsigned int* data, int left, int right){
for(int i = left; i < right; ++i) {
int min = i;
for(int j=i+1; j <= right; ++j)
if(data[j] < data[min]) min = j;
unsigned int temp = data[min];
data[min] = data[i];
data[i] = temp;
}
}

int verify(unsigned int* a, int size){
for(int i = 0; i<size-1; ++i){
if(a[i] > a[i+1]){
cout << "numbers not sorted\n";
return 1;
}
}
cout << "numbers are sorted\n";
return 0;
}

void fillRND(unsigned int* a, int size){
srand(time(0));
int bits = (CHAR_BIT * sizeof(int))/2;
for(int i = 0; i < size; ++i)
a[i] = rand() + (rand() << bits); //to fill more than 15 bits...
}

void showit(unsigned int* a, int size){
for(int i = 0; i < size; ++i)
cout << i << ' ' << a[i] << endl;
cout << endl;
}
Jul 22 '05 #6
Joe C wrote:
consider your array to sort consists of just 2 numbers
Karl, Thanks for the reply. I need to consider your suggestions more
closely. However, please note that in my implementation, the array never
has fewer than 16 elements (else the selection sort is used).


Well. With such a large array of ints, it's quite possible that you
have a sequence of 16 identical numbers which qualify for that problem.
I need to
examine if it has to do with overrunning the array, as you suggest, or with
overrunning the memory, as suggested by victor. (Or another problem
altogether). Thanks again.


You have an 'endless loop' in partition(). That's OK, since there
is a return in it, thus it isn't realy endless. But I would start
with setting a high bound in this for loop and see if this bound is
ever reached:

int lm = left - 1;
int rm = right + 1;

for( int tmp = 0; tmp < 500000; ++tmp ) {
while( a[--rm] > val );
while( a[++lm] < val );

...
}

/* should never get here. */
assert( 0 );

I would also insert some assertion into the while loop

while( a[--rm] > val )
assert( rm > left );

while( a[++rm] < val )
assert( lm < right );
If things are the way I think they are (which is hard to diagnose
with a dataset of 50,000,000 items) one of the assertions will fire.
You then have a foot in the door and your debugger will tell you more.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #7
Joe C wrote:


Thanks Karl...this modification fixes the bug....and maintains the speed
boost ;-) good job to us both!!!!
Glad to hear that.
It was some shot in the dark. When seeing the partition function
and recognizing what val is used for I thought by myself: That's
strange, I have never seen it done that way (and I haven't seen
a quicksort since years).
int partition(unsigned int* a, int left, int right){
unsigned int val = a[left]/2 + a[right]/2;// = triplet(unsigned int* a,
int left, int right);
unsigned int x = a[left];
unsigned int z = a[right];

if(x < val)
if(val < z); //val = y; //x < y < z
else if(x < z) val = z; //x < z < y
else val = x; //z < x < y
else if(z < val); //val = y; //z < y < x
else if(z < x) val = z; //y < z < x
else val = x; //y < x < z


Yep. That looks like the thing I can remember from
my algorithms class (nearly 20 years ago). The idea
is: Ideally you want the median of all the numbers. But
the median is expensive to compute. On the other hand you
definitly want to avoid to take the lowest or the highest
value, since no partitioning would take place then. So what
to do: take 3 samples of the numbers (it doesn't matter which
ones) and choose the middle. That's good enough in most cases.

--
Karl Heinz Buchegger
kb******@gascad.at
Jul 22 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

47
by: Andrey Tatarinov | last post by:
Hi. It would be great to be able to reverse usage/definition parts in haskell-way with "where" keyword. Since Python 3 would miss lambda, that would be extremly useful for creating readable...
3
by: A.V.C. | last post by:
Hello, I found members of this group very helpful for my last queries. Have one problem with CASE. I can use the column name alias in Order By Clause but unable to use it in WHERE CLAUSE. PLS...
3
by: Xiangliang Meng | last post by:
Hi, all. In 1998, I graduated from Computer Science Dept. in a university in China. Since then, I've been using C Language for almost 6 years. Although I'm using C++ in my current job, I'm also...
7
by: Britney | last post by:
Original code: this.oleDbSelectCommand1.CommandText = "SELECT TOP 100 user_id, password, nick_name, sex, age, has_picture, city, state, " + "country FROM dbo.users WHERE (has_picture = ?) AND (sex...
5
by: comp.lang.php | last post by:
if ($willLimitByDB) $sql = preg_replace('/#(+)#/i', '$$1', $sql); This does not give me the results I want, instead of the value of $where in $sql, I literally get '$where' instead. How do I...
5
by: John | last post by:
I just cannot manage to perform a SELECT query with NULL parameter... My CATEGORY table does have one row where TCATEGORYPARENTID is null (real DB null value). TCATEGORYID and TCATEGORYPARENTID...
0
NeoPa
by: NeoPa | last post by:
Background Whenever code is used there must be a way to differentiate the actual code (which should be interpreted directly) with literal strings which should be interpreted as data. Numbers don't...
1
by: not_a_commie | last post by:
I was hoping for increased functionality with the where clause in C# 3.0. Using the new keyword 'var' would really allow us to take nice advantage of these. Specifically: 1. I want to limit it...
9
by: Emin | last post by:
Dear Experts, I have a fairly simple query in which adding a where clause slows things down by at least a factor of 100. The following is the slow version of the query ...
8
by: chrisdavis | last post by:
I'm trying to filter by query or put those values in a distinct query in a where clause in some sort of list that it goes through but NOT at the same time. Example: ROW1 ROW2 ROW3 ROW4 ,...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: aa123db | last post by:
Variable and constants Use var or let for variables and const fror constants. Var foo ='bar'; Let foo ='bar';const baz ='bar'; Functions function $name$ ($parameters$) { } ...
0
by: ryjfgjl | last post by:
If we have dozens or hundreds of excel to import into the database, if we use the excel import function provided by database editors such as navicat, it will be extremely tedious and time-consuming...
0
by: ryjfgjl | last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.