473,405 Members | 2,421 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 473,405 software developers and data experts.

Stack overflow . . . what should I do about it?

Hello all,

To begin, yes, this -is- a homework assignment. However, it is for my
Algorithms class, and my instructor has given us explicit permission to use
"expert" groups like newsgroups, so if that's your only reason not to help,
please do. Otherwise, I guess it's OK. But, just remember, I'm not asking
you to do my work for me, just to point out my error.

My problem is not with the algorithm itself (standard divide and conquer on
a set of points to find the closest pair), but with the C++, that is, my
writing of it. I seem to be getting a stack overflow exception when the
input is too large, and I don't know what to do about it. This exception
has occurred when I compile on g++ in Cygwin and when I compile using VC++
(whatever the newest version is) on Windows. Cygwin does not do anything
when it fails, it simply stops in the middle of printing the diagnostics
(that is, printing a running tally of the closest points). I loaded up the
Beta version of the Visual C++ ide since I don't know gdb very well/at all.
When I got to a certain point in the code, VC++ told me that a stack
overflow exception had been thrown.

I know my algorithm is correct, because it works on random cases between 10
and 100, so I think that if I can defeat this stack overflow, I will have
defeated the entire problem. I am including the code that I think is
relevant: there is other code in the project, but it just performs
administrative tasks. This is the entire algorithm. More talking at the
end.

----- CODE BEGINS -----
// CLOSEST-PAIR.CC
// Implementation of the closest-pair algorithm

#include <iostream>
#include <cfloat>
#include "closest-pair.hh"

const double INFINITY = DBL_MAX; //bigger than any possible interpoint
distance
void printPointPair(const Point&, const Point&);
void printPointPair(const PointPair&);
PointPair divide(Point**, Point**, int);
/*
Closest Pair Functions

Given a collection of nPoints points, find and ***print***
* the closest pair of points
* the distance between them
in the form "(x1, y1) (x2, y2) distance"

INPUTS:
- points sorted in nondecreasing order by X coordinate
- points sorted in nondecreasing order by Y coordinate
- # of input points
*/
PointPair bruteForce(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
Point a, b;
double min = INFINITY, dist;
for (int i = 0; i < nPoints - 1; ++i)
for (int j = i + 1; j < nPoints; ++j) {
dist = pointsByX[i]->distance(pointsByX[j]);
if (min > dist) {
min = dist;
a = *pointsByX[i];
b = *pointsByX[j];
}
}
printPointPair(a, b);
return PointPair(a, b);
}

PointPair divCon(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
PointPair p = divide(pointsByX, pointsByY, nPoints);
printPointPair(p);
return p;
}

PointPair divide(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
using namespace std;
if (nPoints > 3) {
//divide
// set line
int bisect = pointsByX[nPoints/2]->x();
// sort Y lists to create new ones
Point* yL[nPoints];
Point* yR[nPoints];
int lenL = 0, lenR = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY[i]->x() <= bisect) {
yL[lenL] = pointsByY[i];
++lenL;
}
if (pointsByY[i]->x() >= bisect) {
yR[lenR] = pointsByY[i];
++lenR;
}
}

//conquer
// pass sorted arrays for repeat
PointPair pair1 = divide(pointsByX, yL, lenL);
PointPair pair2 = divide(pointsByX + nPoints - lenR, yR, lenR);
PointPair correct;
//combine
// find mindist
double p1Dist = pair1.distance(),
p2Dist = pair2.distance(),
min = 0;
if (p1Dist < p2Dist) {
correct = pair1;
min = p1Dist;
} else {
correct = pair2;
min = p2Dist;
}

// remove all non-2(mindist) points from y array
// I reuse yL here, since it is the longer than the array I need anyway
int stripLen = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY[i]->x() >= bisect - min
|| pointsByY[i]->x() <= bisect + min) {
yL[stripLen] = pointsByY[i];
++stripLen;
}
}
// then check next seven repeated until the end of the list
for (int i = 0; i < stripLen; ++i)
for (int j = i+1; j-i < 8 && j < stripLen; ++j)
if (yL[i]->distance(yL[j]) < min) {
min = yL[i]->distance(yL[j]);
correct.p1() = yL[i];
correct.p2() = yL[j];
}
return correct;
} else {
return bruteForce(pointsByX, pointsByY, nPoints);
}
}

void printPointPair(const PointPair& pp)
{
using namespace std;
double dist = pp.distance();

cout << pp << " " << dist << endl;
}

void printPointPair(const Point& a, const Point& b)
{
using namespace std;
double dist = a.distance(&b);
cout << a << " " << b << " " << dist << endl;
}

----- CODE ENDS -----

I thought that I might be concerned about the temporary arrays that I
create. However, I've been informed that since they are temporary (here I
am referring to yL and yR) they are deleted every time I exit their scope.
If this is the case, it would seem (to me) logical that the stack overflow
would occur on the first branch, but this is not the case. The first
branch, even on very large inputs (up to n = 10000) completes successfully
(according to the debugger and my diagnostics). Also, for reference, when I
compile it on G++ in Cygwin, n = 187 completes successfully, but n = 188
does not. This is consistently the case, and I can show it to repeat.

If needed, I can post the entire project, but I think with as many smart
guys as there are here, someone will find my problem just in this post.

Yours,

James
Jul 22 '05 #1
7 7044

"Aguilar, James" <jf**@cec.NOBOTSwustl.edu> wrote in message
news:ch**********@newsreader.wustl.edu...
Hello all,

To begin, yes, this -is- a homework assignment. However, it is for my
Algorithms class, and my instructor has given us explicit permission to
use "expert" groups like newsgroups, so if that's your only reason not to
help, please do. Otherwise, I guess it's OK. But, just remember, I'm not
asking you to do my work for me, just to point out my error.

My problem is not with the algorithm itself (standard divide and conquer
on a set of points to find the closest pair), but with the C++, that is,
my writing of it. I seem to be getting a stack overflow exception when
the input is too large, and I don't know what to do about it. This
exception has occurred when I compile on g++ in Cygwin and when I compile
using VC++ (whatever the newest version is) on Windows. Cygwin does not
do anything when it fails, it simply stops in the middle of printing the
diagnostics (that is, printing a running tally of the closest points). I
loaded up the Beta version of the Visual C++ ide since I don't know gdb
very well/at all. When I got to a certain point in the code, VC++ told me
that a stack overflow exception had been thrown.

I know my algorithm is correct, because it works on random cases between
10 and 100, so I think that if I can defeat this stack overflow, I will
have defeated the entire problem. I am including the code that I think is
relevant: there is other code in the project, but it just performs
administrative tasks. This is the entire algorithm. More talking at the
end.

----- CODE BEGINS -----
[snip]

PointPair divide(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
using namespace std;
if (nPoints > 3) {
//divide
// set line
int bisect = pointsByX[nPoints/2]->x();
// sort Y lists to create new ones
Point* yL[nPoints];
Point* yR[nPoints];
int lenL = 0, lenR = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY[i]->x() <= bisect) {
yL[lenL] = pointsByY[i];
++lenL;
}
if (pointsByY[i]->x() >= bisect) {
yR[lenR] = pointsByY[i];
++lenR;
}
}

//conquer
// pass sorted arrays for repeat
PointPair pair1 = divide(pointsByX, yL, lenL);
PointPair pair2 = divide(pointsByX + nPoints - lenR, yR, lenR);
PointPair correct;
//combine
// find mindist
double p1Dist = pair1.distance(),
p2Dist = pair2.distance(),
min = 0;
if (p1Dist < p2Dist) {
correct = pair1;
min = p1Dist;
} else {
correct = pair2;
min = p2Dist;
}

// remove all non-2(mindist) points from y array
// I reuse yL here, since it is the longer than the array I need anyway
int stripLen = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY[i]->x() >= bisect - min
|| pointsByY[i]->x() <= bisect + min) {
yL[stripLen] = pointsByY[i];
++stripLen;
}
}
// then check next seven repeated until the end of the list
for (int i = 0; i < stripLen; ++i)
for (int j = i+1; j-i < 8 && j < stripLen; ++j)
if (yL[i]->distance(yL[j]) < min) {
min = yL[i]->distance(yL[j]);
correct.p1() = yL[i];
correct.p2() = yL[j];
}
return correct;
} else {
return bruteForce(pointsByX, pointsByY, nPoints);
}
}

[snip]

----- CODE ENDS -----

I thought that I might be concerned about the temporary arrays that I
create. However, I've been informed that since they are temporary (here I
am referring to yL and yR) they are deleted every time I exit their scope.
Not necessarily. Many compilers will only reclaim the space of those
temporary arrays when the whole function is exitted. In other words stack
operations are only performed on entry and exit of a function.

Also that code is illegal. An array size must be a compile time constant;
which nPoints clearly is not. g++ only allows this code as a language
extension, try compiling with the --ansi option (I think) and you should get
a error.

So, one solution is to dynamically allocate those arrays, that will make
your code more standard and will reduce the stack usage.
If this is the case, it would seem (to me) logical that the stack overflow
would occur on the first branch, but this is not the case. The first
branch, even on very large inputs (up to n = 10000) completes successfully
(according to the debugger and my diagnostics). Also, for reference, when
I compile it on G++ in Cygwin, n = 187 completes successfully, but n = 188
does not. This is consistently the case, and I can show it to repeat.

If needed, I can post the entire project, but I think with as many smart
guys as there are here, someone will find my problem just in this post.

Yours,

James


john
Jul 22 '05 #2
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2p************@uni-berlin.de...

[snip info about dynamic allocation]


Well, I tried that, and you're right, it was a GNU extension. I repaired
the error, but it seems that my problem still remains. In fact, there was
no difference in performance after the edit (i.e. still fails at 188, still
succeeds at 187) so I'm still up in the air about what this might be. Any
other takers?

James
Jul 22 '05 #3
Aguilar, James wrote:

I know my algorithm is correct, because it works on random cases between 10
and 100,


That way you know it is correct for the random cases you've
tried, but not that it is correct in general. To know that you'd
have to try it for all possible cases.

To address your problem -- you either need to allocate your arrays
dynamically, using 'new' -- that way they will be allocated from
the heap, not from the stack (let's forget for a moment that the
standard does not define 'heap' and 'stack') or you can try an
environment-specific way to increase the stack size. Go for the
first solution!

Under WIN32 or UNIX you can use this snippet to check how big
the stack is:

#include <iostream>

#ifndef __WIN32__
#include <sys/time.h> // getrlimit()
#include <sys/resource.h> // getrlimit()
#include <unistd.h> // getrlimit()
#else
#include <malloc.h>
#endif

using namespace std;

void check_stack() {
cerr << "Stack size is: ";
#ifndef __WIN32__
rlimit lim;
getrlimit(RLIMIT_STACK,&lim);
cerr << lim.rlim_cur << endl;
#else
size_t avail=stackavail();
cerr << avail << endl;
#endif
}

HTH,
- J.
Jul 22 '05 #4

"Aguilar, James" <jf**@cec.NOBOTSwustl.edu> skrev i en meddelelse
news:ch**********@newsreader.wustl.edu...
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2p************@uni-berlin.de...

[snip info about dynamic allocation]
Well, I tried that, and you're right, it was a GNU extension. I repaired
the error, but it seems that my problem still remains. In fact, there was
no difference in performance after the edit (i.e. still fails at 188,

still succeeds at 187) so I'm still up in the air about what this might be. Any
other takers?

James


I have not checked your code, but if you replace a stack-based array with a
dynamically allocated one and get the stack overflow at the same location in
problem space, there almost certainly is a problem with your algorithm. Why
don't you debug the program?

/Peter
Jul 22 '05 #5

"Peter Koch Larsen" <pk*****@mailme.dk> wrote in message
news:ZP********************@news000.worldonline.dk ...

"Aguilar, James" <jf**@cec.NOBOTSwustl.edu> skrev i en meddelelse
news:ch**********@newsreader.wustl.edu...
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2p************@uni-berlin.de...
>
> [snip info about dynamic allocation]


Well, I tried that, and you're right, it was a GNU extension. I repaired
the error, but it seems that my problem still remains. In fact, there
was
no difference in performance after the edit (i.e. still fails at 188,

still
succeeds at 187) so I'm still up in the air about what this might be.
Any
other takers?

James


I have not checked your code, but if you replace a stack-based array with
a
dynamically allocated one and get the stack overflow at the same location
in
problem space, there almost certainly is a problem with your algorithm.
Why
don't you debug the program?

/Peter


Code seems wrong to me. Looks to me like the divide could fail and the
entire x and y arrays be passed to one of the recursive calls.

Obviously this would result in an infinite recursion and a stack overflow.

john
Jul 22 '05 #6

"John Harrison" <jo*************@hotmail.com> wrote in message
news:2p************@uni-berlin.de...

"Peter Koch Larsen" <pk*****@mailme.dk> wrote in message
news:ZP********************@news000.worldonline.dk ...

"Aguilar, James" <jf**@cec.NOBOTSwustl.edu> skrev i en meddelelse
news:ch**********@newsreader.wustl.edu...
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2p************@uni-berlin.de...
>
> [snip info about dynamic allocation]

Well, I tried that, and you're right, it was a GNU extension. I
repaired
the error, but it seems that my problem still remains. In fact, there
was
no difference in performance after the edit (i.e. still fails at 188,

still
succeeds at 187) so I'm still up in the air about what this might be.
Any
other takers?

James


I have not checked your code, but if you replace a stack-based array with
a
dynamically allocated one and get the stack overflow at the same location
in
problem space, there almost certainly is a problem with your algorithm.
Why
don't you debug the program?

/Peter


Code seems wrong to me. Looks to me like the divide could fail and the
entire x and y arrays be passed to one of the recursive calls.

Obviously this would result in an infinite recursion and a stack overflow.

john


I'll check it out and report back.

James
Jul 22 '05 #7
"John Harrison" <jo*************@hotmail.com> wrote in message
news:2p************@uni-berlin.de...

Code seems wrong to me. Looks to me like the divide could fail and the
entire x and y arrays be passed to one of the recursive calls.

Obviously this would result in an infinite recursion and a stack overflow.

john


I found the problem. It was in this snippet:

----- CODE BEGINS -----

int bisect = pointsByX[nPoints/2]->x();
// sort Y lists to create new ones
Point* yL[nPoints];
Point* yR[nPoints];
int lenL = 0, lenR = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY[i]->x() <= bisect) {
yL[lenL] = pointsByY[i];
++lenL;
}
if (pointsByY[i]->x() >= bisect) {
yR[lenR] = pointsByY[i];
++lenR;
}
}

----- CODE ENDS -----

Which now looks like this:

----- CODE BEGINS -----

Point* bisect = pointsByX[nPoints/2];
// sort Y lists to create new ones
Point** yL = new Point*[nPoints];
Point** yR = new Point*[nPoints];
int lenL = 0, lenR = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY[i]->isLeftOf(bisect)) {
yL[lenL] = pointsByY[i];
++lenL;
}
if (!(pointsByY[i]->isLeftOf(bisect))) {
yR[lenR] = pointsByY[i];
++lenR;
}
}
----- CODE ENDS -----

isLeftOf takes into account height, so that even if two points are on the
bisector line (which becomes more and more likely with larger and larger
random inputs), the ones below the point which is being checked will return
true for isLeftOf and the ones above will return false. This is OK because
after pL and pR have been divided, they are merged for the checking of the
2*(minDist) strip, which means that even if the closest pair is one of the
two points on the line, it will still be caught.

The algorithm works like a charm, making relatively quick work of input
sizes up to n = 1000000 (13000 milliseconds). However, with inputs that
large, my computer really starts to load up, since that's 150 MiB just in
terms of the points that are actually created at the start. I guess that
means I could conceivably sort four million points on my computer, but
beyond that, it's probably going to overflow my memory.

The cool thing is how, even on inputs of as small as 10000, the divide and
conquer is already orders of magnitude faster than the brute force version.

Yours,

James
Jul 22 '05 #8

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

12
by: jimjim | last post by:
Hello, 1. As the subject sugests, are the objects created in the stack guarranted to have been created? 2. How can I verify this? (e.g. for objects created in the heap A *a = new A one can...
19
by: Jim | last post by:
I have spent the past few weeks designing a database for my company. The problem is I have started running into what I believe are stack overflow problems. There are two tab controls on the form...
4
by: Victor | last post by:
Hello, I've got a situation in which the number of (valid) recursive calls I make will cause stack overflow. I can use getrlimit (and setrlimit) to test (and set) my current stack size. ...
5
by: ip4ram | last post by:
I am quite puzzled by the stack overflow which I am encountering.Here is the pseudocode //define stack structure //function operating on stack void my_stack_function( function parameters) {...
0
by: Howard Kaikow | last post by:
I recently discovered http://msdn.microsoft.com/library/en-us/vclib/html/vclrf_resetstkoflw.asp, which states: "The _resetstkoflw function recovers from a stack overflow condition, enabling a...
24
by: John | last post by:
I know this is a very fundamental question. I am still quite confused if the program call stack stack should always grows upwards from the bottom, or the opposite, or doesn't matter?? That means...
7
by: amit.atray | last post by:
Environement : Sun OS + gnu tools + sun studio (dbx etc) having some Old C-Code (ansi + KR Style) and code inspection shows some big size variable (auto) allocated (on stack) say for ex. char...
87
by: CJ | last post by:
Hello: We know that C programs are often vulnerable to buffer overflows which overwrite the stack. But my question is: Why does C insist on storing local variables on the stack in the first...
5
by: moondaddy | last post by:
I'm trying to serialize some xaml elements using the XamlWriter and I'm getting a stack overflow error because the elements have some public properties which get called about a million ( a lot)...
0
by: Charles Arthur | last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
0
by: emmanuelkatto | last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud. Please let me know. Thanks! Emmanuel
0
BarryA
by: BarryA | last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
1
by: nemocccc | last post by:
hello, everyone, I want to develop a software for my android phone for daily needs, any suggestions?
1
by: Sonnysonu | last post by:
This is the data of csv file 1 2 3 1 2 3 1 2 3 1 2 3 2 3 2 3 3 the lengths should be different i have to store the data by column-wise with in the specific length. suppose the i have to...
0
by: Hystou | last post by:
There are some requirements for setting up RAID: 1. The motherboard and BIOS support RAID configuration. 2. The motherboard has 2 or more available SATA protocol SSD/HDD slots (including MSATA, M.2...
0
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
0
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.