
Error with long running web spider

Hi everyone:

I have a spider that is relatively long running (somewhere between
12-24 hours). My problem is that I keep having an issue where the
program appears to freeze. Once this freezing happens the activity of
the program drops to zero. No exception is thrown or caught. The
program simply stops doing anything. It even stops printing out its
activity to stdout. The program itself appears to run in about 14
megs of memory. Basically, the program looks up pages on a particular
website, and then reads the HTML of those pages, parses it (lots of
long regular expressions are used), and saves the found information to
an object (which is later translated to SQL and the SQL is written to
a file).

I've actually had this same problem with several long running Python
programs. Any ideas?

Thanks in advance.

Aug 22 '07 #1
3 Replies

On Aug 22, 10:58 am, Josh Volz <jdv...@gmail.com> wrote:

I'm running this program on Windows XP, using Python 2.5. I'm using
ActiveState Komodo IDE 4.0 as the run environment.

Thanks,
J.

Aug 22 '07 #2

Josh Volz <jd****@gmail.com> wrote:
> I have a spider that is relatively long running (somewhere between
> 12-24 hours). My problem is that I keep having an issue where the
> program appears to freeze. Once this freezing happens the activity of
> the program drops to zero. No exception is thrown or caught. The
> program simply stops doing anything. It even stops printing out its
> activity to stdout. The program itself appears to run in about 14
> megs of memory. Basically, the program looks up pages on a particular
> website, and then reads the HTML of those pages, parses it (lots of
> long regular expressions are used), and saves the found information to
> an object (which is later translated to SQL and the SQL is written to
> a file).
>
> I've actually had this same problem with several long running Python
> programs. Any ideas?

If you were running under Unix I'd suggest you "strace" the process to
see what it is doing. There are Windows strace programs (which I've
never tried) too!

You'll probably find it is wedged in TCP socket code.
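
If the hang really is in socket code, one common way to keep a stalled
connection from blocking forever is to put a timeout on each request. What
follows is only a minimal sketch, assuming Python 3's urllib.request; the
function name fetch and the 30-second limit are illustrative assumptions, not
anything from the original program. On Python 2.5, as used in this thread,
urlopen() has no timeout argument, so socket.setdefaulttimeout() would be the
closest equivalent.

```python
import urllib.request

# Assumption: 30 seconds is an arbitrary illustrative limit.
TIMEOUT_SECONDS = 30


def fetch(url, timeout=TIMEOUT_SECONDS):
    """Return the page body as bytes, or None if the request fails or stalls."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.read()
    except OSError as exc:
        # URLError and socket timeouts are both subclasses of OSError,
        # so a refused, reset, or stalled connection all land here.
        print("fetch failed for %s: %r" % (url, exc), flush=True)
        return None
```

The timeout applies to each blocking socket operation rather than to the
download as a whole, so a server that trickles out data slowly can still take
longer than the limit. A caller would check for None and skip or retry the page.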

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Aug 22 '07 #3

In message <11**********************@l22g2000prc.googlegroups.com>, Josh
Volz wrote:
> My problem is that I keep having an issue where the
> program appears to freeze. Once this freezing happens the activity of
> the program drops to zero. No exception is thrown or caught. The
> program simply stops doing anything. It even stops printing out its
> activity to stdout.

What happens afterwards? Does it continue running as though nothing had
happened? Throw an exception?

From the output that appears beforehand, does it look like the freeze is
always happening in the same place?
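
One way to answer that question is to log each step with a timestamp, flushed
immediately, and to arm a watchdog that dumps a traceback when a step takes too
long. This is a rough sketch, assuming Python 3's faulthandler module (not
available in the Python 2.5 used in this thread); process_page, crawl, and the
300-second threshold are hypothetical names and values chosen for illustration.

```python
import faulthandler
import sys
import time


def process_page(url):
    """Hypothetical stand-in for the spider's fetch-and-parse step."""
    time.sleep(0.01)


def crawl(urls, stall_seconds=300, log=None):
    """Process each URL, logging progress and arming a stall watchdog."""
    log = log or sys.stderr
    done = 0
    try:
        for url in urls:
            # Re-arm the watchdog for this URL. If the step below takes
            # longer than stall_seconds, the tracebacks of all threads
            # are written to `log`, showing the line that is stuck.
            faulthandler.dump_traceback_later(stall_seconds, file=log)
            print("%s start %s" % (time.strftime("%H:%M:%S"), url),
                  file=log, flush=True)
            process_page(url)
            done += 1
    finally:
        # Disarm the watchdog before `log` can be closed.
        faulthandler.cancel_dump_traceback_later()
    return done
```

The last "start" line in the log shows which page was in progress when the
program froze, and the dumped traceback shows whether it was sitting in a
socket read, a regular expression match, or somewhere else.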
Aug 27 '07 #4
