Error with long running web spider

Hi everyone:

I have a spider that is relatively long running (somewhere between
12 and 24 hours). My problem is that I keep having an issue where the
program appears to freeze. Once this freezing happens the activity of
the program drops to zero. No exception is thrown or caught. The
program simply stops doing anything. It even stops printing out its
activity to stdout. The program itself appears to run in about 14
megs of memory. Basically, the program looks up pages on a particular
website, and then reads the HTML of those pages, parses it (lots of
long regular expressions are used), and saves the found information to
an object (which is later translated to SQL and the SQL is written to
a file).

I've actually had this same problem with several long running Python
programs. Any ideas?

Thanks in advance.

Aug 22 '07 #1
On Aug 22, 10:58 am, Josh Volz <jdv...@gmail.com> wrote:

I'm running this program on Windows XP, using Python 2.5. I'm using
ActiveState Komodo IDE 4.0 as the run environment.

Thanks,
J.

Aug 22 '07 #2
If you were running under Unix I'd suggest you "strace" the process to
see what it is doing. There are Windows strace programs (which I've
never tried) too!

You'll probably find it is wedged in TCP socket code.

--
Nick Craig-Wood <ni**@craig-wood.com> -- http://www.craig-wood.com/nick
Aug 22 '07 #3
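
If the process really is wedged in TCP socket code, as suggested above, one common cause is a blocking read with no timeout: the remote server stops sending, and the read waits forever without raising anything. The sketch below is a minimal illustration of bounding each request with a timeout, not the original poster's code. It assumes a modern Python 3 interpreter; the function name fetch and the opener parameter are hypothetical, introduced only so the example can be exercised without a network. On Python 2.5, where urllib2.urlopen takes no timeout argument, socket.setdefaulttimeout() is the closest equivalent.

```python
import socket
import urllib.request


def fetch(url, timeout=30.0, opener=urllib.request.urlopen):
    """Return the page body as bytes, or None if the request stalls or fails.

    `opener` is a hypothetical hook (not part of any library API) that
    defaults to urllib.request.urlopen; it exists so the function can be
    tried out with a stand-in instead of a real network connection.
    """
    try:
        response = opener(url, timeout=timeout)
        try:
            return response.read()
        finally:
            response.close()
    except OSError as exc:
        # socket.timeout and urllib.error.URLError are both OSError
        # subclasses, so a stalled or refused connection ends up here
        # instead of blocking the spider indefinitely.
        print("giving up on %s: %r" % (url, exc))
        return None
```

A spider loop could call fetch(url) for each page and skip or re-queue any URL for which it returns None, so that a single unresponsive server does not stop the whole run.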
In message <11**********************@l22g2000prc.googlegroups.com>, Josh
Volz wrote:
My problem is that I keep having an issue where the
program appears to freeze. Once this freezing happens the activity of
the program drops to zero. No exception is thrown or caught. The
program simply stops doing anything. It even stops printing out its
activity to stdout.

What happens afterwards? Does it continue running as though nothing had
happened? Throw an exception?

From the output that appears beforehand, does it look like the freeze is
always happening in the same place?
Aug 27 '07 #4
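
One way to answer the question of whether the freeze always happens in the same place, without an external tracer, is to have the program report its own stack periodically. The sketch below is an illustration under stated assumptions, not something proposed in the thread: it targets modern Python 3, and the names format_all_stacks and start_watchdog are hypothetical. It relies on sys._current_frames(), which has been available since Python 2.5.

```python
import sys
import threading
import time
import traceback


def format_all_stacks():
    """Return a text report of the current call stack of every thread."""
    names = {t.ident: t.name for t in threading.enumerate()}
    parts = []
    for thread_id, frame in sys._current_frames().items():
        parts.append("Thread %s (%s):\n" % (thread_id, names.get(thread_id, "?")))
        # format_stack returns newline-terminated strings, one per frame.
        parts.extend(traceback.format_stack(frame))
    return "".join(parts)


def start_watchdog(interval_seconds, out=sys.stderr):
    """Write all thread stacks to `out` every `interval_seconds` seconds."""
    def run():
        while True:
            time.sleep(interval_seconds)
            out.write(format_all_stacks())
            out.flush()

    # A daemon thread does not keep the process alive once the spider exits.
    thread = threading.Thread(target=run, name="stack-watchdog", daemon=True)
    thread.start()
    return thread
```

Calling start_watchdog(600) near the start of the spider would log a stack report every ten minutes; after a freeze, the last few reports show which call the main thread is sitting in. This works when the main thread is blocked in I/O, because blocking calls release the interpreter lock. If the hang is inside C code that holds the lock, the watchdog thread may not get to run; in that case the standard library's faulthandler.dump_traceback_later() is likely the better tool.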
