
Network failure when using urllib2

I have a script that uses urllib2 to repeatedly look up web pages (in a
spider sort of way). It appears to function normally, but if it runs
too long I start to get 404 responses. If I then try to use the internet
through any other program (Outlook, Firefox, etc.), that fails as well.
If I stop the script, the internet returns.

Has anyone observed this behavior before? I am relatively new to
Python and would appreciate any suggestions.

Shuad

Jan 8 '07 #1
4 Replies



jd****@gmail.com wrote:
> I have a script that uses urllib2 to repeatedly look up web pages (in a
> spider sort of way). It appears to function normally, but if it runs
> too long I start to get 404 responses. If I then try to use the internet
> through any other program (Outlook, Firefox, etc.), that fails as well.
> If I stop the script, the internet returns.
>
> Has anyone observed this behavior before? I am relatively new to
> Python and would appreciate any suggestions.
>
> Shuad

I am assuming that you are fetching the full page every little while.
You are not supposed to do that. The admin of the web site you are
constantly hitting probably configured his server to block you
temporarily when that happens. But don't feel bad :-). This is a common
beginner's mistake.

Read here on the proper way to do this:
http://diveintopython.org/http_web_services/review.html
See especially 11.3.3, Last-Modified/If-Modified-Since, on the next page.
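
For readers following along, here is a minimal sketch of the conditional-GET idea from that chapter. It is written against Python 3's urllib.request (the successor to urllib2), and the function names and the open_url parameter are hypothetical, introduced only for illustration. It sends back the Last-Modified and ETag values from an earlier fetch and treats a 304 reply as "unchanged".

```python
import urllib.error
import urllib.request


def build_conditional_request(url, last_modified=None, etag=None):
    """Build a GET request that lets the server skip unchanged pages."""
    headers = {}
    if last_modified:
        headers["If-Modified-Since"] = last_modified
    if etag:
        headers["If-None-Match"] = etag
    return urllib.request.Request(url, headers=headers)


def fetch_if_changed(url, last_modified=None, etag=None,
                     open_url=urllib.request.urlopen):
    """Return (body, last_modified, etag); body is None if nothing changed.

    open_url is a hypothetical hook so the function can be tried offline.
    """
    request = build_conditional_request(url, last_modified, etag)
    try:
        response = open_url(request)
    except urllib.error.HTTPError as err:
        if err.code == 304:
            # Not Modified: keep using the cached copy and validators.
            return None, last_modified, etag
        raise
    try:
        body = response.read()
        # Remember the validators the server sent for the next request.
        return (body,
                response.headers.get("Last-Modified"),
                response.headers.get("ETag"))
    finally:
        response.close()
```

On the first call, pass no validators; on later calls, pass the two values returned previously. This only saves bandwidth when the same URL is requested more than once.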

Ravi Teja.

Jan 9 '07 #2

I am fetching different web pages (never the same one) from a web
server. Does that make a difference to whether they would try to block me?
Also, if it were only that site blocking me, then why does the internet
stop working in other programs when this happens in the script? It is
almost as if something sees a lot of traffic from my computer and
cuts it off, thinking it is some kind of virus or worm. I am
starting to suspect my firewall. Has anyone else had this happen?

I am going to read over that documentation you suggested to see if I
can get any ideas. Thanks for the link.

Shuad

On Jan 8, 4:15 pm, "Ravi Teja" <webravit...@gmail.com> wrote:
> jdv...@gmail.com wrote:
>> I have a script that uses urllib2 to repeatedly lookup web pages (in a
>> spider sort of way). It appears to function normally, but if it runs
>> too long I start to get 404 responses. If I try to use the internet
>> through any other programs (Outlook, FireFox, etc.) it will also fail.
>> If I stop the script, the internet returns.
>>
>> Has anyone observed this behavior before? I am relatively new to
>> Python and would appreciate any suggestions.
>>
>> Shuad
>
> I am assuming that you are fetching the full page every little while.
> You are not supposed to do that. The admin of the web site you are
> constantly hitting probably configured his server to block you
> temporarily when that happens. But don't feel bad :-). This is a common
> Beginners mistake.
>
> Read here on the proper way to do this.
> http://diveintopython.org/http_web_services/review.html
> especially 11.3.3. Last-Modified/If-Modified-Since in the next page
>
> Ravi Teja.

Jan 9 '07 #3

At Monday 8/1/2007 21:30, jd****@gmail.com wrote:
> I am fetching different web pages (never the same one) from a web
> server. Does that make a difference with them trying to block me?
> Also, if it was only that site blocking me, then why does the internet
> not work in other programs when this happens in the script. It is
> almost like something is seeing a lot of traffic from my computer, and
> cutting it off thinking it is some kind of virus or worm. I am
> starting to suspect my firewall. Anyone else have this happen?

Perhaps you're not closing connections once you have finished with them?
Try netstat -an from the command line and see how many open
connections you have.
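
If unclosed connections do turn out to be the cause, a loop along the following lines makes sure every response is closed before the next request starts. This is only a sketch under assumptions: it uses Python 3's urllib.request rather than urllib2, the names crawl, open_url and sleep are hypothetical, and the pause between requests is an extra precaution that nobody in the thread asked for.

```python
import time
import urllib.request
from contextlib import closing


def crawl(urls, delay_seconds=1.0,
          open_url=urllib.request.urlopen, sleep=time.sleep):
    """Fetch each URL in turn, closing every response before moving on.

    open_url and sleep are hypothetical hooks so the loop can be tried
    without touching the network.
    """
    pages = {}
    for index, url in enumerate(urls):
        if index:
            # Assumed politeness delay between consecutive requests.
            sleep(delay_seconds)
        # closing() calls response.close() even if read() raises.
        with closing(open_url(url)) as response:
            pages[url] = response.read()
    return pages
```

Running netstat -an before and after switching to a loop like this should show whether the number of open connections stops growing.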
--
Gabriel Genellina
Softlab SRL



Jan 9 '07 #4


jd****@gmail.com wrote:
> I am fetching different web pages (never the same one) from a web
> server. Does that make a difference with them trying to block me?
> Also, if it was only that site blocking me, then why does the internet
> not work in other programs when this happens in the script. It is
> almost like something is seeing a lot of traffic from my computer, and
> cutting it off thinking it is some kind of virus or worm. I am
> starting to suspect my firewall. Anyone else have this happen?
>
> I am going to read over that documentation you suggested to see if I
> can get any ideas. Thanks for the link.
>
> Shuad

No! What I suggested should not affect traffic from other servers. I
would go with Gabriel's suggestion and check for open connections, just
in case. I can't imagine why that would give you a 404 response, though:
a 404 comes from the server, which implies a successful connection.
If connections were the problem, I would expect a client-side error instead.
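
To see which of the two cases the script is actually hitting, the exception type can be checked. The sketch below assumes Python 3, where the urllib2 exceptions live in urllib.error; the function name probe and the open_url parameter are hypothetical. HTTPError means a server answered with an error status, while a plain URLError means no HTTP response arrived at all.

```python
import urllib.error
import urllib.request


def probe(url, open_url=urllib.request.urlopen):
    """Return a short label saying where a request succeeded or failed.

    open_url is a hypothetical hook so the function can be tried offline.
    """
    try:
        response = open_url(url)
    except urllib.error.HTTPError as err:
        # A server was reached and answered with an error status (e.g. 404).
        # HTTPError subclasses URLError, so it must be caught first.
        return "server responded: HTTP %d" % err.code
    except urllib.error.URLError as err:
        # No HTTP response: DNS failure, refused connection, and so on.
        return "no response: %s" % err.reason
    try:
        return "ok: HTTP %d" % response.status
    finally:
        response.close()
```

Logging this label for each request, together with the URL, would show whether the failures really are 404 replies or connection-level errors.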

Of course, you can always test your suspicion of a local cause
(turn off security software briefly, or try from a different machine).
That will not rule everything out, though, if your ISP implements
safeguards against DoS attacks originating from its network, with
normal users in mind.

Ravi Teja.

Jan 9 '07 #5
