urlretrieve() questions

I'm building an app that needs to download a file from the
web.

I'm trying to make sure I catch any issues with the download
but I've run into a problem.

here's what I have so far:

import urllib

try:
    urllib.urlretrieve(url, filename)
    print "File: ", filename, " downloaded"
except IOError:
    print "IOError File Not Found: ", url

Pretty straightforward...but what I'm finding is that if the
url points to a file that is not there, no IOError is raised: the
server returns a file that's a web page displaying a 404 error.

Anyone have any recommendations for handling this?
--

Rene
Dec 23 '05 #1
> Pretty straightforward...but what I'm finding is if the
> url is pointing to a file that is not there, the server
> returns a file that's a web page displaying a 404 error.
>
> Anyone have any recommendations for handling this?

You're right, that is NOT documented in a way that's easy to find!

What I was able to find is how to do what you want using urllib2 instead of
urllib. I found an old message thread that touches on the topic:
http://groups.google.com/group/comp....c7bfec87e18ba9
(also accessible as http://tinyurl.com/952dw). Here's a quick summary:
-----------------------------------------------------------------------

From: Ivan Karajas <my_full_name_concatena...@myrealbox.com>
Newsgroups: comp.lang.python
Date: Wed, 28 Apr 2004 23:03:54 -0800
Subject: Re: 404 errors
On Tue, 27 Apr 2004 10:46:47 +0200, Tut wrote:
> On Tue, 27 Apr 2004 11:00:57 +0800, Derek Fountain wrote:
>> Some servers respond with a nicely formatted bit of HTML explaining the
>> problem, which is fine for a human, but not for a script. Is there some
>> flag or something definitive on the response which says "this is a 404
>> error"?
>
> Maybe catch the urllib2.HTTPError?


This kind of answers the question. urllib will let you read whatever it
receives, regardless of the HTTP status; you need to use urllib2 if you
want to find out the status code when a request results in an error (any
HTTP status beginning with a 4 or 5). This can be done like so:

import urllib2
try:
    asock = urllib2.urlopen("http://www.foo.com/qwerty.html")
except urllib2.HTTPError, e:
    print e.code

The value in urllib2.HTTPError.code comes from the first line of the web
server's HTTP response, just before the headers begin, e.g. "HTTP/1.1 200
OK", or "HTTP/1.1 404 Not Found".

One thing you need to be aware of is that some web sites don't behave as
you would expect them to; e.g. responding with a redirection rather than a
404 error when you request a page that doesn't exist. In these cases you
might still have to rely on some clever scripting.
----------------------------------------------------------------------
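
For anyone reading this on Python 3, where urllib2 was folded into urllib.request and urllib.error, the same idea can be applied to the original download problem roughly as follows. This is only a sketch and not code from the quoted thread: the function name download and its (ok, detail) return convention are made up for illustration. Because urlopen raises HTTPError on a 4xx or 5xx status before the output file is opened, the server's 404 page should never end up on disk.

```python
import shutil
import urllib.error
import urllib.request


def download(url, filename):
    """Save url to filename and return (ok, detail).

    Hypothetical helper for illustration. On success, detail is the
    final URL after any redirects. On failure, detail is the HTTP
    status code or the lower-level failure reason.
    """
    try:
        # urlopen raises HTTPError for 4xx/5xx responses, so the
        # output file is only created once the request has succeeded.
        with urllib.request.urlopen(url) as response:
            with open(filename, "wb") as out:
                shutil.copyfileobj(response, out)
            return True, response.geturl()
    except urllib.error.HTTPError as e:
        # e.code is the numeric status, e.g. 404.
        return False, e.code
    except urllib.error.URLError as e:
        # No HTTP response at all (DNS failure, refused connection, ...).
        return False, e.reason
```

Calling ok, detail = download(url, filename) and checking ok replaces the IOError test from the first post. For the redirect case mentioned above, comparing detail with the requested url on success is one possible way to notice that the server sent you somewhere else, though what counts as a "not found" redirect still depends on the site.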

I hope that helps.

Dan
Dec 23 '05 #2
