Problem using urllib to download images

tstrogen

I am using Python 2.6 on Mac OS 10.3.9.
I have been trying to use:
image = urllib.URLopener()
image.retrieve(url, filename)
to download images from websites. I am able to do so, and end up with
the appropriate file. However, when I try to open the file, I get an
error message. It's something about corrupted data, and an
unrecognised file.
Anyone know what I'm talking about/had similar experiences?
-Taidgh

Nov 3 '08 #1

drobinow

On Nov 3, 11:48 am, tstro...@googlemail.com wrote:

> I am using Python 2.6 on Mac OS 10.3.9.
> I have been trying to use:
> image = urllib.URLopener()
> image.retrieve(url, filename)
> to download images from websites. I am able to do so, and end up with
> the appropriate file. However, when I try to open the file, I get an
> error message. It's something about corrupted data, and an
> unrecognised file.
> Anyone know what I'm talking about/had similar experiences?
> -Taidgh

Please show an actual program, complete with error messages.

import urllib
image = urllib.URLopener()
image.retrieve("http://www.python.org/images/success/nasa.jpg",
               "NASA.jpg")
Works for me.

Nov 3 '08 #2

tstrogen

Then perhaps it's a problem with my os.
[TERMINAL SESSION]
[18:16:33 Mon Nov 03] python
Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
[GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
Type "help", "copyright", "credits" or "license" for more information.

>>> import urllib
>>> url = 'http://www.google.com/webhp?hl=en'
>>> filename = 'logo.gif'
>>> image = urllib.URLopener()
>>> image.retrieve(url, filename)

('logo.gif', <httplib.HTTPMessage instance at 0x5196e8>)
[/TERMINAL SESSION]
And here's the error message I get when I try to open it: "File Error:
Couldn't open the file. It may be corrupt or a file format that
Preview doesn't recognize.".
I have had a similar result trying to open it with other programs.
-Taidgh

Nov 3 '08 #3

Jerry Hill

On Mon, Nov 3, 2008 at 2:21 PM, <ts******@googlemail.com> wrote:

> Then perhaps it's a problem with my os.
> [TERMINAL SESSION]
> [18:16:33 Mon Nov 03] python
> Python 2.6 (trunk:66714:66715M, Oct 1 2008, 18:36:04)
> [GCC 4.0.1 (Apple Computer, Inc. build 5370)] on darwin
> Type "help", "copyright", "credits" or "license" for more information.
>
> >>> import urllib
> >>> url = 'http://www.google.com/webhp?hl=en'

That's not the URL of an image file. Maybe you're looking for
url = 'http://www.google.com/intl/en_ALL/images/logo.gif'

> >>> filename = 'logo.gif'
> >>> image = urllib.URLopener()
> >>> image.retrieve(url, filename)
>
> ('logo.gif', <httplib.HTTPMessage instance at 0x5196e8>)
> [/TERMINAL SESSION]
> And here's the error message I get when I try to open it: "File Error:
> Couldn't open the file. It may be corrupt or a file format that
> Preview doesn't recognize.".
> I have had a similar result trying to open it with other programs.

That's because you downloaded some HTML and saved it in a file named
logo.gif. That's unlikely to work in any image viewing program. Try
opening the file you downloaded in a text editor and you'll see.

--
Jerry
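
The text-editor check above can also be done in code. What follows is only a rough sketch, not part of the original reply: the helper names (`sniff_image_type`, `looks_like_html`, `describe_file`) are made up for illustration, and the signature table covers just GIF, PNG and JPEG. It reads the first bytes of the saved file and reports whether they look like an image or like an HTML page.

```python
# Hypothetical helpers for illustration; none of these names come from urllib.

# Leading "magic" bytes of a few common image formats.
IMAGE_SIGNATURES = {
    b"GIF87a": "gif",
    b"GIF89a": "gif",
    b"\x89PNG\r\n\x1a\n": "png",
    b"\xff\xd8\xff": "jpeg",
}


def sniff_image_type(data):
    """Return the image format suggested by the leading bytes, or None."""
    for signature, name in IMAGE_SIGNATURES.items():
        if data.startswith(signature):
            return name
    return None


def looks_like_html(data):
    """Heuristic only: does the content start like an HTML page?"""
    head = data[:512].lstrip().lower()
    return head.startswith(b"<!doctype html") or head.startswith(b"<html")


def describe_file(path):
    """Read the first bytes of a saved download and guess what it is."""
    with open(path, "rb") as handle:
        head = handle.read(512)
    kind = sniff_image_type(head)
    if kind is not None:
        return kind
    return "html" if looks_like_html(head) else "unknown"
```

Calling `describe_file('logo.gif')` on a file that really holds a web page should come back as `'html'` rather than `'gif'`, which points at the URL and not at the download code. The heuristic is deliberately crude; a page that starts with something other than a doctype or `<html>` tag is reported as `'unknown'`.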

Nov 3 '08 #4

tstrogen

> That's because you downloaded some HTML and saved it in a file named
> logo.gif. That's unlikely to work in any image viewing program. Try
> opening the file you downloaded in a text editor and you'll see.
>
> --
> Jerry

Aha, so the first param is the URL of the file itself, and the second
is the name you save the file as. Thank you for pointing out my stupid
mistake. I was confused by trying to replicate a program called
'comicdownloader.py' from uselesspython.com. I thought that the first
param was the page containing the file, and the second was the file,
and that the file would simply be saved under its name on the website.
Thanks again.
-Taidgh
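
For anyone who does want the file saved under its name on the website, here is a minimal sketch of one way to get that behaviour. It is an illustration under assumptions, not something posted in the thread: `filename_from_url` and `download_as_named_on_site` are hypothetical names, the `download.bin` fallback is arbitrary, and the module-level `urlretrieve` function is used in place of `URLopener().retrieve`. The import fallback lets it run on the Python 2 used above as well as on Python 3.

```python
import posixpath

try:  # Python 3
    from urllib.parse import urlsplit, unquote
    from urllib.request import urlretrieve
except ImportError:  # Python 2, as used in this thread
    from urlparse import urlsplit
    from urllib import unquote, urlretrieve


def filename_from_url(url, default="download.bin"):
    """Return the last path component of the URL, or `default` if it is empty."""
    path = urlsplit(url).path  # drops the query string and fragment
    name = posixpath.basename(unquote(path))
    return name or default


def download_as_named_on_site(url):
    """Fetch `url` and save it locally under the name taken from the URL."""
    filename = filename_from_url(url)
    urlretrieve(url, filename)  # first argument: what to fetch; second: where to save
    return filename
```

With the image URL suggested earlier, `filename_from_url` gives `logo.gif`. The name is derived from the URL alone, so it says nothing about what the server actually returns; a page URL such as the `webhp` one would still be saved as HTML.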

Nov 3 '08 #5
