Bytes | Software Development & Data Engineering Community
Problem with urllib.urlretrieve

Hi,

I am writing a program to download all the images from a specified site.
It already works with most sites, but in some cases, such as
www.slashdot.org, it only downloads 1 KB of the image. That 1 KB is an HTML
page with a 503 error.

What can I do to actually get those images?

Thanks.

Your help is appreciated.
Jul 18 '05 #1
On 11 Jun 2004 16:01:01 -0700, ra*****@gmail.com (ralobao) wrote:
> Hi,
>
> I am writing a program to download all the images from a specified site.
> It already works with most sites, but in some cases, such as
> www.slashdot.org, it only downloads 1 KB of the image. That 1 KB is an HTML
> page with a 503 error.
>
> What can I do to actually get those images?
>
> Thanks.
>
> Your help is appreciated.

I did something like this a while ago. I used websucker.py in the
Tools/ directory, and then added some conditionals to tell it to only
create files for certain extensions.
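
A conditional of that kind might look something like the sketch below. This is not websucker.py's own code, and it is written for present-day Python 3 rather than the Python 2 of this thread; `is_wanted` and `IMAGE_EXTENSIONS` are made-up names for illustration.

```python
import posixpath
from urllib.parse import urlparse

# Hypothetical allow-list; adjust to the image types you care about.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".gif"}


def is_wanted(url, extensions=IMAGE_EXTENSIONS):
    """Return True if the URL path ends in one of the allowed extensions."""
    path = urlparse(url).path          # drops the query string and fragment
    ext = posixpath.splitext(path)[1]  # e.g. ".GIF"
    return ext.lower() in extensions
```

The crawler would then call `is_wanted(url)` before saving anything and skip the URL when it returns False.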

As to why it fails in your case, (/me puts on psychic hat) I'm guessing
slashdot does something to stop people from deep-linking to their image
files, to keep leeches away.
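
If that guess is right, one thing to try is sending browser-like `User-Agent` and `Referer` headers and checking what actually comes back before writing it to disk. The following is only a sketch, in present-day Python 3 (`urllib.request`) rather than the Python 2 `urllib` used in this thread. Whether slashdot looks at these headers at all is an assumption, and the header values and the names `build_request`, `looks_like_image` and `fetch_image` are invented for the example.

```python
import urllib.error
import urllib.request


def build_request(url, referer, user_agent="Mozilla/5.0 (compatible; image-fetcher)"):
    """Build a GET request carrying browser-like headers (assumed to matter)."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": user_agent, "Referer": referer},
    )


def looks_like_image(content_type):
    """True if a Content-Type header value names an image type."""
    return (content_type or "").split(";")[0].strip().lower().startswith("image/")


def fetch_image(url, referer, dest, timeout=30):
    """Save url to dest only if the server really returned an image."""
    try:
        with urllib.request.urlopen(build_request(url, referer), timeout=timeout) as resp:
            if not looks_like_image(resp.headers.get("Content-Type")):
                return False  # probably an HTML error page, not the image
            data = resp.read()
    except urllib.error.HTTPError as err:
        # urlopen raises on 4xx/5xx responses such as the 503 described above.
        print("server refused %s: %s %s" % (url, err.code, err.reason))
        return False
    with open(dest, "wb") as out:
        out.write(data)
    return True
```

A call would pass the page the image was found on as the referer, for example `fetch_image(image_url, page_url, "image.gif")`. Checking the status and `Content-Type` first is what avoids saving a 1 KB HTML error page under an image file name.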
<{{{*>


Jul 18 '05 #2
