Determine Whether File Exists On HTTP Server

Hi, I'm trying to determine whether a given URL exists. I'm new to Python
but I think that urllib is the tool for the job. However, if I give it a
non-existent file, it simply returns the 404 page. Aside from grepping this
for '404', is there a better way to do this? (Preferably, there is a
solution that can be applied to both HTTP and FTP.) Thanks in advance.
Jul 18 '05 #1
On Saturday 22 May 2004 12:28 am, OvErboRed wrote:
Hi, I'm trying to determine whether a given URL exists. I'm new to Python
but I think that urllib is the tool for the job. However, if I give it a
non-existent file, it simply returns the 404 page. Aside from grepping this
for '404', is there a better way to do this? (Preferably, there is a
solution that can be applied to both HTTP and FTP.) Thanks in advance.


Try urllib2.urlopen, and put a try/except block around it. Here's what an
unhandled exception from a 404 response looks like:

Python 2.3.3 (#1, May 14 2004, 09:49:22)
[GCC 3.3.2 20031218 (Gentoo Linux 3.3.2-r5, propolice-3.3-7)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import urllib2
>>> handle = urllib2.urlopen('http://google.com/this_page_doesnt_exist')
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/urllib2.py", line 129, in urlopen
    return _opener.open(url, data)
  File "/usr/lib/python2.3/urllib2.py", line 326, in open
    '_open', req)
  File "/usr/lib/python2.3/urllib2.py", line 306, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.3/urllib2.py", line 901, in http_open
    return self.do_open(httplib.HTTP, req)
  File "/usr/lib/python2.3/urllib2.py", line 895, in do_open
    return self.parent.error('http', req, fp, code, msg, hdrs)
  File "/usr/lib/python2.3/urllib2.py", line 346, in error
    result = self._call_chain(*args)
  File "/usr/lib/python2.3/urllib2.py", line 306, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.3/urllib2.py", line 472, in http_error_302
    return self.parent.open(new)
  File "/usr/lib/python2.3/urllib2.py", line 326, in open
    '_open', req)
  File "/usr/lib/python2.3/urllib2.py", line 306, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.3/urllib2.py", line 901, in http_open
    return self.do_open(httplib.HTTP, req)
  File "/usr/lib/python2.3/urllib2.py", line 895, in do_open
    return self.parent.error('http', req, fp, code, msg, hdrs)
  File "/usr/lib/python2.3/urllib2.py", line 352, in error
    return self._call_chain(*args)
  File "/usr/lib/python2.3/urllib2.py", line 306, in _call_chain
    result = func(*args)
  File "/usr/lib/python2.3/urllib2.py", line 412, in http_error_default
    raise HTTPError(req.get_full_url(), code, msg, hdrs, fp)
urllib2.HTTPError: HTTP Error 404: Not Found
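
For readers on a current Python: the try/except idea could look roughly like the sketch below. It assumes Python 3, where urllib2 was folded into urllib.request and urllib.error. The helper name url_exists and the timeout value are made up for illustration and are not from the thread.

```python
import urllib.error
import urllib.request


def url_exists(url, timeout=10):
    """Return True if the URL opens, False if the server answers 404.

    Hypothetical helper for illustration. Any other failure (403, 500,
    DNS errors, timeouts) is re-raised, because in those cases we simply
    do not know whether the resource exists.
    """
    try:
        # urlopen raises HTTPError for 4xx/5xx responses instead of
        # handing back the error page, so no grepping for '404' is needed.
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError as err:
        if err.code == 404:
            return False
        raise
```

Calling url_exists("http://www.python.org/") would be the typical use. urlopen also accepts ftp:// URLs; as far as I know, FTP failures surface there as the more general urllib.error.URLError rather than HTTPError, so a version covering both schemes would have to catch that as well and decide how to treat it.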

--
Troy Melhase, tr**@gci.net
--
When Christ calls a man, he bids him come and die. - Dietrich Bonhoeffer
Jul 18 '05 #2
This works with HTTP:

import sys      # exc_info
import httplib  # HTTPConnection

HOST = "www.python.org"
PAGE = "/path/to/some/file.html"

try:
    c = httplib.HTTPConnection( HOST )
    # c._http_vsn = 10; c._http_vsn_str = "HTTP/1.0"
    c.connect( )
    c.putrequest ( "GET", PAGE )
    c.endheaders()
    r = c.getresponse()
    print "%s\n%s\n%s\n" % (r.status, r.reason, r.msg)
    if r.status == 200: # OK
        print "%s exists" % PAGE
        PageContent = r.read() # this is the requested html file in a string
    elif r.status == 404: # not found
        print "%s does not exist" % PAGE
        Page404 = r.read() # this is the 404 page in a string
    else:
        print "%s : status %s %s %s" % (PAGE, r.status, r.reason, r.msg)
except:
    print sys.exc_info()[1]
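
The same status check might be written as follows on Python 3, where httplib is named http.client. This is only a sketch: it swaps the GET above for a HEAD request so that no page body is transferred, and the names head_status and page_exists are invented for the example.

```python
import http.client


def head_status(host, path, port=80, timeout=10):
    """Send a HEAD request and return (status, reason) from the reply."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("HEAD", path)
        resp = conn.getresponse()
        resp.read()  # a HEAD reply carries no body; this completes the exchange
        return resp.status, resp.reason
    finally:
        conn.close()


def page_exists(host, path, port=80):
    """Return True for 200, False for 404, None when the status is inconclusive."""
    status, _reason = head_status(host, path, port)
    if status == 200:
        return True
    if status == 404:
        return False
    return None  # redirects, 403, 5xx, ...: existence is not determined
```

Some servers answer HEAD with 405 or otherwise handle it poorly; in that case falling back to a GET, as in the code above, is the safer choice.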

Greetings
Harald Walter

"OvErboRed" <pu******@SPAMoverbored.net> wrote in message
news:Xn*****************************@127.0.0.1...
Hi, I'm trying to determine whether a given URL exists. I'm new to Python
but I think that urllib is the tool for the job. However, if I give it a
non-existent file, it simply returns the 404 page. Aside from grepping this
for '404', is there a better way to do this? (Preferably, there is a
solution that can be applied to both HTTP and FTP.) Thanks in advance.

Jul 18 '05 #3
