Finding Default Page Name using urllib2

Is there a way to find the name of a page you are retrieving using
Python? For example, if I get http://www.cnn.com/, I want to know that
the page is index.html. I can do this using wget, as seen in the output
below. Can I do this in Python?

Thanks,

$ wget cnn.com
--11:15:25-- http://cnn.com/
           => `index.html'
Resolving cnn.com... 157.166.226.25, 157.166.226.26, 157.166.224.25, ...
Connecting to cnn.com|157.166.226.25|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.cnn.com/ [following]
--11:15:25-- http://www.cnn.com/
           => `index.html'
Resolving www.cnn.com... 157.166.224.25, 157.166.224.26, 157.166.226.25, ...
Reusing existing connection to cnn.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 96,094 (94K) [text/html]

100%[====================================>] 96,094 68.15K/s

11:15:28 (67.99 KB/s) - `index.html' saved [96094/96094]
Oct 27 '08 #1

On Oct 27, 2008, at 12:17 PM, barrett wrote:
> Is there a way to find the name of a page you are retrieving using
> Python? For example, if I get http://www.cnn.com/, I want to know that
> the page is index.html. I can do this using wget, as seen in the output
> below. Can I do this in Python?
Hi barrett,
Look into the urllib2 module, specifically its HTTPRedirectHandler
objects.
Good luck
Philip
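
One way to follow up on the HTTPRedirectHandler suggestion, sketched in Python 3, where urllib2 became urllib.request. The class name RecordingRedirectHandler and the helper fetch_with_redirects are made up for this illustration; redirect_request and geturl come from the standard library. Note that this reports the URL the server finally answered for (http://www.cnn.com/ in the wget log above), not a file name such as index.html.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical helper: remember every redirect that gets followed."""

    def __init__(self):
        # List of (HTTP status code, target URL) pairs, in order.
        self.redirects = []

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Record the hop, then let the standard handler build the new request.
        self.redirects.append((code, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch_with_redirects(url):
    """Open url and return (final URL, list of recorded redirects)."""
    handler = RecordingRedirectHandler()
    # Passing our subclass replaces the default redirect handler.
    opener = urllib.request.build_opener(handler)
    with opener.open(url) as response:
        return response.geturl(), handler.redirects
```

Calling fetch_with_redirects("http://cnn.com/") needs network access, and what it returns depends on how the site is configured at the time.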
Oct 27 '08 #2
Hi!
> Can I do this in Python?
No.
The "default page" is a property of the web server; it is not
determined on the client side.
Examples:
for Apache, it's typically index.html or index.htm; but if PHP is
installed, index.php is also possible.
for APS, it's init.htm (among other possibilities).
etc.

@-salutations
--
Michel Claveau
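
For comparison, the index.html in the wget log above is the local file name wget falls back to when the URL path names no file, so that name is chosen on the client and says nothing about which file the server actually served. A rough sketch of that naming rule in Python 3 follows; the function local_name_for and its default argument are hypothetical, and wget's real logic handles more cases than this.

```python
import posixpath
from urllib.parse import unquote, urlsplit


def local_name_for(url, default="index.html"):
    """Pick a local file name for url, roughly the way wget does.

    Hypothetical helper: uses the last path component of the URL and
    falls back to `default` when the path is empty or ends in a slash.
    """
    path = urlsplit(url).path            # drops scheme, host, query, fragment
    name = posixpath.basename(unquote(path))
    return name or default


print(local_name_for("http://www.cnn.com/"))                    # index.html
print(local_name_for("http://example.com/docs/page.html?x=1"))  # page.html
```

The fallback name is only a client-side convention; the server may have produced the response from index.php, a template, or no file at all.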

Oct 27 '08 #3
