Finding Default Page Name using urllib2

barrett

Is there a way to find the name of the page you are retrieving using
Python? For example, if I get http://www.cnn.com/, I want to know that
the page is index.html. I can do this using wget, as seen in the output
below. Can I do this in Python?

Thanks,

$ wget cnn.com
--11:15:25-- http://cnn.com/
=> `index.html'
Resolving cnn.com... 157.166.226.25, 157.166.226.26,
157.166.224.25, ...
Connecting to cnn.com|157.166.226.25|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.cnn.com/ [following]
--11:15:25-- http://www.cnn.com/
=> `index.html'
Resolving www.cnn.com... 157.166.224.25, 157.166.224.26,
157.166.226.25, ...
Reusing existing connection to cnn.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 96,094 (94K) [text/html]

100%[====================================>] 96,094 68.15K/s

11:15:28 (67.99 KB/s) - `index.html' saved [96094/96094]

Oct 27 '08 #1

Philip Semanchuk

On Oct 27, 2008, at 12:17 PM, barrett wrote:

Is there a way to find the name of the page you are retrieving using
Python? For example, if I get http://www.cnn.com/, I want to know that
the page is index.html. I can do this using wget, as seen in the output
below. Can I do this in Python?

Hi barrett,
Look into the urllib2 module and specifically HTTPRedirectHandler
objects.
Good luck
Philip
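
For what it's worth, here is a minimal sketch of that suggestion. It is written against Python 3's urllib.request, the successor to urllib2, and the names RecordingRedirectHandler and fetch_with_redirects are made up for illustration. It records each redirect hop (such as the 301 from cnn.com to www.cnn.com in the wget output) and returns the final URL. Note that it reports URLs only, not a server-side file name.

```python
import urllib.request


class RecordingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Hypothetical helper: remember every redirect hop before following it."""

    def __init__(self):
        super().__init__()
        self.hops = []  # list of (status_code, from_url, to_url)

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        # Record the hop, then let the stock handler build the follow-up request.
        self.hops.append((code, req.full_url, newurl))
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def fetch_with_redirects(url):
    """Open url and return (hops, final_url). Requires network access."""
    handler = RecordingRedirectHandler()
    opener = urllib.request.build_opener(handler)
    with opener.open(url) as response:
        # geturl() gives the URL that was finally retrieved, after redirects.
        return handler.hops, response.geturl()
```

Calling fetch_with_redirects("http://cnn.com/") should return the list of hops plus the URL that finally answered, but nothing in the response says which file the server used to build the page.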

Oct 27 '08 #2

Méta-MCI $MVP$

Hi!

Can I do this in python?

No.
The "default page" is a property of the web server; it is not
visible on the client side.
Examples:
for Apache, it's index.html or index.htm; but if PHP is installed,
index.php is also possible.
for APS, it's init.htm (among other possibilities).
etc.

@-salutations
--
Michel Claveau
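
To illustrate the point: the index.html in the wget output is the local file name wget falls back to when the URL path ends in a slash, not a name reported by the server. A rough sketch of the same client-side rule in Python follows. The function name local_filename is hypothetical, and the "index.html" default is only a convention borrowed from wget.

```python
import posixpath
from urllib.parse import urlsplit


def local_filename(url, default="index.html"):
    """Pick a local file name for url the way a download tool might.

    This is a client-side guess only; it says nothing about which
    file the server actually used to produce the page.
    """
    path = urlsplit(url).path          # "/" for http://www.cnn.com/
    name = posixpath.basename(path)    # "" when the path ends in a slash
    return name or default


print(local_filename("http://www.cnn.com/"))
print(local_filename("http://example.com/docs/page.html"))
```

The first call falls back to the default because the path has no file component; the second takes the last path segment as the name.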

Oct 27 '08 #3
