Is there a way to find the name of a page you are retrieving using
Python? For example, if I get http://www.cnn.com/ I want to know that
the page is index.html. I can do this using wget, as seen in the output
below. Can I do this in Python?
Thanks,
$ wget cnn.com
--11:15:25-- http://cnn.com/
=> `index.html'
Resolving cnn.com... 157.166.226.25, 157.166.226.26,
157.166.224.25, ...
Connecting to cnn.com|157.166.226.25|:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://www.cnn.com/ [following]
--11:15:25-- http://www.cnn.com/
=> `index.html'
Resolving www.cnn.com... 157.166.224.25, 157.166.224.26,
157.166.226.25, ...
Reusing existing connection to cnn.com:80.
HTTP request sent, awaiting response... 200 OK
Length: 96,094 (94K) [text/html]
100%[====================================>] 96,094 68.15K/s
11:15:28 (67.99 KB/s) - `index.html' saved [96094/96094]
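
One possible approach, as a rough sketch rather than a definitive answer: the server does not report a file name for a URL ending in "/", and "index.html" is the default name wget picks when the URL has no file component. The Python 3 code below mimics that by following redirects with urllib.request.urlopen, reading the final URL with geturl(), and taking the last path segment. The helper names page_name and fetch_page_name are made up for illustration, and the "index.html" fallback is an assumption copied from wget's behaviour. Query strings are ignored here.

```python
import posixpath
from urllib.parse import urlsplit
from urllib.request import urlopen


def page_name(url, default="index.html"):
    """Return the last path segment of url, or default if there is none.

    The default mirrors wget's fallback name; it is an assumption,
    not something the server sends.
    """
    path = urlsplit(url).path          # "/" for http://www.cnn.com/
    name = posixpath.basename(path)    # "" when the path is empty or ends in "/"
    return name or default


def fetch_page_name(url):
    """Open url, follow any redirects, and name the page after the final URL."""
    with urlopen(url) as response:
        # geturl() gives the URL actually retrieved, after redirects
        # such as the 301 from cnn.com to www.cnn.com shown above.
        return page_name(response.geturl())
```

Calling fetch_page_name("http://cnn.com") needs network access. page_name can be tried offline on any URL string.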