On Aug 3, 7:50 pm, Coogan <pcb2...@columbia.edu> wrote:
Hi--
I'm using Python for the first time to make a plug-in for Firefox.
The goal of this plug-in is to take the source code from a website
and use the metadata and body text for different kinds of analysis.
My question is: How can I retrieve data from a website? I'm not even
sure if this is possible through Python. Any help?
nieu
How about this? It will fetch the HTML source of the page.
import sys
from urllib import urlencode
from urllib2 import build_opener, HTTPCookieProcessor, Request

# Shared opener that keeps cookies between requests.
opener = build_opener(HTTPCookieProcessor)

def urlopen2(url, data=None, user_agent='urlopen2'):
    """Open a URL; if data is given, it is sent as a POST body."""
    # A dict or a sequence of pairs is form-encoded first.
    if hasattr(data, "__iter__"):
        data = urlencode(data)
    headers = {'User-Agent': user_agent}
    return opener.open(Request(url, data, headers))

### TEST CASES START HERE ###

def publishedNotes():
    # No data argument, so this is a plain GET request.
    page = urlopen2("http://www.yourURL.com")
    pageRead = page.read()
    print pageRead

if __name__ == '__main__':
    publishedNotes()
    sys.exit()
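
If you are on Python 3, where urllib2 no longer exists, roughly the same idea might look like the sketch below. The names build_request and fetch_html, the 10-second timeout, and the UTF-8 fallback are my own assumptions for illustration and are not part of the code above. Building the request is split out so you can inspect it without touching the network.

```python
from http.cookiejar import CookieJar
from urllib.parse import urlencode
from urllib.request import HTTPCookieProcessor, Request, build_opener

# One shared opener so cookies persist across calls.
opener = build_opener(HTTPCookieProcessor(CookieJar()))


def build_request(url, data=None, user_agent="urlopen2"):
    """Build a Request; a dict or sequence of pairs becomes a POST body."""
    # Form-encode anything that is not already a string or bytes.
    if data is not None and not isinstance(data, (bytes, str)):
        data = urlencode(data)
    # urllib.request expects the body as bytes.
    if isinstance(data, str):
        data = data.encode("ascii")
    return Request(url, data, {"User-Agent": user_agent})


def fetch_html(url, data=None, user_agent="urlopen2", timeout=10):
    """Return the page source as text (this performs a network call)."""
    request = build_request(url, data, user_agent)
    with opener.open(request, timeout=timeout) as page:
        # Fall back to UTF-8 if the server does not declare a charset.
        charset = page.headers.get_content_charset() or "utf-8"
        return page.read().decode(charset, errors="replace")
```

Calling fetch_html("http://www.yourURL.com") should then give you the page source as a string, which you can hand to whatever does your metadata and body text analysis.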