ne*****@xit.net writes:
I have a small app that needs to get information from a few tables
on different websites. I have looked at urllib and httplib. The sites
I need mostly present this data in tables, which I think should make
it easier. Can anyone suggest a good starting point for learning how
to do this, or point me to a good how-to?
I don't have a link to a how-to, but you're halfway there. urllib (and
urllib2) will get the HTML text from the websites. Pulling data out of
it depends on the nature of the HTML. If it's well-structured XHTML,
you can use your favorite XML library. If it's well-structured HTML,
you can try htmllib, but it's pretty primitive. If it's not
well-structured, you can use BeautifulSoup. I've used it to pull data
from tables. The problem with any of these is that your code really
depends on the structure, or lack thereof, of the HTML you're
scraping. If they change it, your code breaks.
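
To make the table-pulling step concrete, here is a rough sketch. It is
written for Python 3, where htmllib no longer exists, so it swaps in the
standard library's html.parser instead; that substitution is mine.
TableExtractor, extract_tables and SAMPLE are made-up names for
illustration, and the sample HTML is invented. In practice the text
would come from urllib (urllib.request.urlopen in Python 3) rather than
from a string.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by row and table."""

    def __init__(self):
        super().__init__()
        self.tables = []   # one entry per <table>; each is a list of rows
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text fragments of the cell being read, or None

    def _close_cell(self):
        # Finish the current cell, if any, and add it to the current row.
        if self._cell is not None and self._row is not None:
            self._row.append("".join(self._cell).strip())
        self._cell = None

    def _close_row(self):
        # Finish the current row, if any, and add it to the latest table.
        self._close_cell()
        if self._row is not None and self.tables:
            self.tables[-1].append(self._row)
        self._row = None

    def handle_starttag(self, tag, attrs):
        if tag == "table":
            self.tables.append([])
        elif tag == "tr" and self.tables:
            self._close_row()  # tolerate a missing </tr>
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._close_cell()  # tolerate a missing </td>
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._close_cell()
        elif tag in ("tr", "table"):
            self._close_row()


def extract_tables(html):
    """Return a list of tables, each a list of rows of cell strings."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.tables


# Invented sample page; a real one would be fetched with urllib.
SAMPLE = """
<html><body>
<table>
  <tr><th>Name</th><th>Price</th></tr>
  <tr><td>Widget</td><td>4.50</td></tr>
  <tr><td>Gadget</td><td>12.00</td></tr>
</table>
</body></html>
"""

if __name__ == "__main__":
    for row in extract_tables(SAMPLE)[0]:
        print(row)
```

This does not handle nested tables properly, and it assumes the data
you want really lives in td/th cells. If the site rearranges its
markup, the rows you get back will change or vanish, which is exactly
the fragility described above.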
<mike
--
Mike Meyer <mw*@mired.org>
http://www.mired.org/home/mwm/
Independent WWW/Perforce/FreeBSD/Unix consultant, email for more information.