any suggestions for URL cataloging project?

I've just come up with an idea to make a small-time record of web
pages linking to other web pages. I don't want to download every page
on the internet (I'll leave Google to do that). I just want to know if
anyone has any suggestions on how to acquire just the links from a web
page using Python. This is for a cataloging purpose. Is there some
library or script out there that I haven't heard of?
Jul 18 '05 #1

Matthew K Jensen <ma**********@gmail.com> wrote:
> I've just come up with an idea to make a small-time record of web
> pages linking to other web pages. I don't want to download every page
> on the internet (I'll leave Google to do that). I just want to know if
> anyone has any suggestions on how to acquire just the links from a web
> page using Python. This is for a cataloging purpose. Is there some
> library or script out there that I haven't heard of?


Check out Tools/webchecker/ -- the Tools directory is part of Python's
source distribution and should also come with most prepackaged Python
distributions, I believe.
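
For a sense of what pulling just the links involves, here is a minimal sketch using only the standard library's `html.parser` and `urllib.parse`. It is not webchecker's own code; the names `LinkCollector` and `extract_links` are made up for illustration, and it assumes you already have the page's HTML as a string.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag, resolved against a base URL."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser hands us lowercased tag and attribute names.
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Turn relative links into absolute ones for the catalog.
                self.links.append(urljoin(self.base_url, value))


def extract_links(html, base_url):
    """Return the absolute URLs of all <a href=...> links in html (hypothetical helper)."""
    parser = LinkCollector(base_url)
    parser.feed(html)
    parser.close()
    return parser.links
```

To catalog a live page you would fetch its HTML first (for example with `urllib.request.urlopen`) and pass the decoded text plus the page's own URL to `extract_links`.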
Alex
Jul 18 '05 #2

"Matthew K Jensen" <ma**********@gmail.com> wrote in message
news:a8**************************@posting.google.com...
> I've just come up with an idea to make a small-time record of web
> pages linking to other web pages. I don't want to download every page
> on the internet (I'll leave Google to do that). I just want to know if
> anyone has any suggestions on how to acquire just the links from a web
> page using Python. This is for a cataloging purpose. Is there some
> library or script out there that I haven't heard of?


One of the examples that comes with pyparsing is urlextractor.py. Point it
at a web page and it lists the URLs and their linked text.

Download pyparsing at http://pyparsing.sourceforge.net.
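
If you only want to see the shape of the output (each URL paired with its linked text), the same idea can be roughed out with the standard library alone. This is a stand-in sketch, not the pyparsing example itself; `LinkTextCollector` and `links_with_text` are hypothetical names, and nested `<a>` tags are assumed not to occur.

```python
from html.parser import HTMLParser


class LinkTextCollector(HTMLParser):
    """Collect (href, link text) pairs for every <a href=...> element."""

    def __init__(self):
        super().__init__()
        self.pairs = []      # finished (url, text) tuples
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        # Only keep text that appears while a link is open.
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            # Collapse runs of whitespace so multi-line link text stays tidy.
            text = " ".join("".join(self._text).split())
            self.pairs.append((self._href, text))
            self._href = None


def links_with_text(html):
    """Return a list of (url, linked text) tuples found in html (hypothetical helper)."""
    parser = LinkTextCollector()
    parser.feed(html)
    parser.close()
    return parser.pairs
```

Text inside nested markup such as `<b>` is kept as part of the link text, since only the closing `</a>` ends a pair.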

-- Paul
Jul 18 '05 #3