Library for crawling forums

I'm trying to write a utility that crawls forums and extracts the posts
so they can be read offline. I only need the post content; I don't need
the poster's name, sigs, or any other identifying info.

Can anyone suggest a library that is already geared toward this?

Oct 11 '07 #1
BlueCrux wrote:
> I'm trying to write a utility that crawls forums and extracts the posts
> so they can be read offline. I only need the post content; I don't need
> the poster's name, sigs, or any other identifying info.
>
> Can anyone suggest a library that is already geared toward this?

Maybe a combination of mechanize [1] and BeautifulSoup [2]?

[1] http://wwwsearch.sourceforge.net/mechanize/
[2] http://www.crummy.com/software/BeautifulSoup/
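
To make the extraction step concrete, here is a rough sketch using only the standard library's html.parser as a stand-in for BeautifulSoup; it leaves out the fetching and navigation that mechanize would handle. The class names "post-body" and "signature" and the sample markup are assumptions for illustration, since every forum package uses its own HTML.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class PostExtractor(HTMLParser):
    """Collect the text of each post body, skipping nested signature blocks.

    POST_CLASS and SIG_CLASS are hypothetical; adjust them to the forum's markup.
    """

    POST_CLASS = "post-body"
    SIG_CLASS = "signature"

    def __init__(self):
        super().__init__()
        self.posts = []       # finished post texts
        self._chunks = []     # text pieces of the post being read
        self._depth = 0       # nesting depth inside the current post, 0 = outside
        self._skip_depth = 0  # nesting depth inside a signature, 0 = not skipping

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        classes = (dict(attrs).get("class") or "").split()
        if self._skip_depth:
            self._skip_depth += 1
        elif self._depth:
            if self.SIG_CLASS in classes:
                self._skip_depth = 1
            else:
                self._depth += 1
        elif self.POST_CLASS in classes:
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self._skip_depth:
            self._skip_depth -= 1
        elif self._depth:
            self._depth -= 1
            if self._depth == 0:
                # Join the pieces with spaces and collapse runs of whitespace.
                self.posts.append(" ".join(" ".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth and not self._skip_depth:
            self._chunks.append(data)


def extract_posts(html):
    """Return a list with the plain text of every post body found in html."""
    parser = PostExtractor()
    parser.feed(html)
    parser.close()
    return parser.posts


# Made-up page fragment: author names sit outside the body, sigs inside it.
sample = """
<div class="post"><span class="author">alice</span>
  <div class="post-body">First <b>post</b> text.<br>
    <div class="signature">-- <i>alice</i></div>
  </div>
</div>
<div class="post"><div class="post-body">Second post.</div></div>
"""

print(extract_posts(sample))
```

Author names are dropped simply because they sit outside the post body in this sample, and signatures are skipped by class name. On real pages the markup is often malformed, which is where a forgiving parser such as BeautifulSoup is likely the better tool.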
--
Thomas Wittek
Web: http://gedankenkonstrukt.de/
Jabber: st*********@jabber.i-pobox.net
GPG: 0xF534E231
Oct 11 '07 #2
