Library for crawling forums

I'm trying to write a utility to crawl forums and strip posts to be
gone through offline. Just the content, I don't need to get who posted
or sigs or any identifying info.

Can anyone suggest a library that is already geared toward this?

Oct 11 '07 #1
BlueCrux:
> I'm trying to write a utility to crawl forums and strip posts to be
> gone through offline. Just the content, I don't need to get who posted
> or sigs or any identifying info.
>
> Can anyone suggest a library that is already geared toward this?

Maybe a combination of mechanize [1] and BeautifulSoup [2]?

[1] http://wwwsearch.sourceforge.net/mechanize/
[2] http://www.crummy.com/software/BeautifulSoup/
--
Thomas Wittek
Web: http://gedankenkonstrukt.de/
Jabber: st*********@jabber.i-pobox.net
GPG: 0xF534E231
Oct 11 '07 #2
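
For anyone landing here later, a rough sketch of the "fetch the page, keep only the post bodies" idea. To keep it self-contained it uses only the Python standard library (urllib and html.parser) as a stand-in for the mechanize + BeautifulSoup combination suggested above; it is not how either of those libraries works. The class name "post-content", the User-Agent string and the helper names fetch, PostExtractor and extract_posts are all made up for illustration, so check the markup of the forum you are actually crawling. It also assumes reasonably well-formed HTML; a tolerant parser such as BeautifulSoup copes better with messy pages.

```python
from html.parser import HTMLParser
from urllib.request import Request, urlopen

# Tags that never have a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def fetch(url, user_agent="forum-archiver-sketch/0.1"):
    """Download one page and return it as text (user agent is a placeholder)."""
    req = Request(url, headers={"User-Agent": user_agent})
    with urlopen(req, timeout=30) as resp:
        charset = resp.headers.get_content_charset() or "utf-8"
        return resp.read().decode(charset, errors="replace")


class PostExtractor(HTMLParser):
    """Collect the text of every element that carries a given CSS class."""

    def __init__(self, post_class="post-content"):  # hypothetical class name
        super().__init__(convert_charrefs=True)
        self.post_class = post_class
        self.posts = []     # finished post texts, in page order
        self._depth = 0     # nesting depth inside the current post; 0 = outside
        self._chunks = []   # text pieces of the post being read

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            if tag == "br" and self._depth:
                self._chunks.append("\n")  # keep words on either side apart
            return
        if self._depth:
            self._depth += 1
        elif self.post_class in (dict(attrs).get("class") or "").split():
            self._depth = 1
            self._chunks = []

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or not self._depth:
            return
        self._depth -= 1
        if self._depth == 0:
            # Post element closed: collapse whitespace and store the text.
            self.posts.append(" ".join("".join(self._chunks).split()))

    def handle_data(self, data):
        if self._depth:
            self._chunks.append(data)


def extract_posts(html, post_class="post-content"):
    """Return the text of each post body; author and sig elements are skipped."""
    parser = PostExtractor(post_class)
    parser.feed(html)
    parser.close()
    return parser.posts


if __name__ == "__main__":
    sample = (
        '<div class="post"><span class="author">alice</span>'
        '<div class="post-content">First <b>post</b> body.</div>'
        '<div class="sig">-- alice</div></div>'
    )
    for text in extract_posts(sample):
        print(text)
```

On a live forum you would pass the result of fetch(url) to extract_posts. Because only elements with the chosen class are collected, author names and signatures that live in sibling elements are left out, which matches the "just the content" requirement in the question.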


