summarize text

hello list,

does anyone know of a library which permits to summarise text? i've
been looking at nltk but haven't found anything yet. any help would be
very welcome.
thank you all in advance,

robin

May 29 '06 #1
> does anyone know of a library which permits to summarise text?
> i've been looking at nltk but haven't found anything yet. any
> help would be very welcome.


Well, summarizing text is one of those things that generally
takes a brain-cell or two to do. Automating the process would
require doing it either smartly (some sort of
neural-net/NLP/Markov-chain technology, which is a non-trivial
task--something one might consider braving in the 3rd or 4th-year
of a university computer-science program), or doing it fairly
dumbly. As an example of a "dumb" solution, you can use regexps
to trim off the first few words and the last few words and call
that a "summary":

>>> import re
>>> r = re.compile(r'^(.{8}.*?\b)\s.*\s(\b.{8}.*?)', re.DOTALL)
>>> s = """This is the first line
... and it has a second line
... and a third line
... and the last line is the fourth line."""
>>> result = r.sub(r"\1...\2", s.strip())
>>> result
'This is the...fourth line.'

You can adjust the "{8}" portions for more or less
leader/trailing context characters.

The regexp might need a bit of tweaking for somewhat short
strings, but if they're fairly short, one might not need to
summarize them ;)
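
As a rough sketch of that tweaking (not part of the original reply): the helper below wraps the same regexp so that the two "{8}" counts become arguments, and it hands short strings back unchanged when the pattern cannot match. The name head_tail_summary and the parameters lead and trail are made up for illustration.

```python
import re

def head_tail_summary(text, lead=8, trail=8):
    """Keep the leading and trailing words of text, joined by '...'.

    Hypothetical helper: `lead` and `trail` replace the two hard-coded
    {8} counts in the regexp from the post above.
    """
    text = text.strip()
    # Same pattern as in the post, with the character counts filled in.
    pattern = re.compile(
        r'^(.{%d}.*?\b)\s.*\s(\b.{%d}.*?)' % (lead, trail),
        re.DOTALL,
    )
    # Short strings do not match at all; return them untouched.
    if pattern.match(text) is None:
        return text
    # Only the matched prefix is rewritten; whatever follows the
    # trailing group stays in place, which completes the last word.
    return pattern.sub(r'\1...\2', text, count=1)

s = """This is the first line
and it has a second line
and a third line
and the last line is the fourth line."""

print(head_tail_summary(s))
print(head_tail_summary(s, lead=4, trail=4))
print(head_tail_summary("too short"))
```

Smaller values of lead and trail keep less context on each side, and a string too short to match comes back as-is.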

-tkc


May 29 '06 #2

robin wrote:
> hello list,
>
> does anyone know of a library which permits to summarise text? i've
> been looking at nltk but haven't found anything yet. any help would be

unclear what you're asking, maybe look at:
http://www.cs.waikato.ac.nz/~ml/weka/index.html
http://www.kdnuggets.com/software/suites.html
http://www.ailab.si/orange
http://mallet.cs.umass.edu/index.php/Main_Page
http://minorthird.sourceforge.net/
http://www.dia.uniroma3.it/db/roadRunner/
http://www.lemurproject.org/

May 29 '06 #3
thanks for all your replies. lemur looks pretty interesting!
robin

gene tani wrote:
> robin wrote:
>> hello list,
>>
>> does anyone know of a library which permits to summarise text? i've
>> been looking at nltk but haven't found anything yet. any help would be
>
> unclear what you're asking, maybe look at:
> http://www.cs.waikato.ac.nz/~ml/weka/index.html
> http://www.kdnuggets.com/software/suites.html
> http://www.ailab.si/orange
> http://mallet.cs.umass.edu/index.php/Main_Page
> http://minorthird.sourceforge.net/
> http://www.dia.uniroma3.it/db/roadRunner/
> http://www.lemurproject.org/


May 31 '06 #4
.... sorry, I thought you said "summarize Proust".

:)
Jun 5 '06 #5
