Hello Aaron,
Is there a public program/algorithm that can tell me the key points of a
text?
For example I entered the following text:
Web logs, or blogs, the online personal diaries where big names and no
names expound on everything from pets to presidents, are going mainstream.
While still a relatively small piece of total online activity, blogging
has caught on with affluent young adults. As Forrester Research analysts
recently noted, blogging will become increasingly common as these
consumers age.
The program should give me the main keywords such as: blog, online,
people...
Fascinating. Simple keyword counting produces nearly nothing. Ignoring
common words such as "on" and "as," the only words that occur more than
once are "blogging," "names," and "online." The word "people" that you
include in your list of keywords doesn't occur in the paragraph at all.
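
To see how little plain counting gives you, here is a minimal sketch in
Python. It is not a real keyword extractor: the function name
repeated_keywords and the small hand-picked stopword list are assumptions
made for illustration. Run on the sample paragraph, no word scores higher
than 2, which is too flat to rank anything.

```python
import re
from collections import Counter

# Hypothetical, hand-picked stopword list; a real one would be far longer.
STOPWORDS = {
    "a", "an", "and", "are", "as", "from", "has", "no", "of", "on",
    "or", "the", "these", "to", "where", "while", "will", "with",
}

# The sample paragraph from the question.
SAMPLE = (
    "Web logs, or blogs, the online personal diaries where big names and no "
    "names expound on everything from pets to presidents, are going "
    "mainstream. While still a relatively small piece of total online "
    "activity, blogging has caught on with affluent young adults. As "
    "Forrester Research analysts recently noted, blogging will become "
    "increasingly common as these consumers age."
)


def repeated_keywords(text, min_count=2):
    """Return {word: count} for non-stopwords seen at least min_count times."""
    # Lowercase and keep only runs of letters, so punctuation is dropped.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return {w: n for w, n in counts.items() if n >= min_count}


if __name__ == "__main__":
    print(repeated_keywords(SAMPLE))
```

Note that "blog" never shows up, because plain counting treats "blogs" and
"blogging" as unrelated words, and "people" cannot show up because it is
not in the text.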
You would need an algorithm that builds a contextual map through a lexical
tree and produces, effectively, an "understanding" of the key concept of the
paragraph. In other words, you are entering the field of Computational
Linguistics.
There is some fascinating research on Natural Language Processing that began
in the late 80s (and continues today) that addresses many of these ideas.
I'm sure that some of the current "search" research has raised interest
further. Microsoft Research, IBM Research, and others are very much
interested in these areas.
One example would be the Text Mining project at IBM:
http://www.trl.ibm.com/projects/textmining/index_e.htm
A good link for coding systems that follow some of these practices is here:
http://www.cl.cam.ac.uk/Research/NL/anlt.html
There is WAY too much involved, morphologically, lexically, and
linguistically, to demonstrate even the simplest of these algorithms in a
newsgroup message. Start at your local college library and/or search Google
for "Natural Language Processing," and go from there.
--
--- Nick Malik [Microsoft]
MCSD, CFPS, Certified Scrummaster
http://blogs.msdn.com/nickmalik
Disclaimer: Opinions expressed in this forum are my own, and not
representative of my employer.
I do not answer questions on behalf of my employer. I'm just a
programmer helping programmers.
--