brock@gunter-smith.com wrote:
Quote:
I'd like to be able to take a string and search within it for all words
(of the longest length possible) that are possibly contained within it
(in sequence, we're not re-ordering the letters in the string).
Obviously the brute force approach (which may be the only solution) is
to iterate through a dictionary file searching for occurances of each
entry within the string.
It sounds like you are after a LCS (Longest Common Subsequence)
implementation. Just google for "longest common subsequence" and you'll
get a thousand ways to do it. Wikipedia has one that seems to work
well:
http://en.wikipedia.org/wiki/Longest...quence_problem
LCS is used in diff algorithms.
Quote:
If anyone has done anything similar to this, were there any other
methods used to reduce the number of iterations required like using a
list of common words that are not generally elements of other words
that can be quickly broken out from the string? Or are there libraries
that may be of use in efficiently processing this type of search?
>
e.g. given the string "themeether", possible solutions might be
{'the','meet','her'} or {'theme','ether'}
>