I'd like to be able to take a string and search within it for all words
(of the longest length possible) that are possibly contained within it
(in sequence, we're not re-ordering the letters in the string).
Obviously the brute force approach (which may be the only solution) is
to iterate through a dictionary file searching for occurances of each
entry within the string.
If anyone has done anything similar to this, were there any other
methods used to reduce the number of iterations required like using a
list of common words that are not generally elements of other words
that can be quickly broken out from the string? Or are there libraries
that may be of use in efficiently processing this type of search?
e.g. given the string "themeether", possible solutions might be
{'the','meet','her'} or {'theme','ether'} 2 1789 br***@gunter-smith.com wrote:
I'd like to be able to take a string and search within it for all words
(of the longest length possible) that are possibly contained within it
(in sequence, we're not re-ordering the letters in the string).
Obviously the brute force approach (which may be the only solution) is
to iterate through a dictionary file searching for occurances of each
entry within the string.
It sounds like you are after a LCS (Longest Common Subsequence)
implementation. Just google for "longest common subsequence" and you'll
get a thousand ways to do it. Wikipedia has one that seems to work
well: http://en.wikipedia.org/wiki/Longest...quence_problem
LCS is used in diff algorithms.
If anyone has done anything similar to this, were there any other
methods used to reduce the number of iterations required like using a
list of common words that are not generally elements of other words
that can be quickly broken out from the string? Or are there libraries
that may be of use in efficiently processing this type of search?
e.g. given the string "themeether", possible solutions might be
{'the','meet','her'} or {'theme','ether'} br***@gunter-smith.com wrote:
I'd like to be able to take a string and search within it for all words
(of the longest length possible) that are possibly contained within it
(in sequence, we're not re-ordering the letters in the string).
Obviously the brute force approach (which may be the only solution) is
to iterate through a dictionary file searching for occurances of each
entry within the string.
If anyone has done anything similar to this, were there any other
methods used to reduce the number of iterations required like using a
list of common words that are not generally elements of other words
that can be quickly broken out from the string? Or are there libraries
that may be of use in efficiently processing this type of search?
e.g. given the string "themeether", possible solutions might be
{'the','meet','her'} or {'theme','ether'}
That's very similiar to the Thai word-breaking problem. Is that what
you're trying to do, in fact? There's a link listing some of the
approaches: http://www.fi.muni.cz/~xantos/poster/#x1-3000 This thread has been closed and replies have been disabled. Please start a new discussion. Similar topics
by: John van Terheijden |
last post by:
Hi.
I'm trying to make a conversion algorithm that colors even and odd words in
a HTML string with <div> tags.
// input text with all sorts of HTML tags and whitespace
$str = "<h1>Title...
|
by: Jack Smith |
last post by:
Hello, any help appreciated with following problem. I figured out the
algorithm (I think), just having trouble proving it is optimal.
Suppose we are given n tasks each of which takes 1 unit...
|
by: m sergei |
last post by:
I am not asking for code but wanted help with understanding the
algorithm to permute all characters of a string.
say string is "ABCD"
I want to know the algorithm for finding all permutations of...
|
by: Paolino |
last post by:
I have a self organizing net which aim is clustering words.
Let's think the clustering is about their 2-grams set.
Words then are instances of this class.
class clusterable(str):
def...
|
by: JohnR |
last post by:
From reading the documentation, this should be a relatively easy thing. I
have an arraylist of custom class instances which I want to search with
an"indexof" where I'm passing an instance if the...
|
by: hn.ft.pris |
last post by:
"Forward maximum match" mean that there is a dictionary contains
thousands of items which are single words. For example, "kid", "nice",
"best"...
And here get a string "kidnicebs", the forward...
|
by: sklett |
last post by:
I realize this could be a little off-topic, but there are some great minds
on this NG and I hope you can let me slide this time ;0)
I'm designing our system to manage what products can fit in...
|
by: prometheuzz |
last post by:
Hello (Java) enthusiasts,
In this article I’d like to tell you a little bit about graphs and how you can
search a graph using the BFS (breadth first search) algorithm.
I’ll address, and...
|
by: pj |
last post by:
Hi,
I 'm currently writing a program that performs transliteration (i.e.,
converts greek text written using the english alphabet to "pure" greek
text using the greek alphabet) as part of my...
|
by: Charles Arthur |
last post by:
How do i turn on java script on a villaon, callus and itel keypad mobile phone
|
by: ryjfgjl |
last post by:
In our work, we often receive Excel tables with data in the same format. If we want to analyze these data, it can be difficult to analyze them because the data is spread across multiple Excel files...
|
by: emmanuelkatto |
last post by:
Hi All, I am Emmanuel katto from Uganda. I want to ask what challenges you've faced while migrating a website to cloud.
Please let me know.
Thanks!
Emmanuel
|
by: BarryA |
last post by:
What are the essential steps and strategies outlined in the Data Structures and Algorithms (DSA) roadmap for aspiring data scientists? How can individuals effectively utilize this roadmap to progress...
|
by: marktang |
last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
|
by: Hystou |
last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
|
by: Oralloy |
last post by:
Hello folks,
I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>".
The problem is that using the GNU compilers,...
|
by: jinu1996 |
last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
|
by: tracyyun |
last post by:
Dear forum friends,
With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
| |