473,509 Members | 3,543 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

words within a word

19 New Member
Hi all. I've been given an assignment to construct an algorithm that can find all words that occur within a given word. For example:

Given furthermore, solutions could be
form,
the,
further,
more,
theme
...and so on.

I thought about using the STL next_permutation() function, but it only finds permutation with the same amount of letters. Searching the dictionary for correct words isn't hard, I just can't start to figure out how to come up with all of the words. Any suggestions?
Sep 9 '08 #1
6 3295
newb16
687 Contributor
something like this
Expand|Select|Wrap|Line Numbers
  1. foreach( word in dictionary)
  2. {
  3.  foreach ( letter in [A..Z] )
  4.  {
  5.   if( letter_count(word, letter)>letter_count(original_word, letter) )
  6.    next word;
  7.  }
  8.  print word;
  9.  
ie if each letter occurs the same or less times in dictionay word that original word, it can be constructed from the letters of original word.
Calculation of letter_count(original_word,letter) probably should be moved outside the loop because is has the same value for any word_in_dictionary.
Sep 10 '08 #2
toefraz
19 New Member
I looked at that algorithm, but it requires looking at each word in the dictionary, and then seeing if it can be made from the given word, and is a very slow algorithm (takes about 15seconds on my core 2 duo 2ghz!) I know the faster way would be to find all combinations of the given word and matching them against the dictionary because then you wouldn't have to check every single dictionary entry. I just can't figure out how to do the combinatorial algebra to divine each combination.
Sep 12 '08 #3
Banfa
9,065 Recognized Expert Moderator Expert
Try one on paper, for instance try finding the all the combinations available from 5 letters (A, B, C, D, E).

Do it by the number of combinations for each result length so find the number of combinations of length 1, then length 2 and 3, 4 and 5.

Look at how you did it and then try to duplicate it on your computer as an algorithm.

Be aware that if you have 2 letters the same in the initial word you risk producing some duplicate results. That is if the word is "free" and you are picking 2 letter combinations then there are 2 ways of picking both "fe" and "re".

You could also try look up the maths for combinations which is binomial coefficients.
Sep 12 '08 #4
arnaudk
424 Contributor
Once you have the various permutations, you'll need to check if they're actual words. For an efficient way of doing this, check out this thread.
Sep 12 '08 #5
donbock
2,426 Recognized Expert Top Contributor
I looked at that algorithm, but it requires looking at each word in the dictionary, and then seeing if it can be made from the given word, and is a very slow algorithm (takes about 15seconds on my core 2 duo 2ghz!) I know the faster way would be to find all combinations of the given word and matching them against the dictionary because then you wouldn't have to check every single dictionary entry. I just can't figure out how to do the combinatorial algebra to divine each combination.
You may be disappointed by that strategy. Take your example: "furthermore". Your strategy requires you to first compose a list of all the 1-letter sequences, 2-letter sequences, ... and finally the 11-letter sequences that can be formed from those 11 letters and then check each against your dictionary to see if it is a word.

Notice that I said "sequence" rather than your term "combination". That's because "combination" is a technical term in this context. You should look up the terms "permutation" and "combination" to see which is appropriate for your problem. If you're not already familiar with it, you should also look up the "factorial" operator.

For an 11-letter word, there are 108,303,111 different sequences to compare against your dictionary! Odds are that's a lot more terms than are in your dictionary. This number comes from trusting the combination/permutation function in Excel.

A lot of times the best way to start a coding problem is to NOT code ... but think about the strategies you might use and then find ways to compare them.

Cheers,
donbock
Sep 12 '08 #6
donbock
2,426 Recognized Expert Top Contributor
something like this
Expand|Select|Wrap|Line Numbers
  1. foreach( word in dictionary)
  2. {
  3.  foreach ( letter in [A..Z] )
  4.  {
  5.   if( letter_count(word, letter)>letter_count(original_word, letter) )
  6.    next word;
  7.  }
  8.  print word;
  9.  
ie if each letter occurs the same or less times in dictionay word that original word, it can be constructed from the letters of original word.
Calculation of letter_count(original_word,letter) probably should be moved outside the loop because is has the same value for any word_in_dictionary.
Another approach would be to sort the letters in the original word and then, for each dictionary word, compare the sorted original word against the sorted dictionary word. This is similar to Knuth's anagram algorithm in "Sorting and Searching". Notice that unlike Knuth, you are not seeking equality in this comparison. Presorting the dictionary would be a very good tactic if several input words were to be checked.

Cheers,
donbock
Sep 12 '08 #7

Sign in to post your reply or Sign up for a free account.

Similar topics

3
4313
by: TXSherry | last post by:
Hi, I cannot seem to wrap my brain around preg_replace. Though I've read the help file backwords and forwards. :/ Hoping someone can give me a solution here. Problem: Given string 'str'...
31
2905
by: Brynn | last post by:
I want to thank the person that sent the nasty messages to my contact.asp test page ... you have inspired me !!! I have now added a bad_words filter both to my JSValidation script (going up on...
12
2170
by: teoryn | last post by:
I've been spending today learning python and as an exercise I've ported a program I wrote in java that unscrambles a word. Before describing the problem, here's the code: *--beginning of file--*...
1
4352
by: none | last post by:
Hello there, I have a table with many text and varchar fields and I would like to get all distinct words from these filelds. For example: table Pet: id int(10) unsigned legende text
8
2354
by: Peter Bungart | last post by:
I have a text field in Access (XP) that lists the occurrences of various plants, which are delimited within the field by commas (e.g., cactus, mesquite, yucca, cliffrose). Is there a way to...
7
43561
by: Sling | last post by:
I code in Rexx on the mainframe which has 2 built-in functions: word(s,i) & words(s). word(s,i) returns the ith word in the s(tring), and words(s) returns the number of words within the s(tring)....
4
1649
by: MN | last post by:
Hello all - I'm really at wits end on this as I'm not having much success with locating a function or other options I have. I'm working on a message board for a website and I'm needing to check...
5
5010
by: vonclausowitz | last post by:
Repost from an VB group. Hi All, I'm looking for a way to search for multiple words in a database. There is however one but. The words have to be within a certain range of each other. For...
4
2066
by: sumedh..... | last post by:
In a compiler there are 36bits for a word and to store a character 8bits are needed. In this to store a character two words appended. Then for storing k characters string,how many words are needed?
7
2154
by: Hvid Hat | last post by:
Hello How should I go about marking certain words in a text? I've got a list of words: <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="Mark.xsl"?> <Words>...
0
7137
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
7416
jinu1996
by: jinu1996 | last post by:
In today's digital age, having a compelling online presence is paramount for businesses aiming to thrive in a competitive landscape. At the heart of this digital strategy lies an intricately woven...
1
7073
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
0
7506
tracyyun
by: tracyyun | last post by:
Dear forum friends, With the development of smart home technology, a variety of wireless communication protocols have appeared on the market, such as Zigbee, Z-Wave, Wi-Fi, Bluetooth, etc. Each...
1
5062
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
4732
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
3218
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
1571
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated ...
0
443
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.