Help | Site Map
Connecting Tech Pros Worldwide
Reply
 
LinkBack Thread Tools
  #1  
Old August 29th, 2008, 01:32 PM
Newbie
 
Join Date: Aug 2008
Posts: 9
Default Need help regarding my word solving software

I have made a utility in c++ which solves the jumbled words.This is done by comparing all the permutatons of the word given with the words stored in the database.The database consists of around 80000 words .The database is in the form of 26 datafiles one for each alpahbet(a-z).
My problem is that the program takes too much time as it compares each permutation of the word given with the all the words in the concerned datafile.Say, for "euque " ,it will compare it with datafile containing words starting with letter "e". So as the length of the word increases ,the time taken to solve that also increases.
For my core-2-duo 2.0 Ghz processor time taken to solve a 7-letter word is around 8 seconds and for an 8-letter word it is about 1 minute.
Kindly help me in improving the speed of my program so that it can solve at least 11 letter word in optimum time.
For more details of program , you can conatct me at
< address and phone number deleted >
Reply
  #2  
Old August 29th, 2008, 01:41 PM
Moderator
 
Join Date: Mar 2007
Location: Voorschoten, the Netherlands
Age: 52
Posts: 8,460
Default

I don't really understand your question; given a bunch of characters do you want
to see if a permutation of them makes up a word in your database?

kind regards,

Jos

ps. for your own privacy I removed your address details
Reply
  #3  
Old August 29th, 2008, 01:47 PM
Needs Regular Fix
 
Join Date: Sep 2007
Location: The Netherlands
Posts: 428
Default

The way you are doing it is probably the most inefficient way, when you look at "euque", you don't need to search through 80000 words in your head to figure out the correct spelling is queue. There are many efficient text matching algorithms out there, googling a bit should yield something much more efficient.

As you pointed out, the first letter can change in the course of finding the correct solution. The length of the word, however, does not. How about sorting your dictionary according to word length instead of starting letter? This has the advantage that, although longer words generate more permutations, there probably are fewer long words in the English language so that will definitely speed things up. Plus you will have fewer files and you will only need to open one file for each new jumbled word.
Reply
  #4  
Old August 29th, 2008, 01:57 PM
Moderator
 
Join Date: Mar 2007
Location: Voorschoten, the Netherlands
Age: 52
Posts: 8,460
Default

I think the OP should work on his database instead. For all the words in the db
make a new db as follows: take a word from your old database and order the
characters in the word alphabetically. That new 'word' is the key; the original
word is a value that is associated with that key. Add the tuple to your new db
and repeat for all the words in your old db.

You end up with something like this (just a tiny example):

aaabnn - banana
aadt - data
adn - and dna
cdku - duck
eequu - queue
...
etc.etc.

Now when you have a permutation, permute it to a key of your database and voila.

kind regards,

Jos
Reply
  #5  
Old August 29th, 2008, 02:06 PM
Needs Regular Fix
 
Join Date: Sep 2007
Location: The Netherlands
Posts: 428
Default

Indeed, that would be extremely fast for all word lengths.
Reply
  #6  
Old August 29th, 2008, 02:38 PM
Moderator
 
Join Date: Mar 2007
Location: Voorschoten, the Netherlands
Age: 52
Posts: 8,460
Default

Quote:
Originally Posted by arnaudk
Indeed, that would be extremely fast for all word lengths.
yup, an 80,000 words isn't that much (*); everything can be done in core speeding
everything up much more again. An STL map could do the dirty work ;-)

kind regards,

Jos

(*) even though 80,000 words isn't much, the entire King James bible uses less
that 32,000 words (including conjugations etc.) and 32,000 tells me: 15 bits ;-)
Reply
Reply

Bookmarks

Thread Tools

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are Off
[IMG] code is On
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On

What is Bytes?

We are a network of experts and professionals in IT and software development that help one another with answers to tough questions and share insights. Get the best answers to your questions from over network members.
Post your question now . . .
It's fast and it's free

Popular Articles