Bytes IT Community

Build an English-French dictionary in Java

P: 2
Hi, I'm doing some research on how to develop an English-French dictionary in Java. Can anyone help me, please?
Jan 5 '07 #1
3 Replies


10K+
P: 13,264
Depends on how far you are willing to go with it. A HashMap might be a starting point for a small program. You may also want to consider storing the words in a database.
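To make the HashMap suggestion concrete, here is a minimal sketch. The class name `SimpleDictionary`, its method names and the sample words are all assumptions made for illustration, not part of any library, and it needs Java 9 or later for `List.of`.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical class; the entries added in main are sample data only.
class SimpleDictionary {
    // English word (lowercased) -> list of French translations.
    private final Map<String, List<String>> entries = new HashMap<>();

    // Store one English word together with its French translations.
    void add(String english, List<String> french) {
        entries.put(english.toLowerCase(Locale.ROOT), french);
    }

    // Return the translations, or an empty list if the word is unknown.
    List<String> translate(String english) {
        return entries.getOrDefault(english.toLowerCase(Locale.ROOT), List.of());
    }

    public static void main(String[] args) {
        SimpleDictionary dict = new SimpleDictionary();
        dict.add("yes", List.of("oui"));
        dict.add("cat", List.of("chat"));
        System.out.println(dict.translate("Yes"));
        System.out.println(dict.translate("dog"));
    }
}
```

For a larger word list the same two methods could sit in front of a database table instead of an in-memory map.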
Jan 5 '07 #2

DeMan
100+
P: 1,806
Assuming that you want to give word translations (that is, if someone types in "yes" you give "oui" and some synonyms) and that you are not, for example, trying to automate the translation of the entire Dickens collection, I would suggest something along the following lines (and yes, this would be VERY time-consuming).

We're going to use a tree structure (I think they're called suffix trees or dictionary trees depending on where in the world you are; the usual name for the structure described here is a trie or prefix tree, though, as always, I expect to be corrected if they're something else) and assume you only want eng-fra (if you want to go the other way you may have to do the same work again, unless someone has a better idea).

Build a tree where each node has 27 children (more if you are interested in hyphenated, accented or other non-standard words). The root node is unique and called (to be original) 'root' (or something equally meaningful). The general idea is that each child represents a letter (with a leaf extension making the 27th child), so that, as an example, cat would follow the path root-c-a-t-leaf. The leaf node, rather than containing children, holds a container of the translations for the word we have reached. (I know my explanations aren't always clear, so ask questions when you (or I) get confused.)

Obviously not all of the tree has to be built, so long as you have a good check such as hasChild('x') before following a branch; that way you never need to create paths like root-x-z-q-y-r-w-c-v-end.
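The node layout and the "only build what you need" idea might look roughly like this. It is a sketch under assumptions: the names `TrieNode` and `ArrayTrie` are made up, only lowercase a-z is handled, and instead of a 27th "leaf" child each node simply carries its own list of translations, which plays the same role.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical names throughout; supports lowercase a-z only.
class TrieNode {
    // One slot per letter; a slot stays null until some word needs it.
    final TrieNode[] children = new TrieNode[26];
    // Stands in for the "leaf" slot: translations of the word ending here.
    final List<String> translations = new ArrayList<>();

    // True only if a branch for this letter has already been created.
    boolean hasChild(char c) {
        return c >= 'a' && c <= 'z' && children[c - 'a'] != null;
    }
}

class ArrayTrie {
    final TrieNode root = new TrieNode();

    // Walk the word letter by letter, creating only the missing nodes.
    void insert(String english, String french) {
        TrieNode node = root;
        for (char c : english.toCharArray()) {
            if (c < 'a' || c > 'z') {
                throw new IllegalArgumentException("only a-z supported: " + english);
            }
            if (!node.hasChild(c)) {
                node.children[c - 'a'] = new TrieNode();
            }
            node = node.children[c - 'a'];
        }
        node.translations.add(french);
    }
}
```

After `insert("cat", "chat")` only the root-c-a-t path exists; every other slot in those nodes is still null.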

A further optimization (which is not quite as easy as it sounds) is to store the maximal unique suffix string, i.e. if the only word in the English dictionary beginning with aa were aardvark (and I'm not claiming it is), the path would be root-a-a-rdvark. That is, you store the remainder of the string once it is unique. Personally, I think this is unnecessary these days, when storage space is not a major issue (though I'm sure others disagree, particularly for projects on the scale of yours).

The advantage of using such a tree structure is that it makes searching quite easy.

If someone requests the word philanthropist, you know to look down through the p branch's h branch, and so on (which is why you need a good test for "node doesn't exist yet").
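The search described above could be sketched as follows. This variant keeps each node's children in a HashMap instead of a fixed array, so accented or hyphenated words need no special handling; `MapTrie` and its methods are hypothetical names, and `List.copyOf` needs Java 10 or later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical class; children are keyed by character rather than array index.
class MapTrie {
    private static final class Node {
        final Map<Character, Node> children = new HashMap<>();
        final List<String> translations = new ArrayList<>();
    }

    private final Node root = new Node();

    // Create any missing branches along the word, then record the translation.
    void insert(String english, String french) {
        Node node = root;
        for (char c : english.toCharArray()) {
            node = node.children.computeIfAbsent(c, k -> new Node());
        }
        node.translations.add(french);
    }

    // Follow one branch per letter; a missing branch means the word is absent.
    List<String> lookup(String english) {
        Node node = root;
        for (char c : english.toCharArray()) {
            node = node.children.get(c);
            if (node == null) {
                return List.of(); // "node doesn't exist yet"
            }
        }
        // A prefix of a stored word reaches a node with no translations.
        return List.copyOf(node.translations);
    }
}
```

A lookup takes one step per letter of the requested word, regardless of how many words the dictionary holds.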

I'm starting to confuse myself, so I've probably confused everyone else, but if you need further explanation, post back with what I haven't explained well enough and I'll see if I can do a better job.
Jan 5 '07 #3

10K+
P: 13,264
Disappearing from the Java forum and appearing at will won't help too much either. Nice of you to pop in, though. You should do that more often.

Interesting solution you have given there. I guess it depends on how far the OP is willing to take it or what his/her specs say. For example, a tree-based map is already available in the standard library (java.util.TreeMap). The OP may decide to use it or to create their own tree. Or the OP might already have the words in some file and just want to create an interface program that retrieves the words through some kind of mapping.
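As a rough illustration of the java.util.TreeMap route: the wrapper class `TreeMapDictionary` and the sample words in the usage lines are assumptions for the example, while `TreeMap.subMap` is the standard method and returns the keys in sorted order between two bounds.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical wrapper around the standard java.util.TreeMap.
class TreeMapDictionary {
    // Keys are kept in sorted order, which makes prefix queries possible.
    private final TreeMap<String, String> entries = new TreeMap<>();

    void add(String english, String french) {
        entries.put(english, french);
    }

    // Returns null when the word is not in the dictionary.
    String translate(String english) {
        return entries.get(english);
    }

    // All entries whose English word starts with the given prefix.
    SortedMap<String, String> withPrefix(String prefix) {
        // Range [prefix, prefix + '\uffff') covers ordinary keys starting with prefix.
        return entries.subMap(prefix, prefix + Character.MAX_VALUE);
    }
}
```

For instance, after adding "cat", "car" and "dog", `withPrefix("ca")` contains only the "car" and "cat" entries.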


I trust your holidays were great.
Jan 5 '07 #4
