You also need a way to convert sentences grammatically. i.e. How do you take the grammatical structure of a sentence, keeping its meaning intact and convert it to the grammatical structure in another language keeping that meaning intact? This is obviously a logical task to consider rather than a programmatic one. Once you figure out how to logically translate from one language to another, the code should be relatively easy.
Often words mean different things in different contexts so in one context a word may mean something completely different from another context. Also, you have to account for colloquialisms - for instance English to French, we have a lot of figures of speech in English that just don't work in French and vice versa. If you called your wife "My cabbage", what would she think? She'd likely think you were off your rocker. But in French, "Mon chou" is a perfectly acceptable pet name.
If it's just words, then a DataTable of words should suffice. Have as many columns as you need languages, each of the columns would be indexed to allow rapid searching in any language. When someone enters a word and selects a language, you just do a search to match the words.
You might want to account for the fact that some words have multiple types depending on their context. Think about the word text which for better or worse in today's English language is both a noun and a verb (which I personally think is ridiculous!). It can describe a bunch of words, or it can describe the act of sending a text message via SMS.
You could get more complex than single words, you could have a multi-table structure. The following structure would allow you to add languages without changing your database structure which would allow for future growth...however, it still doesn't account for grammar.
I would probably consider the following structure as a base, which catches all the basic vocabulary before going on to consider something more advanced for grammar translation.
- /* This table contains a list of all your languages */
-
TABLE Languages(
-
Language_Key Int Identity(1, 1) PRIMARY KEY,
-
Language_Name Varchar(24)
-
)
-
-
/* This table contains all your types of words noun, pronoun, adjective, colloq. etc. */
-
TABLE Word_TypeDefs(
-
WTDef_Key Int Identity(1, 1) PRIMARY KEY,
-
WTDef_Name Varchar
-
)
-
-
/* This table contains all your words, in as many languages as you support */
-
TABLE Words(
-
Word_Key Int Identity(1, 1) PRIMARY KEY,
-
Word_Language Int FOREIGN KEY REFERENCES Languages(Language_Key),
-
Word Varchar(64),
-
Word_Recording Varbinary,
-
Word_Phonetic Varchar(64)
-
)
-
-
/* Because a word can be of multiple type definitions, we should split these out to their own table. For instance, odd words that could be considered both noun and verb depending on their context or for words that are nouns and colloquialisms etc. */
-
TABLE Word_Type(
-
WType_Key Int Identity(1, 1) PRIMARY KEY,
-
Word_Key Int FOREIGN KEY REFERENCES Words(Word_Key),
-
Word_TypeDef Int FOREIGN KEY REFERENCES Word_TypeDefs (WTDef_Key)
-
)
-
-
/* Allows you to have multiple language definitions i.e. you could take the word hippopotamus, there's little point in your user looking at the definition of the word in English if they only read Dutch. So you could display the definition in whatever their native language is...or they could choose to read it in the language that the word is from if they choose. */
-
TABLE Word_Definition(
-
WDef_Key Int Identity(1, 1) PRIMARY KEY,
-
Word_Key Int FOREIGN KEY REFERENCES Words(Word_Key),
-
WDef_Language Int FOREIGN KEY REFERENCES Languages(Language_Key),
-
WDef_Text Text
-
)
-
-
/* This table links words together by setting a parent word and a word that parent references in other languages. You probably want some mechanism when adding references to make sure that duplicate references don't get added, slowing your database down. For instance, if word 1 references word 2, you don't want two rows in your table where (Parent=1, Reference=2) and (Parent=2, Reference=1) even though this would satisfy a "unique" clause, it's still clearly a duplicate in this scenario. */
-
TABLE Word_References(
-
WRef_Key Int Identity(1, 1) PRIMARY KEY
-
Parent_Word Int FOREIGN KEY REFERENCES Words(Word_Key)
-
Reference_Word Int FOREIGN KEY REFERENCES Words(Word_Key)
-
)
If I sit and think some more, I could probably add some more complexity for a more flexible model, but I think that is probably advanced enough for most scenarios.