473,569 Members | 2,536 Online
Bytes | Software Development & Data Engineering Community
+ Post

Home Posts Topics Members FAQ

Use arrays to store multilanguage texts

I am wondering, if it's effective to use text arrays to store
multilanguage information.

We used to do it like this:

CREATE TABLE product (
id serial NOT NULL,
price float4,
...
)

CREATE TABLE product_txt (
product_id integer,
language_id integer, -- 1=English, 2=Czech, ...
name text,
description text,
...
PRIMARY KEY (product_id, language_id)
)

Then in queries for English version we used joins:

SELECT product.*, product_txt.*
FROM product JOIN product_txt ON product.id=prod uct_txt.product _id
WHERE product_txt.lan guage_id=1

It works as is supposed to, but of course there is some database
overhead and mainly it is much more complicated to handle data this way.
Since 7.4 introduced arrays, I was thinking about using them for storing
multilanguage strings. Like this:

CREATE TABLE product (
id serial NOT NULL,
price float4,
name text[],
description text[]
)

Then I'd just do:

SELECT id, price, name[1], description[1] FROM product

Much simpler and IMHO faster (but I'm not a pg-hacker). I never had time
to test it much, but now we are going to build a new database with
multilanguage strings and I am seriously considering using arrays.

The only question, which remains unanswered is how is it with indexes. I
mean--most of the queries on such table are ordered by name, for
example. Therefore I'd do this:

SELECT id, price, name[1], description[1] FROM product ORDER BY name[1]

Is it possible to build an index, which will be used in such query? I
had no luck with CREATE INDEX product_name1 ON product (r[1]), but maybe
the syntax is just somehow different.

Are there any more drawbacks or limitations, that maybe I am not aware
of, which would discourage you from doing the multilanguage support this
way? Should we consider this or stick to the old ways?

Thanks for your attention.

--
Michal Taborsky
http://www.taborsky.cz
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.2 (MingW32)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFAuFh5f8h kbVJsvrIRAsw6AJ 48RbAY5Uzsnzavn FBN/PHbtkRhOwCghjTO
dUw7fk8M2ae/H+T5k78947s=
=V6NY
-----END PGP SIGNATURE-----

Nov 23 '05 #1
5 3037

Michal Táborský <mi****@taborsk y.cz> writes:
I am wondering, if it's effective to use text arrays to store
multilanguage information.

We used to do it like this:
I went through almost exactly the same design evolution. My initial design had
dozens of "text_id" fields littered throughout the database and had a
"localized_text " table with the expansion of every text_id for every language.

It was a total headache.
name text[],
description text[]
That's exactly what I have now.
Then I'd just do:

SELECT id, price, name[1], description[1] FROM product
Well I defined functions lang_?? for each language code, so that the index of
each language wasn't hard coded into the application.
The only question, which remains unanswered is how is it with indexes. I
mean--most of the queries on such table are ordered by name, for
example. Therefore I'd do this:
Well in my case I was never using an index on the localized text columns
anyways.
SELECT id, price, name[1], description[1] FROM product ORDER BY name[1]

Is it possible to build an index, which will be used in such query? I
had no luck with CREATE INDEX product_name1 ON product (r[1]), but maybe
the syntax is just somehow different.
It is in 7.4. You might need another set of parentheses to make it parse, like
"ON product ((r[1]))". In 7.3 you would need to make a function for each
language and index on lang_en(r). As I said I think that's not a bad idea
anyways.
Are there any more drawbacks or limitations, that maybe I am not aware
of, which would discourage you from doing the multilanguage support this
way? Should we consider this or stick to the old ways?


My main worry is what happens if I have to add a new language. I'll have to
update every record of every table to add a new element to the arrays. There's
no centralized place that "knows" about all the translated texts, I have to
know all the tables and columns that will have to be fixed.

Similarly it's impossible to query the database for "tell me all the strings
that haven't been translated yet". I would have to do an individual query for
every column of this type.

Note also that in my case I'm entering the data for both languages (we only
have two so far) at the same time in the same form. Most forms in the
application just have two columns and the user has to fill out the same
information in each language.

If you wanted a more traditional l10n interface where the user has to enter
translations for every string in the application then again the inability to
pull out all the strings might be a problem.

But in my model the convenience of being able to update both languages in the
same update and not having to worry about the order in which inserts are done
and keeping track of ids to ensure relationships, are huge. I can't imagine
having continued in the other track.

The benefit of having the translation available immediately without having to
do an extra join depends a lot on the application. In my case I had dozens of
fields to look up and often in the midst of fairly complex queries. I was
worried about performance and also about it raising the likelihood of the
optimizer getting confused. Also it made utter messes of the queries.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to ma*******@postg resql.org

Nov 23 '05 #2
Michal Táborský wrote:
I am wondering, if it's effective to use text arrays to store
multilanguage information.
[...snip...]
SELECT id, price, name[1], description[1] FROM product ORDER BY name[1]

Is it possible to build an index, which will be used in such query? I
had no luck with CREATE INDEX product_name1 ON product (r[1]), but maybe
the syntax is just somehow different.
Maybe something like this:

CREATE TABLE product (
id serial NOT NULL,
price float4,
name text[],
description text[]
);
insert into product (price, name, description)
values (10,
array['apples-english','apple s-spanish','apple s-german'],
array['big bunch of apples-english','...-spanish','...-german']
);
insert into product (price, name, description)
values (42,
array['pears-english'],
array['big bunch of pears-english']
);
create or replace function get_lang(text) returns int as '
select case
when $1 = ''english'' then 1
when $1 = ''spanish'' then 2
when $1 = ''german'' then 3
else 1
end
' language sql strict immutable;
create or replace function get_lang_str(te xt[], int)
returns text as '
select coalesce($1[$2], $1[1])
' language sql strict immutable;
create index product_idx1_en glish on
product(get_lan g_str(name,get_ lang('english') ));
create index product_idx1_sp anish on
product(get_lan g_str(name,get_ lang('spanish') ));
create index product_idx1_ge rman on
product(get_lan g_str(name,get_ lang('german')) );

set enable_seqscan to off;
explain analyze
select
id,
price,
get_lang_str(na me,get_lang('sp anish')) as name,
get_lang_str(de scription,get_l ang('spanish')) as description
from
product
order by
get_lang_str(na me,get_lang('sp anish'));
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Scan using product_idx1_sp anish on product (cost=0.00..6.0 3
rows=2 width=72) (actual time=0.147..0.1 93 rows=2 loops=1)
Total runtime: 0.246 ms
(2 rows)

select
id,
price,
get_lang_str(na me,get_lang('fo o')) as name,
get_lang_str(de scription,get_l ang('foo')) as description
from
product
order by
get_lang_str(na me,get_lang('fo o'));
id | price | name | description
----+-------+----------------+-----------------------------
1 | 10 | apples-english | big bunch of apples-english
2 | 42 | pears-english | big bunch of pears-english
(2 rows)

select
id,
price,
get_lang_str(na me,get_lang('sp anish')) as name,
get_lang_str(de scription,get_l ang('spanish')) as description
from
product
order by
get_lang_str(na me,get_lang('sp anish'));
id | price | name | description
----+-------+----------------+----------------------------
1 | 10 | apples-spanish | ...-spanish
2 | 42 | pears-english | big bunch of pears-english
(2 rows)

Are there any more drawbacks or limitations, that maybe I am not aware
of, which would discourage you from doing the multilanguage support this
way? Should we consider this or stick to the old ways?


Notice I was trying to be careful about cases where a bad language
string is used, or the array is missing languages other than english.
I.e. I picked english as the default language and would need to ensure
every array had at least that string.

I'm not sure how maintainable it will be. You'd have to play with it for
a while and decide for yourself.

It would be interesting to see a speed comparison in a real-life
application -- so if you give it a try, please let us know how it turns out.

HTH,

Joe
---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

Nov 23 '05 #3
> > I am wondering, if it's effective to use text arrays to store
multilanguage information.

We solved it like this:

http://savannah.gnu.org/cgi-bin/view...N.sql?rev=1.15

- i18n_curr_lang holds the (ISO) language string per user
- i18n_keys holds the strings to be translated
- i18n_translatio ns holds the translations
- function i18n() inserts a text value into i18n_keys thus
marking it for translation, this we use during insertion of
data in the "primary language" (think gettext, think English)
- function _() returns a translated text value (or it's
primary language equivalent), we use that in queries and view
definitions
- v_missing_trans lations is a view of, well, ...

Then here:

http://savannah.gnu.org/cgi-bin/view...ons.py?rev=1.3

we've got a Python tool that will connect to your database and
create schema files ready for re-insertion via psql after
adding the missing translations. Kind of like a gettext po
file.

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 9: the planner will ignore your desire to choose an index scan if your
joining column's datatypes do not match

Nov 23 '05 #4

Karsten Hilbert <Ka************ *@gmx.net> writes:
I am wondering, if it's effective to use text arrays to store
multilanguage information.
We solved it like this:

http://savannah.gnu.org/cgi-bin/view...N.sql?rev=1.15


This doesn't work for me I get

/cgi-bin/viewcvs/gnumed/gnumed/gnumed/serve: unknown location
- i18n_curr_lang holds the (ISO) language string per user
I'm interested in this part. Is there a facility for having variables on the
postgres side that you can set once at the beginning of a session and then
refer to in functions?

I instead made my application insert lang_en,lang_fr etc into the queries as
appropriate. This made the application code a little uglier than necessary,
but it's not too bad, I put the function name in a variable and call
$xlate(column) in a double-quoted string.

I think having a global postgres variable would be a nice shortcut, though I
fear it would make it harder to do things like index lang_en() and lang_fr().

Do you cheat and make your _() IMMUTABLE? It's only immutable if you guarantee
that i18n_currr_lang can never change in the middle of a query. But even then
you're lying to the server and indexing _() would do strange things, for
example. In fact I think it would produce corrupted indexes if it worked at
all.
- v_missing_trans lations is a view of, well, ...


This is the one thing that the array solution can't do. But in my case it's
dynamic text and both languages are being provided at the same time by the
same people. I'm not going to have a translator going through batches of them
like gettext.

Actually I also have a separate solution for the static application text that
is going to be translated in the gettext style. In that case there is a
translations table much like gettext. However all the logic for lookups is
handled on the application side in a Smarty prefilter.

--
greg
---------------------------(end of broadcast)---------------------------
TIP 2: you can get off all lists at once with the unregister command
(send "unregister YourEmailAddres sHere" to ma*******@postg resql.org)

Nov 23 '05 #5
http://savannah.gnu.org/cgi-bin/view...N.sql?rev=1.15

Works for me. You'll have to make sure you get the entire
string for the URL. It may have been wrapped by your mail
handling agent.
- i18n_curr_lang holds the (ISO) language string per user
I'm interested in this part. Is there a facility for having variables on the
postgres side that you can set once at the beginning of a session and then
refer to in functions?

We set them the first time a user connects to a particular
database. Unless they are changed later (possible anytime with
immediate transparent switching of languages in queries and
views) the language will stay the same across sessions/
clients/ connections.
I instead made my application insert lang_en,lang_fr etc into the queries as
appropriate. This made the application code a little uglier than necessary,
but it's not too bad, I put the function name in a variable and call
$xlate(column) in a double-quoted string.
We then use select _(fruits.name) style queries. The _()
function uses the language in i18n_curr_lang (for
CURRENT_USER).
I think having a global postgres variable would be a nice shortcut, though I
fear it would make it harder to do things like index lang_en() and lang_fr(). I'd think on could index i18n_translatio ns just fine:

create unique index on i18n_translatio ns(trans) where lang='xx';
Do you cheat and make your _() IMMUTABLE? It's only immutable if you guarantee
that i18n_currr_lang can never change in the middle of a query. Given that it'd take a writing query that should run in a
transaction it shouldn't change within a read query. However,
I'm not too sure about views, like we use translated columns
in views like this:

create view a_view as
select
bla as bla,
_(bla) as l10n_bla
....
But even then
you're lying to the server and indexing _() would do strange things, for
example. In fact I think it would produce corrupted indexes if it worked at
all.

I haven't experienced any of that but it may well happen -
dunno.
- v_missing_trans lations is a view of, well, ...


This is the one thing that the array solution can't do. But in my case it's
dynamic text and both languages are being provided at the same time by the
same people. I'm not going to have a translator going through batches of them
like gettext.

Well, you wouldn't *have* to use that view, it's just for
convenience. You could just as well insert the dynamic text
into i18n_translatio ns (lang, orig, trans) whenever dynamic
text is inserted. We mainly use that (+ i18n()) for gettext
style batch translation of static application text. Except
that all the lookup is handled by _() on the server right
inside the query :-)

Karsten
--
GPG key ID E4071346 @ wwwkeys.pgp.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postg resql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #6

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

0
1771
by: Giovane Calabrese | last post by:
ok. im try to make a multilanguage system based on that tutorial : http://www.123aspx.com/redir.aspx?res=29112 in my aspx pages all text are in labels , and i want to take the name labels ( in one loop ) and for each label take on the XML document the right(in the right language) Value (text).
1
1544
by: Frank Landen | last post by:
Hi NG, I´ve wrote a really big application in vb6 and now, i´ve to implement a multilanguage function in it. I´ve searched with google for a tool, which i could use to do the localization without having so much work. I now the resource-file.... ...but isnt there nothing else ? It would be nice, when I could change the language at runtime,...
1
1342
by: R.A.M. | last post by:
Hi, (Sorry for my English...) I am learning .NET 2.0 and I have a question: how to, generally, implement applications with many country languages? For instance, I want to write ASP.NET application with language selection on logon page and then have texts depening on language selected. Thank you very much for help. /RAM/
41
4901
by: Rene Nyffenegger | last post by:
Hello everyone. I am not fluent in JavaScript, so I might overlook the obvious. But in all other programming languages that I know and that have associative arrays, or hashes, the elements in the hash are alphabetically sorted if the key happens to be alpha numeric. Which I believe makes sense because it allows for fast lookup of a key.
15
2499
by: AG | last post by:
I have seen the samples for multilanguage support regarding what is normally static content. Can anyone point me to a good sample for multilanguage support for dynamic content. Like a gridview that is populated from a database? TIA -- AG Email: discuss at adhdata dot com
1
2728
by: paolob | last post by:
I need to provide multilanguage support to my application. I alredy writed a multilanguage test application, using VS6 and resource only dll. Now i'm using Visual Studio 2005, I find out that it supports multilanguage through .resx file, but I can't find a good example that explains me how to do this. I appreciate very much if you can point...
1
1252
by: nickiz | last post by:
Hello, i'm trying to create a simple converter application using php. The main function i use is strtr() that can translate from $string via arrays. Because the arrays need to add in the file php, so it is difficult to user add their own word in this application. I was create a database that contain 2 fields. The question is how the script to...
0
1166
by: Sourcerer | last post by:
How to implement multilanguage support for a program in VC++ in MSVS .NET? What I mean by this is how to make texts, such as texts on command buttons, labels, etc. show in different languages, depending on the language choice of the user (of course, not that I would be translating these personally, but it would be nice if only a support for...
7
1301
by: Giovanni | last post by:
Hi! I've an array of string like this: $arraystring="John"; $arraystring="50"; $arraystring="teacher"; $arraystring is global variable. I'm trying to code a function that get a string as parameter and print the relative arraystring's value.
0
7614
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can effortlessly switch the default language on Windows 10 without reinstalling. I'll walk you through it. First, let's disable language...
1
7676
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows Update option using the Control Panel or Settings app; it automatically checks for updates and installs any it finds, whether you like it or not. For...
0
6284
agi2029
by: agi2029 | last post by:
Let's talk about the concept of autonomous AI software engineers and no-code agents. These AIs are designed to manage the entire lifecycle of a software development project—planning, coding, testing, and deployment—without human intervention. Imagine an AI that can take a project description, break it down, write the code, debug it, and then...
0
5219
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and then checking html paragraph one by one. At the time of converting from word file to html my equations which are in the word document file was convert...
0
3653
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The last exercise I practiced was to create a LAN-to-LAN VPN between two Pfsense firewalls, by using IPSEC protocols. I succeeded, with both firewalls in...
0
3642
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
2114
by: 6302768590 | last post by:
Hai team i want code for transfer the data from one system to another through IP address by using C# our system has to for every 5mins then we have to update the data what the data is updated we have to send another system
1
1221
muto222
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.
0
938
bsmnconsultancy
by: bsmnconsultancy | last post by:
In today's digital era, a well-designed website is crucial for businesses looking to succeed. Whether you're a small business owner or a large corporation in Toronto, having a strong online presence can significantly impact your brand's success. BSMN Consultancy, a leader in Website Development in Toronto offers valuable insights into creating...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.