Bytes IT Community

Writing multilingual site - how to effectively output strings depending on language

I plan to write a site in a few languages. For this reason I'm trying to
find the most convenient way to output strings based on their unique
IDs. Since I expect to have quite a lot of strings translated, I'm
concerned about the performance impact any of the approaches may have
on the server. Perhaps someone has already solved this kind of problem
and can share his/her experience in this matter.

Jul 10 '06 #1
12 Replies


ro********@gmail.com wrote:
> I plan to write site on a few language. [...]
I've used two approaches:
1. http://ca.php.net/manual/en/ref.gettext.php
2. my own library backed by a translation table in the database
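For the second approach, a minimal sketch might look like the following. (The `translations` table, its columns, and the function name are all assumptions for illustration, not David's actual library.) The idea is to load every string for one language in a single query and keep the result in memory:

```php
<?php
// Hypothetical sketch of the translation-table approach.
// Assumes a table: translations(msg_id TEXT, lang TEXT, text TEXT).
function load_translations(PDO $db, string $lang): array
{
    // One query per request; the result is kept in memory as msg_id => text.
    $stmt = $db->prepare('SELECT msg_id, text FROM translations WHERE lang = ?');
    $stmt->execute([$lang]);
    return $stmt->fetchAll(PDO::FETCH_KEY_PAIR);
}
```

Loading the whole table for one language up front keeps it to one round trip per request instead of one query per string.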

-david-

Jul 10 '06 #2

I personally put the text for each language into a separate file, then load
that file into memory at the start of each script. Each individual string is
therefore taken from memory. Because there is only one disk I/O for the
whole file, not each piece of text, it is very fast.

This is documented at
http://www.tonymarston.net/php-mysql...alisation.html
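A minimal sketch of the approach Tony describes (the file layout, fallback behaviour, and function name are assumptions for illustration, not his actual library):

```php
<?php
// Hypothetical sketch of the one-file-per-language approach.
// Each language file, e.g. lang/en.php, is a PHP script that returns
// an id => string array, e.g.:  return ['greeting' => 'Hello'];
function load_language(string $dir, string $lang): array
{
    $file = "$dir/$lang.php";
    if (!is_file($file)) {
        $file = "$dir/en.php";   // fall back to a default language
    }
    return include $file;        // one file read per request; strings stay in memory
}

// Usage (assuming the files exist):
//     $messages = load_language(__DIR__ . '/lang', 'fr');
//     echo $messages['greeting'];
```

Every subsequent lookup is then an in-memory array access, which is where the "only one disk I/O for the whole file" claim comes from.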

HTH

--
Tony Marston
http://www.tonymarston.net
http://www.radicore.org

ro********@gmail.com wrote:
> I plan to write site on a few language. [...]

Jul 11 '06 #3

ro********@gmail.com wrote:
> I plan to write site on a few language. [...]
Gettext is the standard method; there is a PHP extension. You write
strings such as

_('Hello bob')

which are then extracted by the gettext command-line tools. This
produces a .po file which can be distributed to translators. When
the translation is done you run another command-line tool which creates
a .mo file, which PHP then uses to do the translation.

When you change the text inside a _('') or add a new one, you can
create a new .po file and merge it with the old .po file, producing a
third .po file which tells the translator what is new, what has changed
only a little (fuzzy matching), and what is the same as last time. This
reduces their workload.

Because you are not using a message ID in the code, you can write in
your own language and know straight away that the message is the one
you want; you don't have to cross-reference anything to know what the
program is saying.
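At runtime the extension only needs a locale and a text domain bound to the directory holding the compiled .mo files. A minimal sketch (the locale, domain name, and directory layout here are assumptions, not a prescribed setup):

```php
<?php
// Hypothetical gettext setup; requires the php gettext extension.
// Workflow outside PHP (as described above): xgettext extracts _() strings
// into a .po file; msgmerge merges it with the previous .po; msgfmt
// compiles the result into the .mo file loaded below.
if (function_exists('gettext')) {
    $locale = 'de_DE.UTF-8';
    putenv("LC_ALL=$locale");
    setlocale(LC_ALL, $locale);
    // Expects ./locale/de_DE/LC_MESSAGES/messages.mo to exist.
    bindtextdomain('messages', __DIR__ . '/locale');
    textdomain('messages');
}

// With no matching .mo file, _() simply returns its argument unchanged.
echo function_exists('_') ? _('Hello bob') : 'Hello bob';
```

Note the graceful degradation: if no translation (or no extension) is available, the original in-code string is shown, which is exactly the cross-referencing advantage Fletch describes.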

regards
Fletch

Jul 11 '06 #4

Tony Marston wrote:
> I personally put the text for each language into a separate file, then load
> that file into memory at the start of each script. [...]
Wouldn't the number of disk I/Os depend on the size of the file and the
size of the read buffer in the device driver?

-david-

Jul 11 '06 #5

David Haynes wrote:
> Wouldn't the number of disk I/Os depend on the size of the file and the
> size of the read buffer in the device driver?
Stop being pedantic. I'm talking about reading in the whole file into memory
in one operation instead of doing a separate read for each piece of text. So
in my code I perform a single read operation - how the operating system or
file system handles this, such as splitting the operation into several disk
accesses, or even retrieving from memory without accessing the disk at all,
is largely irrelevant.

--
Tony Marston

http://www.tonymarston.net
http://www.radicore.org

Jul 11 '06 #6

On 2006-07-11, Tony Marston wrote:
> I'm talking about reading in the whole file into memory in one operation
> instead of doing a separate read for each piece of text. [...] How the
> operating system or file system handles this [...] is largely irrelevant.
That's not pedantic; reading the whole file takes at least as much memory
as the file is big. If you have a 4 KB disk cache, a 16 KB file will require
4 reads spread throughout the request instead of 4 up front. If your page is
in high demand, saving 75% of that memory isn't a minor accomplishment.

--
Andrew Poelstra <http://www.wpsoftware.net/projects/>
To email me, use "apoelstra" at the above domain.
"You people hate mathematics." -- James Harris
Jul 11 '06 #7

Andrew Poelstra wrote:
> That's not pedantic; reading the whole file takes at least as much memory
> as the file is big. [...] If your page is in high demand, saving 75% of
> memory isn't a minor accomplishment.
But which of these is more efficient?
(a) Reading a file of 1000 entries into memory so that each entry can be
read from memory.
(b) Reading each of the 1000 entries from disk individually.

It's a toss-up between memory usage and speed, and most people go for speed
and let the memory sort itself out.

--
Tony Marston
http://www.tonymarston.net
http://www.radicore.org
Jul 12 '06 #8

Tony Marston wrote:
But which of these is more efficient?:
(a) Reading a file of 1000 entries into memory so that each entry can be
read from memory.
(b) Reading each of the 1000 entries from disk individually.
Seeing as the file will be read into the filesystem cache, it's
unlikely that you will be doing 1000 disk accesses to get the data.

/marcin
Jul 12 '06 #9

Marcin Dobrucki wrote:
> Seeing as the file will be read into the filesystem cache, it's unlikely
> that you will be doing 1000 disk accesses to get the data.
Since when does reading from memory cause a read from disk?

--
Tony Marston

http://www.tonymarston.net
http://www.radicore.org

Jul 12 '06 #10

Tony Marston wrote:
Since when does reading from memory cause a read from disk?
Well, with virtual memory, that is quite possible. However, seeing as
the file will probably be in the file cache, and will probably be
accessed frequently during the page generation, disk access should be
very limited.

/Marcin
Jul 12 '06 #11

Marcin Dobrucki wrote:
> Well, with virtual memory, that is quite possible. However, seeing as the
> file will probably be in the file cache [...] disk access should be very
> limited.
You are including a lot of "ifs, ands and buts" which do nothing but
introduce unnecessary complications. The point at issue is about disk
accesses that are written into the program, not executed by the file system.
So as far as I am concerned reading in 1000 entries in one go is more
efficient than reading the same 1000 entries one at a time. 1 read or 1000
reads - which is more efficient?
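One way to settle the "1 read or 1000 reads" question empirically rather than by argument is a micro-benchmark. This is a rough sketch only: absolute numbers depend entirely on the machine, the filesystem, and what is already cached, and the file names here are made up for the test.

```php
<?php
// Rough micro-benchmark: one read of a combined file vs. 1000 small reads.
$dir = sys_get_temp_dir() . '/bench_' . getmypid();
mkdir($dir);
$entries = [];
for ($i = 0; $i < 1000; $i++) {
    $entries[] = "string number $i";
    file_put_contents("$dir/$i.txt", "string number $i");
}
file_put_contents("$dir/all.txt", implode("\n", $entries));

$t0 = microtime(true);
$whole = file_get_contents("$dir/all.txt");   // 1 open + read
$t1 = microtime(true);
for ($i = 0; $i < 1000; $i++) {
    file_get_contents("$dir/$i.txt");         // 1000 opens + reads
}
$t2 = microtime(true);

printf("one file: %.4fs, 1000 files: %.4fs\n", $t1 - $t0, $t2 - $t1);

// Clean up the temporary files.
for ($i = 0; $i < 1000; $i++) {
    unlink("$dir/$i.txt");
}
unlink("$dir/all.txt");
rmdir($dir);
```

On most systems the per-file open/seek overhead dominates the 1000-read case, which is the point Tony and the later reply about file-open overhead are making.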

--
Tony Marston

http://www.tonymarston.net
http://www.radicore.org

Jul 12 '06 #12

Marcin Dobrucki wrote:
> Well, with virtual memory, that is quite possible. However, seeing as
> the file will probably be in the file cache [...] disk access should be
> very limited.
Reading a single file is going to be faster and more efficient, by and
large. There is a substantial amount of overhead in finding and opening a
file compared to doing a simple read on it. Consider the worst case where
the 1000 files are spread out across the entire drive, causing a longer
seek time for every file; the OS will try to reorder the queue, but there
is only so much it can do. The file is going to be cached in an OS buffer
once you start reading data in. Unless the single file is incredibly big,
split up on the physical drive (causing more seeks to load the data in),
or swapped into virtual memory (which is unlikely; strings aren't that
big), it will be faster. You also have the advantage that it may not have
to be opened again, since all the strings are in memory already, compared
to some sort of dynamic loading scheme. Not to mention you won't have to
keep track of 1000 open file handles and their related information.

It's similar to executing a bunch of single SELECT statements versus one
big SELECT statement: each statement costs a parse and an execute before
any data comes back, just as each file costs a seek and an open before
any data is read.

Jul 12 '06 #13

This discussion thread is closed; replies have been disabled.