I18n issue with optik

Hi,

I've written a script which uses Optik/Optparse to display the
options (which works fine). The text for the help message is localised
(with German umlauts), and when I execute the script with the locale
environment variable set, I get this traceback[1]. The interesting
thing is that the localised optparse messages themselves display fine -
it's only my localisation that errors.

From my understanding, my script doesn't output anything itself; it's
optik/optparse that does. My po file is copied directly from the
optik po file (which displays fine) and then modified, so the po file
should be fine, too.

What can I do to troubleshoot whether the culprit is my script, optik
or gettext?

Would it make sense to post the script and the mo or po files?
Thorsten

[1]
Traceback (most recent call last):
  File "script.py", line 37, in <module>
    options, args = cmdlineparser.parse_args()
  File "/usr/lib/python2.5/optparse.py", line 1378, in parse_args
    stop = self._process_args(largs, rargs, values)
  File "/usr/lib/python2.5/optparse.py", line 1418, in _process_args
    self._process_long_opt(rargs, values)
  File "/usr/lib/python2.5/optparse.py", line 1493, in _process_long_opt
    option.process(opt, value, values, self)
  File "/usr/lib/python2.5/optparse.py", line 782, in process
    self.action, self.dest, opt, value, values, parser)
  File "/usr/lib/python2.5/optparse.py", line 804, in take_action
    parser.print_help()
  File "/usr/lib/python2.5/optparse.py", line 1648, in print_help
    file.write(self.format_help().encode(encoding, "replace"))
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc4 in position 264: ordinal not in range(128)
Mar 31 '07
* Thorsten Kampe (Sun, 1 Apr 2007 20:08:39 +0100)
* Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
Yes, I could do that but I'd rather know first if my code is wrong or
the optparse code.

It might be the bug mentioned in
http://mail.python.org/pipermail/pyt...ay/065458.html

The patch doesn't work, though. From my unicode-charset-codepage-
codeset-challenged point of view, the encoding of sys.stdout doesn't
matter. The charset is defined in the .po/.mo file (but of course
optparse can't know whether the message has been translated by gettext
("_")).
If I "patch" line 1648 (the one mentioned in the traceback) of
optparse.py from

file.write(self.format_help().encode(encoding, "replace"))
to
file.write(self.format_help())

...then everything works and is displayed fine (even without the
"unicode = True" parameter to gettext.install).

But the "patch" might make other things fail, of course...
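
One possible reading of why the patched line appears to work: without unicode=True, format_help() presumably returns a byte string in the catalogue's charset, and writing it unencoded hands those bytes to the terminal as they are. The sketch below is written for Python 3 rather than the Python 2.5 used in this thread, and the sample byte is the one named in the traceback. It only shows that the same byte reads differently depending on the charset the receiving side assumes.

```python
# Byte 0xc4 is the one named in the traceback; in ISO-8859-15 it is "Ä".
raw = b"\xc4"

# A terminal that assumes ISO-8859-15 shows the intended umlaut.
as_latin9 = raw.decode("iso-8859-15")

# A terminal that assumes CP852 (the charset reported for the Windows
# console later in this thread) maps the same byte to another character.
as_cp852 = raw.decode("cp852")

print(repr(as_latin9), repr(as_cp852))
```

So dropping the .encode() call does not fix the charset question; it just moves the decision to whatever terminal happens to receive the bytes.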

Thorsten
Apr 1 '07 #11
* Thorsten Kampe (Sun, 1 Apr 2007 20:22:51 +0100)
* Thorsten Kampe (Sun, 1 Apr 2007 20:08:39 +0100)
* Thorsten Kampe (Sun, 1 Apr 2007 19:45:59 +0100)
Yes, I could do that but I'd rather know first if my code is wrong or
the optparse code.
It might be the bug mentioned in
http://mail.python.org/pipermail/pyt...ay/065458.html

The patch doesn't work, though. From my unicode-charset-codepage-
codeset-challenged point of view, the encoding of sys.stdout doesn't
matter. The charset is defined in the .po/.mo file (but of course
optparse can't know whether the message has been translated by gettext
("_")).

If I "patch" line 1648 (the one mentioned in the traceback) of
optparse.py from

file.write(self.format_help().encode(encoding, "replace"))
to
file.write(self.format_help())

...then everything works and is displayed fine [...]
...but only in Cygwin rxvt; the standard Windows console doesn't show
the right colors.

I give up and revert back to ASCII. This whole charset mess is not
meant to be solved by mere mortals.

Thorsten

Apr 1 '07 #12
Thorsten Kampe wrote:
>>Under Windows I get " File "G:\program files\python\lib\encodings\cp1252.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)"
I'm not very experienced with internationalization, but if you change::

    gettext.install('test')

to::

    gettext.install('test', unicode=True)

what happens?

No traceback anymore from optparse but the non-ascii umlauts are
displayed as question marks ("?").
And this is the expected behaviour of encode() with errors set to 'replace'.
I think this is the solution to your problem. I was a bit surprised I
never saw this error, but I always use the unicode=True setting to
gettext.install()...

--
Jarek Zgoda
http://jpa.berlios.de/
Apr 1 '07 #13
On Apr 1, 8:47 am, Thorsten Kampe <thors...@thorstenkampe.de> wrote:
I guess the culprit is this snippet from optparse.py:

# used by test suite
def _get_encoding(self, file):
    encoding = getattr(file, "encoding", None)
    if not encoding:
        encoding = sys.getdefaultencoding()
    return encoding

def print_help(self, file=None):
    """print_help(file : file = stdout)

    Print an extended help message, listing all options and any
    help text provided with them, to 'file' (default stdout).
    """
    if file is None:
        file = sys.stdout
    encoding = self._get_encoding(file)
    file.write(self.format_help().encode(encoding, "replace"))

So this means: when the encoding of sys.stdout is US-ASCII, Optparse
sets the encoding of the help text to ASCII, too.
The .encode() method doesn't set an encoding. It encodes unicode text into
bytes according to the specified encoding. That means optparse needs ASCII
or unicode (at least) for the help text. In other words, you'd better use
unicode throughout your program.
But that's
nonsense because the encoding is declared in the po (localisation)
file.
For backward compatibility, gettext works with bytes by default,
so the PO file encoding is not even involved. You need to use unicode
gettext.
How can I set the encoding of sys.stdout to another encoding?
What are you going to set it to? As I understand you're going to
distribute your program to some users. How are you going to find out
the encoding of the terminal of your users?
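
A minimal sketch of the point about .encode(), written for Python 3 (the thread uses Python 2.5, where byte strings and unicode objects mix implicitly). In Python 2, calling .encode() on a byte string first decodes it with the default ASCII codec, and that implicit step is what raises the UnicodeDecodeError in the traceback. The two steps are spelled out explicitly here; the sample string and the function name encode_like_python2 are made up for illustration, and ISO-8859-15 is assumed as the catalogue charset.

```python
def encode_like_python2(help_bytes, encoding):
    """Mimic Python 2's bytes.encode(): implicit ASCII decode, then encode."""
    text = help_bytes.decode("ascii")  # the implicit step that fails
    return text.encode(encoding, "replace")


# Byte-string help text, as a non-unicode gettext would hand it over
# (ISO-8859-15 is an assumption; "Ä" becomes byte 0xc4 in that charset).
help_bytes = "Änderungen anzeigen".encode("iso-8859-15")

try:
    encode_like_python2(help_bytes, "ascii")
    error = None
except UnicodeDecodeError as exc:
    error = exc

print(error)

# Starting from unicode text instead, encoding cannot hit a decode error.
safe = "Änderungen anzeigen".encode("ascii", "replace")
print(safe)
```

The "replace" handler only applies to the encode step, which is why it cannot rescue a byte string that fails during the implicit decode.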

-- Leo

Apr 1 '07 #14
* Leo Kislov (1 Apr 2007 14:24:17 -0700)
On Apr 1, 8:47 am, Thorsten Kampe <thors...@thorstenkampe.de> wrote:
I guess the culprit is this snippet from optparse.py:

# used by test suite
def _get_encoding(self, file):
    encoding = getattr(file, "encoding", None)
    if not encoding:
        encoding = sys.getdefaultencoding()
    return encoding

def print_help(self, file=None):
    """print_help(file : file = stdout)

    Print an extended help message, listing all options and any
    help text provided with them, to 'file' (default stdout).
    """
    if file is None:
        file = sys.stdout
    encoding = self._get_encoding(file)
    file.write(self.format_help().encode(encoding, "replace"))

So this means: when the encoding of sys.stdout is US-ASCII, Optparse
sets the encoding of the help text to ASCII, too.

The .encode() method doesn't set an encoding. It encodes unicode text into
bytes according to the specified encoding. That means optparse needs ASCII
or unicode (at least) for the help text. In other words, you'd better use
unicode throughout your program.
But that's
nonsense because the encoding is declared in the po (localisation)
file.

For backward compatibility, gettext works with bytes by default,
so the PO file encoding is not even involved. You need to use unicode
gettext.
You mean

gettext.install('test', unicode = True)
and
description = _(u'THIS SOFTWARE COMES WITHOUT WARRANTY, LIABILITY OR
SUPPORT!') ?

If I modify my code like this, I don't get any traceback anymore, but
the non-ascii umlauts are still displayed as question marks.
Thorsten
Apr 2 '07 #15
* Jarek Zgoda (Sun, 01 Apr 2007 22:02:15 +0200)
Thorsten Kampe wrote:
>Under Windows I get " File "G:\program files\python\lib\encodings\cp1252.py", line 12, in encode
return codecs.charmap_encode(input,errors,encoding_table)"
I'm not very experienced with internationalization, but if you change::

    gettext.install('test')

to::

    gettext.install('test', unicode=True)

what happens?
No traceback anymore from optparse but the non-ascii umlauts are
displayed as question marks ("?").

And this is the expected behaviour of encode() with errors set to 'replace'.
I think this is the solution to your problem. I was a bit surprised I
never saw this error, but I always use the unicode=True setting to
gettext.install()...

I can't see the "solution" here. Is the optparse "print_help" function
wrong? Why should there even be errors if I use "unicode = True" with
gettext.install?

I have ISO-8859-15 gettext translations and I want optparse to display
them correctly. What do I have to do?

Thorsten
Apr 2 '07 #16
Thorsten Kampe wrote:
I can't see the "solution" here. Is the optparse "print_help" function
wrong? Why should there even be errors if I use "unicode = True" with
gettext.install?

I have ISO-8859-15 gettext translations and I want optparse to display
them correctly. What do I have to do?
Please, see gettext module documentation on this topic.

The solution is: always install your translation with the unicode=True
setting. This ensures that ugettext() is used instead of gettext(), and
it works properly with character sets other than ASCII. Your messages are
internally decoded to unicode objects and passed to output. The
displayed output is then limited only by the encoding of your terminal,
but an inconsistency will not crash your program; you will just see
question marks.
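
A short sketch of the behaviour described above, in Python 3 rather than the thread's Python 2.5: once the message is unicode text, encoding it for a terminal with the "replace" error handler never raises, and characters the target charset lacks come out as question marks. The sample word and the list of encodings are only illustrative.

```python
message = "Übersicht"  # stands in for a translated help string

for encoding in ("ascii", "iso-8859-15", "cp852", "utf-8"):
    # "replace" substitutes "?" for anything the charset cannot represent.
    encoded = message.encode(encoding, "replace")
    print(encoding, encoded)

ascii_result = message.encode("ascii", "replace")
latin9_result = message.encode("iso-8859-15", "replace")
```

With an ASCII terminal the umlaut degrades to "?", while a charset that contains it keeps it intact.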

--
Jarek Zgoda

"We read Knuth so you don't have to."
Apr 2 '07 #17
* Steven Bethard (Sun, 01 Apr 2007 10:21:40 -0600)
Thorsten Kampe wrote:
I'm not very experienced with internationalization, but if you change::

    gettext.install('test')

to::

    gettext.install('test', unicode=True)

what happens?
Actually, this is the solution.

But there's one more problem: the solution only works when the
terminal encoding is not US-ASCII. Unfortunately, (almost) all
terminals I tried are set to US-ASCII (rxvt under Cygwin, Console[1]
running bash, Poderosa[2] running bash). Only the Windows Console is
CP852, and this works.

I got the tip to set a different encoding with

sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')

but unfortunately this does not change the encoding of any terminal.
So my question is: how can I set a different encoding for sys.stdout
(or why can I set it without any error while nothing changes)?
Thorsten

[1] http://sourceforge.net/project/scree...group_id=43764
[2] http://en.poderosa.org/present/about_poderosa.html
Apr 2 '07 #18
* Jarek Zgoda (Mon, 02 Apr 2007 17:52:34 +0200)
Thorsten Kampe wrote:
I can't see the "solution" here. Is the optparse "print_help" function
wrong? Why should there even be errors if I use "unicode = True" with
gettext.install?

I have ISO-8859-15 gettext translations and I want optparse to display
them correctly. What do I have to do?

Please, see gettext module documentation on this topic.

The solution is: always install your translation with unicode=True
setting. This assures usage of ugettext() instead of gettext() and works
properly with character sets other than ASCII. Your messages are
internally decoded to unicode objects and passed to output. Then the
displayed output will be limited only by the encoding of your terminal,

You are right. My problem is that all the terminals I use are set to
US-ASCII (rxvt under Cygwin, Console[1] running bash, Poderosa[2]
running bash), even those which actually support non-ASCII characters.

I got the tip to set a different encoding with

sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')

but unfortunately this does not change the encoding.

So my question is: how can I set a different encoding for sys.stdout
(or why can I set it without any error while nothing changes)?
Thorsten

[1] http://sourceforge.net/project/scree...group_id=43764
[2] http://en.poderosa.org/present/about_poderosa.html
Apr 2 '07 #19
Thorsten Kampe wrote:
[snip]
I got the tip to set a different encoding by
sys.stdout = codecs.EncodedFile(sys.stdout, 'utf-8')

but unfortunately this does not change the encoding of any Terminal.
So my question is: how can I set a different encoding to sys.stdout
(or why can I set it without any error but nothing changes?)
AFAIK you can't. If the terminal is limited to ASCII, it won't be able to
display anything else; it might not even have the right font, so how are
you supposed to fix that? The .encode(encoding, "replace") call ensures a
safe downgrade, though.
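
To illustrate why wrapping sys.stdout does not help: a codec wrapper changes which bytes Python emits, not how the terminal interprets them. The sketch below uses Python 3, with an in-memory io.BytesIO standing in for the terminal's byte stream; codecs.getwriter is used here in place of the codecs.EncodedFile call quoted above, so treat it as an analogy rather than the exact same call.

```python
import codecs
import io

# Stand-in for the byte stream behind a terminal.
fake_terminal = io.BytesIO()

# Wrap it so that text written to it is encoded as UTF-8.
writer = codecs.getwriter("utf-8")(fake_terminal)
writer.write("Ä")

sent = fake_terminal.getvalue()
print(sent)

# A terminal that only understands ASCII cannot make sense of these bytes;
# the "replace" handler used by optparse yields a readable fallback instead.
fallback = "Ä".encode("ascii", "replace")
print(fallback)
```

The wrapper succeeds without any error because encoding to UTF-8 always works; whether the result is readable still depends entirely on the terminal.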

cheers
Paul

Apr 2 '07 #20
