473,473 Members | 1,955 Online
Bytes | Software Development & Data Engineering Community
Create Post

Home Posts Topics Members FAQ

Locale confusion

[Long posting due to the examples, but pretty simple question.]

I'm sitting here with a Debian Linux 'Woody' system with the default Python
2.2 installation, and I want the re module to understand that
re.compile(r'\W+'. re.LOCALE) doesn't match my national, accented
characters.

I don't quite understand how the locale module reasons about these things,
and Python doesn't seem to act as other programs on my system. Bug or my
mistake? Here's my environment:

frailea> env |grep -e LC -e LANG
LC_MESSAGES=C
LC_TIME=C
LANG=sv_SE
LC_NUMERIC=C
LC_MONETARY=C
frailea> locale
LANG=sv_SE
LC_CTYPE="sv_SE"
LC_NUMERIC=C
LC_TIME=C
LC_COLLATE="sv_SE"
LC_MONETARY=C
LC_MESSAGES=C
LC_PAPER="sv_SE"
LC_NAME="sv_SE"
LC_ADDRESS="sv_SE"
LC_TELEPHONE="sv_SE"
LC_MEASUREMENT="sv_SE"
LC_IDENTIFICATION="sv_SE"
LC_ALL=

This seems to indicate that $LANG acts as a fallback when other things (e.g.
LC_CTYPE isn't defined) and that's also what the glibc setlocale(3) man page
says. Works well for me in general, too. However, consider this tiny Python
program:

frailea> cat foo
import locale
print locale.getlocale()
locale.setlocale(locale.LC_CTYPE)
print locale.getlocale()

When I paste it into an interactive Python session, the locale is already
set up correctly (which is what I suppose interactive mode /should/ do):
import locale
print locale.getlocale() ['sv_SE', 'ISO8859-1'] locale.setlocale(locale.LC_CTYPE) 'sv_SE' print locale.getlocale() ['sv_SE', 'ISO8859-1']


When I run it as a script it isn't though, and the setlocale() call does not
appear to fall back to looking at $LANG as it's supposed to(?), so my
LC_CTYPE remains in the POSIX locale:

frailea> python foo
(None, None)
(None, None)

The corresponding program written in C works as expected:

frailea> cat foot.c
#include <stdio.h>
#include <locale.h>
int main(void) {
printf("%s\n", setlocale(LC_CTYPE, 0));
printf("%s\n", setlocale(LC_CTYPE, ""));
printf("%s\n", setlocale(LC_CTYPE, 0));
return 0;
}
frailea> ./foot
C
sv_SE
sv_SE

So, is this my fault or Python's? I realize I could just adapt and set
$LC_CTYPE explicitly in my environment, but I don't want to capitulate for a
Python bug, if that's what this is.

BR,
Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!
Jul 18 '05 #1
2 1938
Jorgen Grahn wrote:
[snip]

frailea> cat foo
import locale
print locale.getlocale()
locale.setlocale(locale.LC_CTYPE)
print locale.getlocale()

When I paste it into an interactive Python session, the locale is already set up correctly (which is what I suppose interactive mode /should/ do):
import locale
print locale.getlocale() ['sv_SE', 'ISO8859-1'] locale.setlocale(locale.LC_CTYPE) 'sv_SE' print locale.getlocale() ['sv_SE', 'ISO8859-1']

When I run it as a script it isn't though, and the setlocale() call

does not appear to fall back to looking at $LANG as it's supposed to(?), so my
LC_CTYPE remains in the POSIX locale:

frailea> python foo
(None, None)
(None, None)

The corresponding program written in C works as expected:

frailea> cat foot.c
#include <stdio.h>
#include <locale.h>
int main(void) {
printf("%s\n", setlocale(LC_CTYPE, 0));
printf("%s\n", setlocale(LC_CTYPE, ""));
printf("%s\n", setlocale(LC_CTYPE, 0));
return 0;
}
frailea> ./foot
C
sv_SE
sv_SE

So, is this my fault or Python's? I realize I could just adapt and set $LC_CTYPE explicitly in my environment, but I don't want to capitulate for a Python bug, if that's what this is.


Try locale.setlocale(locale.LC_CTYPE,"") as in your C program. It would
be great if locale.setlocale with one parameter would be deprecated,
because it suddenly acts like getlocale. It's unpythonic.

By the way, since you took time to setup various LC_* variables there
is no need to play with LC_CTYPE category. Just use the standard idiom.
import locale
locale.setlocale(LC_ALL,"")

Serge.

Jul 18 '05 #2
On 11 Jan 2005 05:49:32 -0800, Se*********@gmail.com <Se*********@gmail.com> wrote:
Jorgen Grahn wrote:
[snip]

frailea> cat foo
import locale
print locale.getlocale()
locale.setlocale(locale.LC_CTYPE)
print locale.getlocale()
....
When I run it as a script it isn't though, and the setlocale() call does not
appear to fall back to looking at $LANG as it's supposed to(?), so my
LC_CTYPE remains in the POSIX locale: .... So, is this my fault or Python's? I realize I could just adapt and

set
$LC_CTYPE explicitly in my environment, but I don't want to

capitulate for a
Python bug, if that's what this is.


Try locale.setlocale(locale.LC_CTYPE,"") as in your C program.


Oops, you are right. locale.setlocale(locale.LC_CTYPE,"") sets the locale
from my environment (and gets it right!) while
locale.setlocale(locale.LC_CTYPE) /returns/ the current locale. I don't know
how I could have missed that, since it's clearly documented and also maps
directly to C usage.
It would
be great if locale.setlocale with one parameter would be deprecated,
because it suddenly acts like getlocale. It's unpythonic.
I dislike the term "unpythonic", but I tend to agree with you in practice
here. Even better, but maybe not feasible, would be an approach to locales
which doesn't involve changing a global state in this fashion.
By the way, since you took time to setup various LC_* variables there
is no need to play with LC_CTYPE category. Just use the standard idiom.
import locale
locale.setlocale(LC_ALL,"")


Thanks for pointing that out. I picked out LC_CTYPE for my small program
because I was in a hurry and didn't want to risk non-standard sorting
elsewhere in the program. I hate what the LC_COLLATE=C does to swedish
national characters, but I hate what LC_COLLATE=sv_SE does to non-alphabetic
characters even more.

To paraphrase Barbie: "i18n is hard". ;-)

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!
Jul 18 '05 #3

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

3
by: Ksenia Marasanova | last post by:
Hi, I have some problems with locale module. On my workstation, changing locale doesn't have effect: Python 2.3 (#1, Sep 13 2003, 00:49:11) on darwin Type "help", "copyright", "credits" or...
1
by: Cagdas Ozgenc | last post by:
Greetings. I am confused about C++ standard library locale stuff. It seems as if the implementations of locales are not part of the library, but only some guideline classes are there. What is...
0
by: | last post by:
Hi all, running the following code as an console application: #include <locale> #include <iostream> #include <sstream> #include <tchar.h> using namespace std;
3
by: robert | last post by:
Why can the default locale not be set by its true name? but only by '' ? : PythonWin 2.5 (r25:51908, Sep 19 2006, 09:52:17) on win32. (None, None) Traceback (most recent call last): File...
2
by: cris | last post by:
Hi, I got trouble when I try to change the locale used in a program foo.cc // ----------------- begin #include <iostream> #include <locale> #include <exception> using namespace std; int...
2
by: Daniele C. | last post by:
I will report a snippet from http://php.net/setlocale I spent about 2 seconds before thinking: can it really be? I mean, is there no way to set locale info per thread? And after a brief web...
24
by: Donn Ingle | last post by:
Hello, I hope someone can illuminate this situation for me. Here's the nutshell: 1. On start I call locale.setlocale(locale.LC_ALL,''), the getlocale. 2. If this returns "C" or anything...
0
by: Roman Bertle | last post by:
Hello, I try to format monetary values using the locale module, python2.5: Python 2.5.2a0 (r251:54863, Jan 3 2008, 17:59:56) on linux2 Type "help", "copyright", "credits" or "license" for...
3
by: Mario Ruggier | last post by:
Hi, i had the following problem when installing py3.0rc1 on a Mac OS X 10.5.5. On this system, the default locale values are: $ locale LANG= LC_COLLATE="C" LC_CTYPE="UTF-8" LC_MESSAGES="C"
0
marktang
by: marktang | last post by:
ONU (Optical Network Unit) is one of the key components for providing high-speed Internet services. Its primary function is to act as an endpoint device located at the user's premises. However,...
0
by: Hystou | last post by:
Most computers default to English, but sometimes we require a different language, especially when relocating. Forgot to request a specific language before your computer shipped? No problem! You can...
0
Oralloy
by: Oralloy | last post by:
Hello folks, I am unable to find appropriate documentation on the type promotion of bit-fields when using the generalised comparison operator "<=>". The problem is that using the GNU compilers,...
1
by: Hystou | last post by:
Overview: Windows 11 and 10 have less user interface control over operating system update behaviour than previous versions of Windows. In Windows 11 and 10, there is no way to turn off the Windows...
1
isladogs
by: isladogs | last post by:
The next Access Europe User Group meeting will be on Wednesday 1 May 2024 starting at 18:00 UK time (6PM UTC+1) and finishing by 19:30 (7.30PM). In this session, we are pleased to welcome a new...
0
by: conductexam | last post by:
I have .net C# application in which I am extracting data from word file and save it in database particularly. To store word all data as it is I am converting the whole word file firstly in HTML and...
0
by: TSSRALBI | last post by:
Hello I'm a network technician in training and I need your help. I am currently learning how to create and manage the different types of VPNs and I have a question about LAN-to-LAN VPNs. The...
0
by: adsilva | last post by:
A Windows Forms form does not have the event Unload, like VB6. What one acts like?
1
muto222
php
by: muto222 | last post by:
How can i add a mobile payment intergratation into php mysql website.

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.