Locale confusion

[Long posting due to the examples, but pretty simple question.]

I'm sitting here with a Debian Linux 'Woody' system with the default Python
2.2 installation, and I want the re module to understand that
re.compile(r'\W+', re.LOCALE) doesn't match my national, accented
characters.

I don't quite understand how the locale module reasons about these things,
and Python doesn't seem to act as other programs on my system. Bug or my
mistake? Here's my environment:

frailea> env |grep -e LC -e LANG
LC_MESSAGES=C
LC_TIME=C
LANG=sv_SE
LC_NUMERIC=C
LC_MONETARY=C
frailea> locale
LANG=sv_SE
LC_CTYPE="sv_SE"
LC_NUMERIC=C
LC_TIME=C
LC_COLLATE="sv_SE"
LC_MONETARY=C
LC_MESSAGES=C
LC_PAPER="sv_SE"
LC_NAME="sv_SE"
LC_ADDRESS="sv_SE"
LC_TELEPHONE="sv_SE"
LC_MEASUREMENT="sv_SE"
LC_IDENTIFICATION="sv_SE"
LC_ALL=

This seems to indicate that $LANG acts as a fallback when other variables (e.g.
LC_CTYPE) aren't defined, and that's also what the glibc setlocale(3) man page
says. Works well for me in general, too. However, consider this tiny Python
program:

frailea> cat foo
import locale
print locale.getlocale()
locale.setlocale(locale.LC_CTYPE)
print locale.getlocale()

When I paste it into an interactive Python session, the locale is already
set up correctly (which is what I suppose interactive mode /should/ do):
>>> import locale
>>> print locale.getlocale()
['sv_SE', 'ISO8859-1']
>>> locale.setlocale(locale.LC_CTYPE)
'sv_SE'
>>> print locale.getlocale()
['sv_SE', 'ISO8859-1']


When I run it as a script it isn't though, and the setlocale() call does not
appear to fall back to looking at $LANG as it's supposed to(?), so my
LC_CTYPE remains in the POSIX locale:

frailea> python foo
(None, None)
(None, None)

The corresponding program written in C works as expected:

frailea> cat foot.c
#include <stdio.h>
#include <locale.h>
int main(void) {
printf("%s\n", setlocale(LC_CTYPE, 0));
printf("%s\n", setlocale(LC_CTYPE, ""));
printf("%s\n", setlocale(LC_CTYPE, 0));
return 0;
}
frailea> ./foot
C
sv_SE
sv_SE
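
The lookup that setlocale(LC_CTYPE, "") performs in the C program above follows the POSIX precedence: LC_ALL first, then the category's own variable, then LANG, and finally the "C" locale. Below is a minimal Python 3 sketch of that order (the thread itself uses Python 2.2 syntax). The helper name resolve_category is made up for illustration; it only inspects a dictionary of variables and never calls setlocale itself.

```python
import os


def resolve_category(category, env=None):
    """Return the locale name the POSIX lookup order picks for one category.

    `category` is a variable name such as "LC_CTYPE". Empty values count
    as unset. This is a hypothetical helper, not part of the locale module.
    """
    if env is None:
        env = os.environ
    for var in ("LC_ALL", category, "LANG"):
        value = env.get(var)
        if value:  # skips both missing and empty variables
            return value
    return "C"  # nothing set: fall back to the portable locale


# The environment from the question: LC_CTYPE is unset, so LANG wins.
env = {"LANG": "sv_SE", "LC_NUMERIC": "C", "LC_ALL": ""}
print(resolve_category("LC_CTYPE", env))    # sv_SE
print(resolve_category("LC_NUMERIC", env))  # C
```

This matches the `locale` output shown earlier, where the categories without their own variable are reported as "sv_SE" and the explicitly set ones as C.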

So, is this my fault or Python's? I realize I could just adapt and set
$LC_CTYPE explicitly in my environment, but I don't want to capitulate to a
Python bug, if that's what this is.

BR,
Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!
Jul 18 '05 #1
Jorgen Grahn wrote:
[snip]

> frailea> cat foo
> import locale
> print locale.getlocale()
> locale.setlocale(locale.LC_CTYPE)
> print locale.getlocale()
>
> When I paste it into an interactive Python session, the locale is already
> set up correctly (which is what I suppose interactive mode /should/ do):
> >>> import locale
> >>> print locale.getlocale()
> ['sv_SE', 'ISO8859-1']
> >>> locale.setlocale(locale.LC_CTYPE)
> 'sv_SE'
> >>> print locale.getlocale()
> ['sv_SE', 'ISO8859-1']
>
> When I run it as a script it isn't though, and the setlocale() call does not
> appear to fall back to looking at $LANG as it's supposed to(?), so my
> LC_CTYPE remains in the POSIX locale:
>
> frailea> python foo
> (None, None)
> (None, None)
>
> The corresponding program written in C works as expected:
>
> frailea> cat foot.c
> #include <stdio.h>
> #include <locale.h>
> int main(void) {
> printf("%s\n", setlocale(LC_CTYPE, 0));
> printf("%s\n", setlocale(LC_CTYPE, ""));
> printf("%s\n", setlocale(LC_CTYPE, 0));
> return 0;
> }
> frailea> ./foot
> C
> sv_SE
> sv_SE
>
> So, is this my fault or Python's? I realize I could just adapt and set
> $LC_CTYPE explicitly in my environment, but I don't want to capitulate to a
> Python bug, if that's what this is.


Try locale.setlocale(locale.LC_CTYPE, "") as in your C program. It would
be great if locale.setlocale with one parameter were deprecated,
because it suddenly acts like getlocale. It's unpythonic.

By the way, since you took the time to set up various LC_* variables, there
is no need to play with the LC_CTYPE category. Just use the standard idiom:
import locale
locale.setlocale(locale.LC_ALL, "")
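
To tie the idiom back to the regex in the original question, here is a rough sketch written for Python 3 rather than the Python 2.2 of the thread. In Python 3, re.LOCALE is only accepted with bytes patterns, so the text is handled as bytes in the locale's encoding. The function name split_words is invented for this example, and what counts as a word character for bytes above 127 depends on which locales are installed on the system.

```python
import locale
import re

# Adopt the locale named by the environment (LC_ALL, LC_*, LANG).
try:
    locale.setlocale(locale.LC_ALL, "")
except locale.Error:
    pass  # the environment names a locale that is not installed here


def split_words(data):
    """Split bytes on runs of non-word characters.

    With re.LOCALE, \\W is classified by the current LC_CTYPE, so in a
    single-byte locale such as sv_SE.ISO8859-1 accented letters are
    treated as word characters. Hypothetical helper for illustration.
    """
    pattern = re.compile(rb"\W+", re.LOCALE)
    return [word for word in pattern.split(data) if word]


print(split_words(b"hello, world!"))
```

Without the setlocale() call the pattern still compiles, but the classification stays whatever the process started with.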

Serge.

Jul 18 '05 #2
On 11 Jan 2005 05:49:32 -0800, Se*********@gmail.com <Se*********@gmail.com> wrote:
> Jorgen Grahn wrote:
> [snip]
>
>> frailea> cat foo
>> import locale
>> print locale.getlocale()
>> locale.setlocale(locale.LC_CTYPE)
>> print locale.getlocale()
>> ....
>> When I run it as a script it isn't though, and the setlocale() call does not
>> appear to fall back to looking at $LANG as it's supposed to(?), so my
>> LC_CTYPE remains in the POSIX locale:
>> ....
>> So, is this my fault or Python's? I realize I could just adapt and set
>> $LC_CTYPE explicitly in my environment, but I don't want to capitulate to a
>> Python bug, if that's what this is.
>
> Try locale.setlocale(locale.LC_CTYPE, "") as in your C program.


Oops, you are right. locale.setlocale(locale.LC_CTYPE, "") sets the locale
from my environment (and gets it right!) while
locale.setlocale(locale.LC_CTYPE) /returns/ the current locale. I don't know
how I could have missed that, since it's clearly documented and also maps
directly to C usage.
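
A small sketch of that difference, written for Python 3 (the thread's own snippets are Python 2.2). The two function names are made up for illustration; the only library calls are locale.setlocale with one and with two arguments.

```python
import locale


def current_ctype():
    """Query only: with no second argument, nothing is changed."""
    return locale.setlocale(locale.LC_CTYPE)


def adopt_environment_ctype():
    """Set LC_CTYPE from the environment and return the new locale name.

    Returns None if the environment names a locale that is not installed.
    """
    try:
        return locale.setlocale(locale.LC_CTYPE, "")
    except locale.Error:
        return None


before = current_ctype()
after = adopt_environment_ctype()
print("before:", before)
print("after: ", after)
```

The printed names depend on the environment and on how the interpreter was started, so no particular output is claimed here.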
> It would
> be great if locale.setlocale with one parameter would be deprecated,
> because it suddenly acts like getlocale. It's unpythonic.

I dislike the term "unpythonic", but I tend to agree with you in practice
here. Even better, but maybe not feasible, would be an approach to locales
which doesn't involve changing a global state in this fashion.
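
The global state cannot be avoided with setlocale(), but its scope can at least be limited. One possible sketch, in Python 3, is a context manager that remembers the current setting and restores it afterwards. The name temporary_locale is hypothetical and not part of the locale module, and the change is still process-wide while the block runs, so this is not safe with threads.

```python
import contextlib
import locale


@contextlib.contextmanager
def temporary_locale(category, name):
    """Switch one locale category for the duration of a with-block."""
    saved = locale.setlocale(category)          # query the current setting
    try:
        yield locale.setlocale(category, name)  # switch, hand back the name
    finally:
        locale.setlocale(category, saved)       # always restore


with temporary_locale(locale.LC_COLLATE, "C") as active:
    print(active)  # C
```

Passing "" as the name would adopt the environment's locale for the block instead of a fixed one.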
> By the way, since you took time to set up various LC_* variables there
> is no need to play with LC_CTYPE category. Just use the standard idiom.
> import locale
> locale.setlocale(locale.LC_ALL, "")


Thanks for pointing that out. I picked out LC_CTYPE for my small program
because I was in a hurry and didn't want to risk non-standard sorting
elsewhere in the program. I hate what LC_COLLATE=C does to Swedish
national characters, but I hate what LC_COLLATE=sv_SE does to non-alphabetic
characters even more.

To paraphrase Barbie: "i18n is hard". ;-)

/Jorgen

--
// Jorgen Grahn <jgrahn@ Ph'nglui mglw'nafh Cthulhu
\X/ algonet.se> R'lyeh wgah'nagl fhtagn!
Jul 18 '05 #3
