locale.CODESET / different in python shell and scripts

When I type the following code in the interactive python shell,
I get 'UTF-8'; but if I put the code into a Python script and
run the script - in the same terminal on my Linux box in which
I opened the python shell before -, I get 'ANSI_X3.4-1968'.

How does that come?

Thanks in advance for your answers! Nuff.
The Code:

import locale
print locale.nl_langinfo(locale.CODESET)

Jul 18 '05 #1
Nuff Said wrote:
When I type the following code in the interactive python shell,
I get 'UTF-8'; but if I put the code into a Python script and
run the script - in the same terminal on my Linux box in which
I opened the python shell before -, I get 'ANSI_X3.4-1968'.

How does that come?


Because, for some reason, locale.setlocale() is called in your
interactive startup, but not in the normal startup.

It is uncertain why this happens - setlocale is not normally
called automatically; not even in interactive mode. Perhaps
you have created your own startup file?

Regards,
Martin

Jul 18 '05 #2
"Martin v. Löwis" <ma****@v.loewis.de> writes:
Nuff Said wrote:
When I type the following code in the interactive python shell,
I get 'UTF-8'; but if I put the code into a Python script and
run the script - in the same terminal on my Linux box in which
I opened the python shell before -, I get 'ANSI_X3.4-1968'.
How does that come?


Because, for some reason, locale.setlocale() is called in your
interactive startup, but not in the normal startup.

It is uncertain why this happens - setlocale is not normally
called automatically; not even in interactive mode. Perhaps
you have created your own startup file?


readline calls setlocale() iirc.

Cheers,
mwh

--
Not only does the English Language borrow words from other
languages, it sometimes chases them down dark alleys, hits
them over the head, and goes through their pockets. -- Eddy Peters
Jul 18 '05 #3
Michael Hudson wrote:
It is uncertain why this happens - setlocale is not normally
called automatically; not even in interactive mode. Perhaps
you have created your own startup file?

readline calls setlocale() iirc.


Sure. However, we restore the locale to what it was before
readline initialization messes with the locale.

Regards,
Martin

Jul 18 '05 #4
On Tue, 27 Apr 2004 22:29:59 +0200, Martin v. Löwis wrote:
Because, for some reason, locale.setlocale() is called in your
interactive startup, but not in the normal startup.

It is uncertain why this happens - setlocale is not normally
called automatically; not even in interactive mode. Perhaps
you have created your own startup file?


I use two Python versions on my Linux box (Fedora Core 1):
the Python 2.2 which came with Fedora and a Python 2.3 which
I compiled myself. (I didn't tinker with the last one;
Fedora's Python is a (well known) mess.)

Both Python versions give me 'ANSI_X3.4-1968' when I run a script
with 'print locale.nl_langinfo(locale.CODESET)'.
When I execute the same command in an interactive Python shell,
I get the (correct) 'UTF-8'.

(By 'correct', I mean that the bash command 'locale' gives me
'LANG=en_US.UTF-8, LC_CTYPE="en_US.UTF-8", ...'. This seems to
be correct, because e.g. the 'less ...' command shows files which
are UTF-8 encoded in the correct way; files which are e.g.
'ISO-8859-1' encoded are not shown in the correct way.)

Things are getting even worse:

I write a Python script which uses Unicode strings; now I want
to 'print ...' one of those strings (containing non-ASCII characters;
e.g. German umlauts).
With Fedora's Python 2.2, I have to use 'print s.encode('ISO-8859-1')'
or something similar.
With my self-compiled Python 2.3, I have to use (the expected)
'print s.encode('UTF-8')' (though it shows me 'ANSI_X3.4-1968' when
using 'print locale.nl_langinfo(locale.CODESET)' in the same file).

???

Any ideas what's going wrong here?

(I tried 'python -S ...'; doesn't make a difference.)

Jul 18 '05 #5
Nuff Said wrote:
Both Python versions give me 'ANSI_X3.4-1968' when I run a script
with 'print locale.nl_langinfo(locale.CODESET)'.
When I execute the same command in an interactive Python shell,
I get the (correct) 'UTF-8'.


PLEASE invoke

locale.setlocale(locale.LC_ALL, "")

before invoking nl_langinfo. Different C libraries behave differently
in their nl_langinfo responses if setlocale hasn't been called.
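
A minimal sketch of that advice, written in Python 3 syntax rather than the Python 2 `print` statement used elsewhere in this thread. The helper name `codeset_before_and_after` is made up for illustration, and `locale.nl_langinfo` is assumed to be available (it exists on Unix-like systems only). Note that recent Python 3 interpreters already adopt the user's LC_CTYPE at startup, so the two values may well be identical there; on the Python 2.2/2.3 interpreters discussed here they differed.

```python
import locale


def codeset_before_and_after():
    """Return the CODESET reported before and after adopting the user's locale."""
    before = locale.nl_langinfo(locale.CODESET)
    try:
        # "" means: take the locale from the environment (LANG, LC_*).
        locale.setlocale(locale.LC_ALL, "")
    except locale.Error:
        # The environment names a locale that is not installed; keep the current one.
        pass
    after = locale.nl_langinfo(locale.CODESET)
    return before, after


before, after = codeset_before_and_after()
print("before setlocale:", before)
print("after setlocale: ", after)
```

The call changes process-wide state, so it is usually made once, near the start of the program.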

Regards,
Martin

Jul 18 '05 #6
On Thu, 29 Apr 2004 22:14:23 +0200, Martin v. Löwis wrote:
PLEASE invoke

locale.setlocale(locale.LC_ALL, "")

before invoking nl_langinfo. Different C libraries behave differently
in their nl_langinfo responses if setlocale hasn't been called.


Thanks a lot for your help!

That solved (part of) the problem; now I get 'UTF-8' (which is correct)
when running the following script (with either my self-compiled Python
2.3 or Fedora's Python 2.2):

#!/usr/bin/env python
# -*- coding: UTF-8 -*-

import locale

locale.setlocale(locale.LC_ALL, "")
encoding = locale.nl_langinfo(locale.CODESET)
print encoding

Still, one problem remains:

When I add the following line to the above script

print u"schönes Mädchen".encode(encoding)

the result is:

schönes Mädchen (with my self-compiled Python 2.3)
schönes Mädchen (with Fedora's Python 2.2)

I observed that my Python gives me (the correct value) 15 for
len(u"schönes Mädchen"), whereas Fedora's Python says 17 (one more
for each German umlaut, i.e. the length of the UTF-8 representation of
the string; note that the file uses the coding cookie for UTF-8).
Maybe Fedora's Python was compiled without Unicode support?

(Is that even possible? I recall something about a UCS2 vs.
UCS4 switch when compiling Python; but without Unicode support?
And if it were possible, shouldn't a Python without Unicode
support disallow strings of the form u"..." or show a warning?)

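
The two numbers can be reproduced with a short check. This is only an illustration in Python 3 syntax (where every `str` is a Unicode string), not the original Python 2 script; the umlauts are written as escapes so the snippet does not depend on the encoding of the file it is saved in.

```python
# "schönes Mädchen", with the two umlauts written as Unicode escapes.
text = "sch\u00f6nes M\u00e4dchen"

# Encoding to UTF-8 turns each umlaut into two bytes.
utf8_bytes = text.encode("utf-8")

print(len(text))        # 15 characters
print(len(utf8_bytes))  # 17 bytes
```

So 15 is the number of characters and 17 is the number of bytes in the UTF-8 form, which suggests the second interpreter was counting bytes it had not decoded as UTF-8.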
This really drives me nuts because I thought the above approach
should be the correct way to assure that Python scripts can print
non-ASCII characters on any terminal (which is able to display
those characters in some encoding as UTF-8, ISO-8859-x, ...).

Is there something I do utterly wrong here?
Python can't be that complicated?

Nuff.

Jul 18 '05 #7
Nuff Said wrote:
When I add the following line to the above script

print u"schönes Mädchen".encode(encoding)

the result is:

schönes Mädchen (with my self-compiled Python 2.3)
schönes Mädchen (with Fedora's Python 2.2)

I observed that my Python gives me (the correct value) 15 for
len(u"schönes Mädchen"), whereas Fedora's Python says 17 (one more
for each German umlaut, i.e. the length of the UTF-8 representation of
the string; note that the file uses the coding cookie for UTF-8).
Maybe Fedora's Python was compiled without Unicode support?

Certainly not: It would not support u"" literals without Unicode.

Please understand that you cannot use non-ASCII characters in source
code unless you also use the facilities described in

http://www.python.org/peps/pep-0263.html

So instead of "ö", you should write "\xf6".

Is there something I do utterly wrong here?

Yes, you are.

Python can't be that complicated?


Python is not. Encodings are.

Regards,
Martin

Jul 18 '05 #8
On Fri, 30 Apr 2004 04:30:34 +0200, Martin v. Löwis wrote:
Nuff Said wrote:
When I add the following line to the above script

print u"schönes Mädchen".encode(encoding)

the result is:

schönes Mädchen (with my self-compiled Python 2.3)
schönes Mädchen (with Fedora's Python 2.2)

I observed that my Python gives me (the correct value) 15 for
len(u"schönes Mädchen"), whereas Fedora's Python says 17 (one more
for each German umlaut, i.e. the length of the UTF-8 representation of
the string; note that the file uses the coding cookie for UTF-8).
Maybe Fedora's Python was compiled without Unicode support?

Certainly not: It would not support u"" literals without Unicode.


That's what I thought.

Please understand that you cannot use non-ASCII characters in source
code unless you also use the facilities described in

http://www.python.org/peps/pep-0263.html

So instead of "ö", you should write "\xf6".


But *I do use* the line

# -*- coding: UTF-8 -*-

from your PEP (directly after the shebang line; see the full source
code in my earlier posting). I thought that allows me to write u"ö"
(which - as described above - works in one of my two Pythons).

??? Nuff.
Jul 18 '05 #9
On Fri, 30 Apr 2004 11:56:19 +0200, Nuff Said wrote:
But *I do use* the line

# -*- coding: UTF-8 -*-

from your PEP (directly after the shebang line; see the full source
code in my earlier posting). I thought that allows me to write u"ö"
(which - as described above - works in one of my two Pythons).


Follow up to myself:

Arrgh!!! Think I got it now. Your PEP 263: 'Source Code Encodings' was
incorporated into Python 2.3 (i.e. my self-compiled Python) but not
into Python 2.2 (Fedora's Python).
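
For anyone who wants to see the effect, here is a rough simulation in Python 3 syntax. It rests on an assumption the thread does not spell out: that an interpreter which ignores the coding cookie reads the non-ASCII bytes of a u"..." literal as Latin-1. Under that assumption, the UTF-8 bytes saved by the editor turn into 17 characters instead of 15.

```python
# What was intended: "schönes Mädchen" (15 characters).
text = "sch\u00f6nes M\u00e4dchen"

# What a UTF-8 editor actually writes to disk for that literal.
source_bytes = text.encode("utf-8")

# Assumption: the interpreter ignores the cookie and reads the bytes as Latin-1,
# so each two-byte umlaut becomes two separate characters.
misread = source_bytes.decode("latin-1")
print(len(misread))  # 17

# The damage is reversible because Latin-1 maps every byte to one character.
repaired = misread.encode("latin-1").decode("utf-8")
print(repaired == text)  # True
```

Encoding the misread string back to UTF-8 for printing would then double-encode the umlauts, which would explain garbled terminal output from the older interpreter.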

Thanks for your help!

Jul 18 '05 #10
