How to print a unicode string?

I'd like to print out a unicode string.

I'm running Python inside Emacs, which understands utf-8, so I want to
force Python to send utf-8 to sys.stdout.

From what I've googled, I think I need to set my locale. I don't
understand how.

import locale
print locale.getlocale()
--(None,None)
print locale.getdefaultlocale()
--('en_GB','cp1252')
print locale.normalize('en_GB.utf-8')
--en_GB.UTF8
locale.setlocale(locale.LC_ALL,'en_GB.UTF8')
-- locale.Error: unsupported locale setting

I'd be grateful for advice.
Damon.
Jun 27 '08 #1
da**********@gmail.com writes:
From what I've googled, I think I need to set my locale. I don't
understand how.

import locale
print locale.getlocale()
--(None,None)
print locale.getdefaultlocale()
--('en_GB','cp1252')
print locale.normalize('en_GB.utf-8')
--en_GB.UTF8
locale.setlocale(locale.LC_ALL,'en_GB.UTF8')
-- locale.Error: unsupported locale setting

I'd be grateful for advice.
Just because the locale library knows the normalised name for it
doesn't mean it's available on your OS. Have you confirmed that your
OS (independent of Python) supports the locale you're trying to set?

--
\ "Oh, I love your magazine. My favorite section is 'How To |
`\ Increase Your Word Power'. That thing is really, really, |
_o__) really... good." -- Homer, _The Simpsons_ |
Ben Finney
Jun 27 '08 #2
On Apr 19, 12:51 am, Ben Finney wrote:
Just because the locale library knows the normalised name for it
doesn't mean it's available on your OS. Have you confirmed that your
OS (independent of Python) supports the locale you're trying to set?
No. How do I find out which locales my OS supports? (I'm running
Windows XP.) Why does it matter what locales my OS supports, when all
I want is to set the encoding to be used for the output, and the output
is all going to Emacs, and I know that Emacs supports utf8?
(Emacs 22.2.1 i386-mingw-nt5.1.2600.)

Damon.
Jun 27 '08 #3
From what I've googled, I think I need to set my locale.

Not on this operating system. On Windows, you need to change
your console. If it is a cmd.exe-style console, use chcp.
For IDLE, changing the output encoding is not supported.

If you want to output into a file, use codecs.open.

If you absolutely want to output UTF-8 to the terminal even
though the terminal will not be able to render it correctly,
use

sys.stdout = codecs.getwriter("UTF-8")(sys.stdout)

HTH,
Martin
Jun 27 '08 #4
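A minimal sketch of the two suggestions above (codecs.open for files, codecs.getwriter for a stream), written so that it runs on Python 3 as well as the Python 2 used in this thread. An in-memory io.BytesIO stands in for the real output stream, and `utf8_writer` and `demo_utf8.txt` are hypothetical names chosen for illustration. On Python 3, sys.stdout is already a text stream, so the wrapper would go around sys.stdout.buffer instead, and the built-in open(path, "w", encoding="utf-8") covers the file case.

```python
import codecs
import io
import os
import tempfile

def utf8_writer(byte_stream):
    """Wrap a byte-oriented stream so text written to it is encoded as UTF-8."""
    return codecs.getwriter("utf-8")(byte_stream)

# Terminal case: text goes in, UTF-8 bytes come out on the wrapped stream.
buf = io.BytesIO()          # stand-in for the real output stream
out = utf8_writer(buf)
out.write(u"hell\u00f6 w\u00f6rld\n")
encoded = buf.getvalue()

# File case: codecs.open returns a file object that encodes on write.
path = os.path.join(tempfile.mkdtemp(), "demo_utf8.txt")
with codecs.open(path, "w", encoding="utf-8") as f:
    f.write(u"hell\u00f6 w\u00f6rld\n")
with open(path, "rb") as f:
    on_disk = f.read()
```

Both paths produce the same bytes; whether they look right on screen depends on what the receiving terminal or editor assumes about the encoding.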
On Apr 18, 5:38 pm, damonwisc...@gmail.com wrote:
I'd like to print out a unicode string.

I'm running Python inside Emacs, which understands utf-8, so I want to
force Python to send utf-8 to sys.stdout.

From what I've googled, I think I need to set my locale. I don't
understand how.

import locale
print locale.getlocale()
--(None,None)
print locale.getdefaultlocale()
--('en_GB','cp1252')
print locale.normalize('en_GB.utf-8')
--en_GB.UTF8
locale.setlocale(locale.LC_ALL,'en_GB.UTF8')
-- locale.Error: unsupported locale setting

I'd be grateful for advice.
Damon.
u_str = u'hell\u00F6 w\u00F6rld' #o's with umlauts

print u_str.encode('utf-8')

--output:--
hellö wörld
Jun 27 '08 #5
On Apr 18, 6:36 pm, 7stud <bbxx789_0...@yahoo.com> wrote:
On Apr 18, 5:38 pm, damonwisc...@gmail.com wrote:
I'd like to print out a unicode string.
I'm running Python inside Emacs, which understands utf-8, so I want to
force Python to send utf-8 to sys.stdout.
From what I've googled, I think I need to set my locale. I don't
understand how.
import locale
print locale.getlocale()
--(None,None)
print locale.getdefaultlocale()
--('en_GB','cp1252')
print locale.normalize('en_GB.utf-8')
--en_GB.UTF8
locale.setlocale(locale.LC_ALL,'en_GB.UTF8')
-- locale.Error: unsupported locale setting
I'd be grateful for advice.
Damon.

u_str = u'hell\u00F6 w\u00F6rld' #o's with umlauts

print u_str.encode('utf-8')

--output:--
hellö wörld
Or maybe you want this:

u_str = u'hell\u00F6 w\u00F6rld'
regular_str = u_str.encode('utf-8')
print repr(regular_str)

--output:--
'hell\xc3\xb6 w\xc3\xb6rld'
Jun 27 '08 #6
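To make the difference between the two outputs above concrete, here is a small sketch (it should run on Python 2.6+ and Python 3). The text string has 11 characters, while its UTF-8 encoding has 13 bytes, because each o-umlaut becomes the two bytes \xc3\xb6. What a terminal shows for those bytes depends on the encoding it assumes.

```python
u_str = u"hell\u00f6 w\u00f6rld"       # text: 11 characters
utf8_bytes = u_str.encode("utf-8")      # bytes: 13, each o-umlaut takes two

char_count = len(u_str)
byte_count = len(utf8_bytes)

# Decoding with the same codec recovers the original text.
round_trip = utf8_bytes.decode("utf-8")
```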
On Apr 19, 1:36 am, 7stud <bbxx789_0...@yahoo.com> wrote:
u_str = u'hell\u00F6 w\u00F6rld' #o's with umlauts
print u_str.encode('utf-8')

--output:--
hellö wörld
Maybe on your system. On my system, those same commands produce
hell\303\266 w\303\266rld

Those \303\266 symbols are single characters -- when I move around
with cursor keys, the cursor jumps across them with a single
key-press.

As I wrote, I'm running Python inside Emacs 22.2.1 (using
python-mode).

Damon.
Jun 27 '08 #7
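The \303\266 shown by Emacs is consistent with the UTF-8 bytes arriving intact but being displayed as raw bytes in octal notation rather than decoded. That reading is an inference from the output, not something confirmed in the thread. A quick check that octal 303 266 is the UTF-8 encoding of o-umlaut:

```python
raw = b"\303\266"                  # the two bytes, written with octal escapes
same_in_hex = b"\xc3\xb6"          # the same two bytes, written in hex
decoded = raw.decode("utf-8")      # u"\u00f6", o with umlaut
```

If that is what is happening, Python is already sending the right bytes and the remaining work is on the receiving side: the buffer has to be told to decode its input as UTF-8.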
da**********@gmail.com writes:
On Apr 19, 12:51 am, Ben Finney wrote:
Just because the locale library knows the normalised name for it
doesn't mean it's available on your OS. Have you confirmed that
your OS (independent of Python) supports the locale you're trying
to set?

No. How do I find out which locales my OS supports? (I'm running
Windows XP.)
Can't help you there.
Why does it matter what locales my OS supports, when all I want is
to set the encoding to be used for the output
Because the Python 'locale' module is all about using the OS's
(actually, the underlying C library's) locale support.

The locale you request with 'locale.setlocale' needs to be supported
by the locale database, which is independent of any specific
application, be it Python, Emacs, or otherwise.

--
\ "Two rules to success in life: 1. Don't tell people everything |
`\ you know." -- Sassan Tat |
_o__) |
Ben Finney
Jun 27 '08 #8
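One way to see which names the underlying C library accepts is to probe them from Python. The sketch below uses a hypothetical helper, `locale_available` (not part of the locale module): it tries a name, reports whether setlocale accepted it, and restores the previous setting. It is written for Python 3, and results differ by platform; Windows has traditionally used names such as English_United Kingdom.1252 rather than POSIX-style names like en_GB.UTF8.

```python
import locale

def locale_available(name):
    """Return True if the C library accepts `name` for LC_ALL."""
    previous = locale.setlocale(locale.LC_ALL)   # query only, no change
    try:
        locale.setlocale(locale.LC_ALL, name)
        return True
    except locale.Error:
        return False
    finally:
        locale.setlocale(locale.LC_ALL, previous)  # put things back

for candidate in ("C", "en_GB.UTF8", "en_GB.utf8"):
    print(candidate, locale_available(candidate))
```

A False here corresponds to the "unsupported locale setting" error in the original post: the name is well formed, but the operating system has no such locale installed.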
On Apr 19, 1:14 am, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
From what I've googled, I think I need to set my locale.

Not on this operating system. On Windows, you need to change
your console. If it is a cmd.exe-style console, use chcp.
For IDLE, changing the output encoding is not supported.

If you want to output into a file, use codecs.open.

If you absolutely want to output UTF-8 to the terminal even
though the terminal will not be able to render it correctly,
use

sys.stdout = codecs.getwriter("UTF-8")(sys.stdout)
Thank you for the suggestion. As I said, I am running Python through
Emacs 22.2.1, so I doubt it is a cmd.exe-style console, and it most
certainly is not IDLE. I want to output to the Emacs buffer, via the
python-mode plugin for Emacs, not to a file.

I tried your suggestion of setting sys.stdout, and it works perfectly.
As I said, the output is going to Emacs, and Emacs _does_ know how to
render UTF-8.

How can I make this a global setting? Is it possible to change an
environment variable, so that Python uses this coding automatically?
Or pass a command-line argument when Emacs python-mode invokes the
Python interpreter? Or execute this line of Python in a startup script
which is invoked whenever a new Python session is started?

Thank you again for your help,
Damon.
Jun 27 '08 #9
On Apr 19, 1:53 am, Ben Finney wrote:
Damon Wischik writes:
Why does it matter what locales my OS supports, when all I want is
to set the encoding to be used for the output

Because the Python 'locale' module is all about using the OS's
(actually, the underlying C library's) locale support.

The locale you request with 'locale.setlocale' needs to be supported
by the locale database, which is independent of any specific
application, be it Python, Emacs, or otherwise.
Let me try to ask a better question. It seems that the logical choice
of locale (en_GB.utf8) is not supported by my operating system.
Nonetheless, I want Python to output in utf-8, because I know for
certain that the terminal I am using (Emacs 22.2.1 with python-mode)
will display utf-8 correctly. It therefore seems that I cannot use the
locale mechanism to indicate to Python the encoding I want for
sys.stdout. What other mechanisms are there for me to indicate what I
want to Python?

Another poster pointed me to
sys.stdout = codecs.getwriter("UTF-8")(sys.stdout)
and this works great. All I want now is some reassurance that this is
the most appropriate way for me to achieve what I want (e.g. least
likely to break with future versions of Python, most in keeping with
the design of Python, easiest for me to maintain, etc.).

Damon.
Jun 27 '08 #10
On Apr 19, 12:38 am, Damon Wischik wrote:
I'd like to print out a unicode string.

I'm running Python inside Emacs, which understands utf-8, so I want to
force Python to send utf-8 to sys.stdout.
Thank you to everyone who sent suggestions. Here is my solution (for
making Python output utf-8, and persuading Emacs 22.2.1 with
python-mode to print it).

1. Set the registry key HKEY_CURRENT_USER\Software\GNU\Emacs\Home to
have value "d:\documents\home". This makes Emacs look for a .emacs
file in this directory (the home directory).

2. Put a file called .emacs in the home directory. It should
include these lines:
(setenv "PYTHONPATH" "d:/documents/home")
(prefer-coding-system 'utf-8)
The first line means that python will look in my home directory for
libraries etc. The second line tells Emacs to default to utf-8 for its
buffers. Without the second line, Emacs may default to a different
coding, and it will not know what to do when it receives utf-8.

3. Put a file called sitecustomize.py in the home directory. This file
should contain these lines:
import codecs
import sys
sys.stdout = codecs.getwriter("UTF-8")(sys.stdout)

4. Now it should all work. If I enter
print u'La Pe\xf1a'
then it comes out with an n-tilde.

NB. An alternative solution is to edit site.py in the Python install
directory, and replace the line
encoding = "ascii" # Default value set by _PyUnicode_Init()
with
encoding = 'utf8'
But the trouble with this is that it will be overwritten if I install
a new version of Python.
NB. I also have these lines in my .emacs file, to load python-mode,
and to make it so that ctrl+enter executes the current paragraph:
; Python file association
(load "c:/program files/emacs-plugins/python-mode-1.0/python-mode.el")
(setq auto-mode-alist
      (cons '("\\.py$" . python-mode) auto-mode-alist))
(setq interpreter-mode-alist
      (cons '("python" . python-mode)
            interpreter-mode-alist))
(autoload 'python-mode "python-mode" "Python editing mode." t)
; Note: the command for invoking Python is specified at the end,
; as a custom variable.
;; DJW's command to select the current paragraph, then execute-region.
(defun py-execute-paragraph (vis)
  "Send the current paragraph to Python
Don't know what vis does."
  (interactive "P")
  (save-excursion
    (forward-paragraph)
    (let ((end (point)))
      (backward-paragraph)
      (py-execute-region (point) end ))))
(setq py-shell-switch-buffers-on-execute nil)
(global-set-key [(ctrl return)] 'py-execute-paragraph)

(custom-set-variables
 ;; custom-set-variables was added by Custom -- don't edit or cut/paste it!
 ;; Your init file should contain only one such instance.
 '(py-python-command "c:/program files/Python25/python.exe"))
Damon.
Jun 27 '08 #11
On Apr 18, 7:14 pm, "Martin v. Löwis" <mar...@v.loewis.de> wrote:
From what I've googled, I think I need to set my locale.

Not on this operating system. On Windows, you need to change
your console. If it is a cmd.exe-style console, use chcp.
For IDLE, changing the output encoding is not supported.

If you want to output into a file, use codecs.open.

If you absolutely want to output UTF-8 to the terminal even
though the terminal will not be able to render it correctly,
use

sys.stdout = codecs.getwriter("UTF-8")(sys.stdout)
And in Py3k?
Jun 27 '08 #12
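This question went unanswered in the thread. As a hedged sketch of the Python 3 situation: sys.stdout is a text stream there, so codecs.getwriter(...)(sys.stdout) no longer fits. The usual options are sys.stdout.reconfigure(encoding="utf-8") (Python 3.7 and later) or wrapping the underlying byte stream in io.TextIOWrapper. The helper name `utf8_text_stream` is hypothetical, and an io.BytesIO stands in for sys.stdout.buffer.

```python
import io

def utf8_text_stream(byte_stream):
    """Return a text stream that encodes to UTF-8 on top of a byte stream."""
    return io.TextIOWrapper(byte_stream, encoding="utf-8")

buf = io.BytesIO()                 # stand-in for sys.stdout.buffer
stream = utf8_text_stream(buf)
stream.write("La Pe\xf1a")
stream.flush()                     # push buffered text down to the byte stream
written = buf.getvalue()

# In a real session one would write, for example:
#   sys.stdout = utf8_text_stream(sys.stdout.buffer)
```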
Is it possible to change an
environment variable, so that Python uses this coding automatically?
No.
Or pass a command-line argument when Emacs python-mode invokes the
Python interpreter?
No.
Or execute this line of Python in a startup script
which is invoked whenever a new Python session is started?
Yes, you can add the code I suggested to sitecustomize.py.

Regards,
Martin
Jun 27 '08 #13
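These answers describe Python as it stood at the time of the thread (Python 2.5). Python 2.6 and later, including Python 3, added the PYTHONIOENCODING environment variable, which sets the encoding of the standard streams before the interpreter starts. A sketch, assuming such a newer interpreter; `run_with_utf8_stdio` is a hypothetical helper that launches a child interpreter so the effect can be observed as raw bytes:

```python
import os
import subprocess
import sys

def run_with_utf8_stdio(code):
    """Run `code` in a child interpreter whose stdio is forced to UTF-8."""
    env = dict(os.environ, PYTHONIOENCODING="utf-8")
    return subprocess.check_output([sys.executable, "-c", code], env=env)

# The child prints an n-tilde; the parent receives it as UTF-8 bytes.
raw = run_with_utf8_stdio("print(u'La Pe\\xf1a')")
```

Since the .emacs file earlier in the thread already uses setenv for PYTHONPATH, the same form could presumably set this variable for python-mode, though that combination is untested here.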
On 2008-04-19 03:09, da**********@gmail.com wrote:
Another poster pointed me to
sys.stdout = codecs.getwriter("UTF-8")(sys.stdout)
and this works great. All I want now is some reassurance that this is
the most appropriate way for me to achieve what I want (e.g. least
likely to break with future versions of Python, most in keeping with
the design of Python, easiest for me to maintain, etc.).
While the above works nicely for Unicode objects you write
to sys.stdout, you are going to have problems with non-ASCII
8-bit strings, e.g. binary data.

Python will have to convert these to Unicode before applying
the UTF-8 codec and uses the default encoding for this, which
is ASCII.

You could wrap sys.stdout using a codecs.EncodedFile() which provides
transparent recoding, but then you have problems with Unicode objects,
since the recoder assumes that it has to work with strings on input
(to e.g. the .write() method).

There's no ideal solution - it really depends a lot on what
your application does and how it uses strings and Unicode.

--
Marc-Andre Lemburg
eGenix.com

Jun 27 '08 #14
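The failure mode described above can be reproduced without touching sys.stdout. In Python 2, writing an 8-bit string to the UTF-8 writer triggers an implicit decode with the default ASCII codec; the sketch below performs that decode explicitly, so it behaves the same on Python 2 and 3. `implicit_ascii_decode` is a hypothetical name used for illustration.

```python
def implicit_ascii_decode(byte_string):
    """Mimic the ASCII decode Python 2 applies before re-encoding as UTF-8."""
    return byte_string.decode("ascii")

ok = implicit_ascii_decode(b"plain ascii")      # pure ASCII passes through

try:
    implicit_ascii_decode(b"La Pe\xf1a")        # cp1252 bytes, not ASCII
    failed = False
except UnicodeDecodeError:
    failed = True                               # what Python 2 would raise
```

In practice this means the wrapped sys.stdout is safe as long as the program prints only Unicode objects or pure-ASCII byte strings.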
