unicode newbie - printing mixed languages to the terminal

Hi list.

I've never used unicode in a Python script before, but I need to now.
I'm not sure where to start. I'm hoping that a kind soul can help me
out here.

My current (almost non-existent) knowledge of unicode:
From the docs I know about the unicode string type, and how to declare
string types. What I don't understand yet is what encodings are and
when you'd want/need to use them. What I'd like is to just be able to
print out unicode strings in mixed languages, and they'd appear on the
terminal the same way they get shown in a web browser (when you have
appropriate fonts installed), without any fuss.

Here is an example of how I'd like my script to work:

$ ./test.py

Random hiragana: <some jp characters>
Random romaji: kakikukeko

Is this possible?
From my limited knowledge, I *think* I need to do two things:
1) In my Python script, run .encode() on my unicode variable before
printing it out (I assume I need to encode into Japanese)

Question: How does this work when you have multiple languages in a
single unicode string? Do you need to separate them into separate
strings (one per language) and print them separately?

Or is it the case that you can (unlike a web browser) *only*
display/print one language at a time? (I really want mixed language -
English AND Japanese).

2) Set up the terminal to display the output. From various online docs
it looks like I need to set the LANG environment variable to Japanese,
and then start konsole (or gnome-terminal if that will work better).
But again, it looks like this limits me to 1 language.

If what I want to do is very hard, I'll output html instead and view
it in a web browser. But I'd prefer to use the terminal instead if
possible :-)

Thanks in advance.

David.
Jun 27 '08 #1
I suggest you read http://www.amk.ca/python/howto/unicode to demystify
what Unicode is and does, and how to use it in Python.

Printing text from different languages is possible if and only if the
output device (terminal, in this case) supports a character encoding
that accommodates all the characters you wish to print. UTF-8 is a
fairly ubiquitous candidate that fits that criterion, since it
encompasses Unicode in its entirety (as opposed to latin-1, for example,
which only includes a very small subset of Unicode).
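
To make the point above concrete, here is a minimal sketch. It assumes Python 3, where every `str` is already a Unicode string (in the Python 2 of this thread the literal would be written `u"..."`). The sample text and the `can_encode` helper are made up for illustration. It checks whether one mixed-language string fits in UTF-8 and in latin-1.

```python
# Hypothetical sample: Japanese hiragana and ASCII romaji in ONE string.
text = "かきくけこ / kakikukeko"


def can_encode(s, encoding):
    """Return True if every character in s is representable in encoding."""
    try:
        s.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


utf8_ok = can_encode(text, "utf-8")      # UTF-8 can represent this text
latin1_ok = can_encode(text, "latin-1")  # latin-1 has no hiragana

# Encoding to bytes and decoding again returns the original string.
round_trip = text.encode("utf-8").decode("utf-8")

print("utf-8:", utf8_ok)
print("latin-1:", latin1_ok)
```

The string never has to be split per language: an encoding either covers all of its characters or it does not.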

HTH,

--
Carsten Haese
http://informixdb.sourceforge.net
Jun 27 '08 #2
> I suggest you read http://www.amk.ca/python/howto/unicode to demystify what
> Unicode is and does, and how to use it in Python.

That document really helped.

This page helped me to set up the console: http://www.jw-stumpel.nl/stestu.html#T3

I ran:

dpkg-reconfigure locales # and enabled en_ZA.utf8
update-locale LANG=en_ZA.utf8

(And then rebooted, but I don't know if that was necessary).

I can now print mixed language unicode to the console from Python.
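
For anyone checking their own setup, the following is a rough sketch, again assuming Python 3. The helper names `terminal_encoding` and `printable` and the sample text are illustrative, not from the thread. It reports which encoding Python will use for the terminal and prints a mixed-language string, replacing unsupported characters with `?` rather than raising `UnicodeEncodeError`.

```python
import locale
import sys

# Hypothetical sample mixing Japanese and English in a single string.
MIXED = "Random hiragana: かきくけこ\nRandom romaji: kakikukeko"


def terminal_encoding():
    """Best guess at the encoding Python uses when printing to stdout."""
    enc = getattr(sys.stdout, "encoding", None)
    return enc or locale.getpreferredencoding(False)


def printable(s, encoding):
    """Return s with characters the encoding lacks replaced by '?'."""
    return s.encode(encoding, errors="replace").decode(encoding)


if __name__ == "__main__":
    enc = terminal_encoding()
    print("stdout encoding:", enc)
    # With a UTF-8 locale the text comes out unchanged; otherwise the
    # unsupported characters degrade to '?' instead of raising an error.
    print(printable(MIXED, enc))
```

If the reported encoding is not UTF-8, the locale change described above (for example `LANG=en_ZA.utf8`) is the likely fix. The terminal also needs a font that contains the characters.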

Thanks for your help.

David.
Jun 27 '08 #3
