Encoding problems


Hi All!

I have a program that looks like this:

# -*- coding: iso-8859-2 -*-
s1 = 'néz'
s2 = raw_input('Please type in "néz":')
print repr(s1)
print repr(s2)

I always type in the same text ('néz') at the prompt.

On Windows, this is the result:

C:\Temp>test.py
Please type in "nÚz":néz
'n\xe9z'
'n\x82z'

C:\Temp>

On FreeBSD, this is the result:

%python ./test.py
Please type in "néz":néz
'n\xe9z'
'n\xe9z'
%

Apparently, the encoding of the Python file and the encoding used on the
win32 console are different.
I need to write a console-mode program that processes input from the
console, on both UNIX and Windows.
Declaring the encoding of the file does not help: the raw input will
still be wrong. Is there a way to specify an encoding for raw_input
somehow? Of course I could convert the input explicitly, but the
conversion depends on the platform... Somehow Python should know the
encoding of the console.

Comments are more than welcome.

Laci 2.0
Jul 18 '05 #1
Hi!
On Windows XP / Windows 2000, you can solve the problem like this:
- change the font of the console to "Lucida Console"
- change your script to:

# -*- coding: cp1252 -*-

import os
ecran=os.popen('MODE CON: CP SELECT=1252').readlines()

s1 = 'néz'
s2 = raw_input('Please type in "néz":')
print repr(s1)
print repr(s2)

Note that you can instead force the console to code page 1252 yourself
(MODE CON: CP SELECT=1252) and delete those two lines from the script.
Personally, I do it via a shortcut icon.

*sorry for my bad english*

@-salutations
--
Michel Claveau
e-mail: http://cerbermail.com/?6J1TthIa8B
Jul 18 '05 #2

Michel Claveau - abstraction méta-galactique non triviale en fuite
perpétuelle. wrote:
> Hi!
> On Windows XP / Windows 2000, you can solve the problem like this:
> - change the font of the console to "Lucida Console"
> - change your script to:
>
> # -*- coding: cp1252 -*-
>
> import os
> ecran=os.popen('MODE CON: CP SELECT=1252').readlines()
>
> s1 = 'néz'
> s2 = raw_input('Please type in "néz":')
> print repr(s1)
> print repr(s2)

Okay, I understand now.
This is a fault of the win32 console - it defaults to a different
encoding than other parts of the Windows system.
This is messy but we cannot do anything about it. :-(

> *sorry for my bad english*

Not bad at all.
Thanks for your help.

Laci 2.0
Jul 18 '05 #3
Gandalf wrote:
> This is a fault of the win32 console - it defaults to a different
> encoding than other parts of the Windows system.
> This is messy but we cannot do anything about it. :-(


It's better than you think. Python, starting with 2.3, will do the
right thing for

# -*- coding: cp1252 -*-
print u"néz"

It determines that this is a Windows console, determines its encoding,
and converts the Unicode string to that encoding. Of course, this
requires the string to be a Unicode literal. So you'd expect that

bildschirm = raw_input(u"néz")

works, but unfortunately, it doesn't, as raw_input does not support
Unicode. However, the encoding Python has determined is available
as sys.stdout.encoding, so you can do

import sys
bildschirm = raw_input(u"néz".encode(sys.stdout.encoding))

This works even if the user has run chcp in the window, as Python
asks the window for its encoding during Python startup.
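
The approach above can be wrapped in two small helpers. This is only a sketch: the names `console_encoding` and `encode_prompt` are hypothetical, and the fallback for a stream that reports no encoding (for example when output is redirected) is an added assumption, not something stated in this thread.

```python
import sys

def console_encoding(stream, fallback="ascii"):
    """Return the encoding Python detected for a stream, or a fallback."""
    # The attribute may be missing or None, e.g. when the stream is not a console.
    return getattr(stream, "encoding", None) or fallback

def encode_prompt(prompt, encoding):
    """Encode a Unicode prompt into bytes for a byte-oriented raw_input."""
    # "replace" turns characters the console cannot represent into "?".
    return prompt.encode(encoding, "replace")

# Example with the encoding fixed to cp852 so the result does not depend
# on the machine this runs on.
encoded = encode_prompt(u'Please type in "n\xe9z":', "cp852")
print(repr(encoded))
print(console_encoding(sys.stdout))
```

In the Python 2 script from the first post, the call would then read raw_input(encode_prompt(u'Please type in "néz":', console_encoding(sys.stdout))).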

HTH,
Martin
Jul 18 '05 #4
> works, but unfortunately, it doesn't, as raw_input does not support
> Unicode. However, the encoding Python has determined is available
> as sys.stdout.encoding, so you can do
>
> bildschirm = raw_input(u"néz".encode(sys.stdout.encoding))
>
> This works even if the user has done chcp in the window, as Python
> queries the window what its encoding is, during Python startup.

>>> import sys
>>> sys.stdout.encoding
'cp852'
>>> sys.stdin.encoding
'cp852'

This is very strange!
I understand that we need to encode the output, because strings
in the program files need to be encoded to the terminal's encoding
before printing.
However, the input (the result of raw_input) will be in the correct
encoding (iso-8859-2 in my case) without any conversion.
I do not understand why that is. The stdin encoding is cp852, not
iso-8859-2.

Jul 18 '05 #5
Gandalf wrote:
> However, the input (the result of raw_input) will be in the correct
> encoding (iso-8859-2 in my case) without any conversion.
> I do not understand why is that? The stdin encoding is cp852, not
> iso-8859-2.


Why do you say that the input will be in the correct encoding? In
your original message, you said that you got this:

'n\xe9z'
'n\x82z'

where the first string was repr(s1) (i.e. in the source encoding,
iso-8859-2). The second string (repr(s2)) is the one that you got
from raw_input, so it is *not* in iso-8859-2. Why do you say it
is?
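
One way to make the two strings comparable is to decode each from the encoding it actually arrived in. The following is a sketch only: the helper name `to_unicode` is hypothetical, the byte values are the ones from the original message, and cp852 is the stdin encoding reported earlier in this thread.

```python
def to_unicode(raw, encoding):
    """Decode a byte string that arrived in the given encoding."""
    return raw.decode(encoding)

s1 = b"n\xe9z"  # repr(s1) from the original message: source literal, iso-8859-2
s2 = b"n\x82z"  # repr(s2) from the original message: console input, cp852

# As raw bytes the two strings differ.
print(s1 == s2)  # False

# Decoded with their own encodings, they are the same text.
print(to_unicode(s1, "iso-8859-2") == to_unicode(s2, "cp852"))  # True

# Re-encoding the console input gives the bytes the source file uses.
s2_as_latin2 = to_unicode(s2, "cp852").encode("iso-8859-2")
print(s2_as_latin2 == s1)  # True
```

In the original script, the second argument for the raw_input result would be sys.stdin.encoding rather than a hard-coded name, so the same code works on both platforms.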

Regards,
Martin
Jul 18 '05 #6
Good evening!
The code page for iso-8859-2 should be 912 (according to Google)

iso-8859-2 <==> cp912
@-salutations
--
Michel Claveau


Jul 18 '05 #7
