Faulty encoding settings

How do I cope with faulty encoding settings?

I'm writing an application that needs all internal character data
to be stored in iso-8859-1. It also must allow input and output
using stdin and stdout.

This works just fine with the Windows binary of Python.
sys.stdin.encoding is correctly set to the encoding of the
current terminal ('cp437').

s = sys.stdin.readline()
# Convert to iso-8859-1.
s = s.decode(sys.stdin.encoding).encode('iso-8859-1')

Granted, users are constrained to entering characters in the
cp437 charset, but that's better than the following.

The Cygwin binary I have (2.4.3) reports sys.stdin.encoding as
'US-ASCII', which is quite wrong. A Cygwin terminal uses, as far
as I can tell, iso-8859-1. This renders the above construction
useless if the user enters any character codes above 128.
Using raw_input instead of readline addresses the problem by making
it impossible to enter non-ascii text.

Please advise.

This is only a temporary problem, as eventually this application
will use Tkinter as an interface instead. But of course then I'll
probably have a bunch of new problems. ;)

--
Neil Cerutti
Oct 17 '06 #1
In <sl*******************@FIAD06.norwich.edu>, Neil Cerutti wrote:
> I'm writing an application that needs all internal character data
> to be stored in iso-8859-1. It also must allow input and output
> using stdin and stdout.
>
> This works just fine with the Windows binary of Python.
> sys.stdin.encoding is correctly set to the encoding of the
> current terminal ('cp437').
>
> s = sys.stdin.readline()
> # Convert to iso-8859-1.
> s = s.decode(sys.stdin.encoding).encode('iso-8859-1')
>
> Granted, users are constrained to entering characters in the
> cp437 charset, but that's better than the following.
>
> The Cygwin binary I have (2.4.3) reports sys.stdin.encoding as
> 'US-ASCII', which is quite wrong. A Cygwin terminal uses, as far
> as I can tell, iso-8859-1. This renders the above construction
> useless if the user enters any character codes above 128.
> Using raw_input instead of readline addresses the problem by making
> it impossible to enter non-ascii text.
>
> Please advise.

Give the user the ability to explicitly give an encoding. Using the
encoding attribute of files is quite fragile. If you redirect stdin or
stdout, for example, the encoding is set to None because the interpreter
can't tell what encoding the "other side" of the redirection produces or
expects.
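
A minimal sketch of that suggestion, written for Python 3 rather than the 2.4 interpreter discussed in this thread. The option name --input-encoding and the helper names are invented for illustration; the fallback to iso-8859-1 is an assumption taken from the internal encoding named in the question.

```python
import argparse

INTERNAL_ENCODING = "iso-8859-1"  # internal encoding named in the question


def resolve_input_encoding(explicit=None, stream=None, fallback=INTERNAL_ENCODING):
    """Pick the input encoding: explicit choice, then the stream's, then a fallback."""
    if explicit:
        return explicit
    # The attribute can be None or missing, e.g. on a redirected stream.
    return getattr(stream, "encoding", None) or fallback


def to_internal(raw_bytes, input_encoding):
    """Transcode raw input bytes to the internal encoding."""
    text = raw_bytes.decode(input_encoding)
    # Characters without an ISO-8859-1 equivalent become '?'.
    return text.encode(INTERNAL_ENCODING, "replace")


def build_parser():
    """Hypothetical command line; --input-encoding is an invented option name."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--input-encoding", default=None,
                        help="encoding of the text arriving on stdin")
    return parser


# Example: the user states that the terminal sends cp437.
args = build_parser().parse_args(["--input-encoding", "cp437"])
encoding = resolve_input_encoding(args.input_encoding, stream=None)
print(to_internal(b"caf\x82\n", encoding))  # cp437 byte 0x82 is e-acute
```

In a real program the bytes would come from sys.stdin.buffer.readline() on Python 3, and sys.stdin would be passed as the stream so its encoding attribute is consulted when the user gives no option.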

BTW, US-ASCII isn't wrong, just limiting: everything in the ASCII
range is the same in ISO-8859-1.
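
The point about the ASCII range can be checked directly. This is a small Python 3 sketch; the function name is made up for the example.

```python
def same_in_ascii_and_latin1(data):
    """True if the bytes decode identically as US-ASCII and as ISO-8859-1."""
    try:
        return data.decode("ascii") == data.decode("iso-8859-1")
    except UnicodeDecodeError:
        # Any byte >= 0x80 is outside US-ASCII, so the ASCII decode fails.
        return False


print(same_in_ascii_and_latin1(bytes(range(128))))  # every 7-bit byte value
print(same_in_ascii_and_latin1(b"caf\xe9"))         # contains a byte >= 0x80
```

So a stream reported as US-ASCII decodes correctly only as long as the user stays within 7-bit characters.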

Ciao,
Marc 'BlackJack' Rintsch
Oct 17 '06 #2
Neil Cerutti wrote:
> The Cygwin binary I have (2.4.3) reports sys.stdin.encoding as
> 'US-ASCII', which is quite wrong. A Cygwin terminal uses, as far
> as I can tell, iso-8859-1. This renders the above construction
> useless if the user enters any character codes above 128.
> Using raw_input instead of readline addresses the problem by making
> it impossible to enter non-ascii text.
>
> Please advise.

In principle, setting the LANG environment variable should help.
Unfortunately, Cygwin doesn't implement locales correctly (neither
in the Unix way, nor in the Windows way), hence Python's machinery
fails.

If you believe that a Cygwin terminal always uses Latin-1 (try
entering €, though - it could be windows-1252 instead), you should
be able to hard-code that, by determining that it is a Cygwin
Python, or that you are running in a Cygwin terminal.
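
A rough sketch of the hard-coding idea, in Python 3. It assumes that sys.platform equals "cygwin" under a Cygwin build of Python; the function names, the choice of iso-8859-1 as the hard-coded value, and the "ascii" fallback are assumptions made for illustration. The second helper shows why the euro sign distinguishes Latin-1 from windows-1252.

```python
import sys

CYGWIN_ENCODING = "iso-8859-1"  # assumption; "cp1252" if the euro test says so


def input_encoding(reported, platform=sys.platform):
    """Override the reported encoding on Cygwin, otherwise trust it."""
    if platform == "cygwin":
        return CYGWIN_ENCODING
    # reported may be None (e.g. redirected input); fall back conservatively.
    return reported or "ascii"


def euro_fits(encoding):
    """True if the euro sign (U+20AC) can be encoded in the given charset."""
    try:
        "\u20ac".encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


print(input_encoding("US-ASCII", platform="cygwin"))
print(euro_fits("cp1252"), euro_fits("iso-8859-1"))
```

If the terminal accepts a euro sign and delivers it as the single byte 0x80, that points to windows-1252, since ISO-8859-1 has no euro sign at all.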

Regards,
Martin
Oct 17 '06 #3
On 2006-10-17, Marc 'BlackJack' Rintsch <bj****@gmx.net> wrote:
> In <sl*******************@FIAD06.norwich.edu>, Neil Cerutti wrote:
>> I'm writing an application that needs all internal character data
>> to be stored in iso-8859-1. It also must allow input and output
>> using stdin and stdout.
>
> Give the user the ability to explicitly give an encoding.
> Using the encoding attribute of files is quite fragile. If you
> redirect stdin or stdout the encoding is set to None for
> example because the interpreter can't tell what encoding the
> "other side" of the redirection produces or expects.

Thanks for that sensible idea.

On the other hand, if Python's implementors couldn't figure out
what the encoding is, I doubt the average user has a prayer. ;-)

--
Neil Cerutti
Oct 17 '06 #4
