472,975 Members | 1,329 Online
Bytes | Software Development & Data Engineering Community
Post Job

Home Posts Topics Members FAQ

Join Bytes to post your question to a community of 472,975 software developers and data experts.

raw_input() and utf-8 formatted chars

s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut

s = raw_input('Enter: ') #A\xcc\x88
print s #displays A\xcc\x88

print len(input) #9
It looks like every character of the string I enter in utf-8 is being
interpreted literally as 9 separate characters rather than one
character. How do I enter a capital A with an umlaut so that python
treats it as one character?

Oct 12 '07 #1
6 4183
On Oct 12, 1:53 pm, 7stud <bbxx789_0...@yahoo.comwrote:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut

s = raw_input('Enter: ') #A\xcc\x88
print s #displays A\xcc\x88

print len(input) #9

It looks like every character of the string I enter in utf-8 is being
interpreted literally as 9 separate characters rather than one
character. How do I enter a capital A with an umlaut so that python
treats it as one character?
I don't know. This works for me:
>>x = raw_input('Enter: ')
Enter: ä
>>len(x)
1
>>>
I'm using Python 2.4 with Default Source Encoding set to None on
Windows XP SP2.

Mike

Oct 12 '07 #2
On Oct 12, 1:18 pm, kyoso...@gmail.com wrote:
On Oct 12, 1:53 pm, 7stud <bbxx789_0...@yahoo.comwrote:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
s = raw_input('Enter: ') #A\xcc\x88
print s #displays A\xcc\x88
print len(input) #9
It looks like every character of the string I enter in utf-8 is being
interpreted literally as 9 separate characters rather than one
character. How do I enter a capital A with an umlaut so that python
treats it as one character?

I don't know. This works for me:
>x = raw_input('Enter: ')
Enter: ä
>len(x)
1

I'm using Python 2.4 with Default Source Encoding set to None on
Windows XP SP2.

Mike
Yeah, but what happens when you enter A\xcc\x88? And what is it that
your keyboard enters to produce an 'a' with an umlaut?

Oct 12 '07 #3
On Fri, 12 Oct 2007 13:18:35 -0700, 7stud wrote:
On Oct 12, 1:18 pm, kyoso...@gmail.com wrote:
>On Oct 12, 1:53 pm, 7stud <bbxx789_0...@yahoo.comwrote:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
s = raw_input('Enter: ') #A\xcc\x88
print s #displays A\xcc\x88
print len(input) #9
It looks like every character of the string I enter in utf-8 is being
interpreted literally as 9 separate characters rather than one
character. How do I enter a capital A with an umlaut so that python
treats it as one character?

I don't know. This works for me:
>>x = raw_input('Enter: ')
Enter:
>>len(x)
1

I'm using Python 2.4 with Default Source Encoding set to None on
Windows XP SP2.

Mike

Yeah, but what happens when you enter A\xcc\x88?
You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but `raw_input()` does not "interpret" the input in any way.
And what is it that your keyboard enters to produce an 'a' with an umlaut?
*I* just hit the ä key. The one right next to the ö key. ;-)

Ciao,
Marc 'BlackJack' Rintsch
Oct 12 '07 #4
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch <bj_...@gmx.netwrote:
You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but `raw_input()` does not "interpret" the input in any way.
Then why don't I end up with the same situation as this:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
And what is it that your keyboard enters to produce an 'a' with an umlaut?

*I* just hit the ä key. The one right next to the ö key. ;-)
....and what if you don't have an a-with-umlaut key?

Oct 13 '07 #5
On Fri, 12 Oct 2007 19:09:46 -0700, 7stud wrote:
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch <bj_...@gmx.netwrote:
>You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but `raw_input()` does not "interpret" the input in any way.

Then why don't I end up with the same situation as this:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
I don't get the question!? In string literals in source code the
backslash has a special meaning, like I wrote above. When Python compiles
that above snippet you end up with a string of three bytes, one with the
ASCII value of an 'A' and two bytes where you typed in the byte value in
hexadecimal:

In [191]: s = 'A\xcc\x88'

In [192]: len(s)
Out[192]: 3

In [193]: map(ord, s)
Out[193]: [65, 204, 136]

In [194]: print s
Ä

The last works this way only if the receiving/displaying program expected
UTF-8 as encoding. Otherwise something other than an Ä would have been
shown.

If you type in that text when asked by `raw_input()` then you get exactly
what you typed because there is no Python source code compiled:

In [195]: s = raw_input()
A\xcc\x88

In [196]: len(s)
Out[196]: 9

In [197]: map(ord, s)
Out[197]: [65, 92, 120, 99, 99, 92, 120, 56, 56]

In [198]: print s
A\xcc\x88
And what is it that your keyboard enters to produce an 'a' with an
umlaut?

*I* just hit the key. The one right next to the ö key. ;-)
...and what if you don't have an a-with-umlaut key?
I find other means to enter it. <Alt+ some magic number on the numeric
keypad in windows, or <Compose>, <a>, <"on Unix/Linux. Some text editors
offer special sequences too. If all fails there are character map
programs that show all unicode characters to choose from and copy'n'paste
them.

Ciao,
Marc 'BlackJack' Rintsch
Oct 13 '07 #6
On Oct 13, 3:09 am, 7stud <bbxx789_0...@yahoo.comwrote:
On Oct 12, 2:43 pm, Marc 'BlackJack' Rintsch <bj_...@gmx.netwrote:
You mean literally!? Then of course I get A\xcc\x88 because that's what I
entered. In string literals in source code the backslash has a special
meaning but `raw_input()` does not "interpret" the input in any way.

Then why don't I end up with the same situation as this:
s = 'A\xcc\x88' #capital A with umlaut
print s #displays capital A with umlaut
And what is it that your keyboard enters to produce an 'a' with an umlaut?
*I* just hit the ä key. The one right next to the ö key. ;-)

...and what if you don't have an a-with-umlaut key?
raw_input() returns the string exactly as you entered it. You can
decode that into the actual UTF-8 string with decode("string_escape"):

s = raw_input('Enter: ') #A\xcc\x88
s = s.decode("string_escape")

It looks like your system already understands UTF-8 and will decode
the UTF-8 string you print to the Unicode character.

Oct 13 '07 #7

This thread has been closed and replies have been disabled. Please start a new discussion.

Similar topics

1
by: Hugh | last post by:
I am using python 2.3 through the PythonWin program on windows. I would like to create a console-based interactive session. The program raw_input is almost exactly what I'd like, except that...
5
by: Helmut Jarausch | last post by:
Hi when using an interactive Python script, I'd like the prompt given by raw_input to go to stderr since stdout is redirected to a file. How can I change this (and suggest making this the...
2
by: J. W. McCall | last post by:
I'm working on a MUD server and I have a thread that gets keyboard input so that you can enter commands from the command line while it's in its main server loop. Everything works fine except that...
0
by: dale | last post by:
Python newbie disclaimer on I am running an app with Tkinter screen in one thread and command-line input in another thread using raw_input(). First question - is this legal, should it run...
1
by: JerryKreps | last post by:
Hi, folks -- I'm a Python pup. As you can see from the session copied at the end of this post, I have the latest version of Python, and I've been using the Editor-Shell of the latest version of...
21
by: planetthoughtful | last post by:
Hi All, As always, my posts come with a 'Warning: Newbie lies ahead!' disclaimer... I'm wondering if it's possible, using raw_input(), to provide a 'default' value with the prompt? I would...
2
by: tim | last post by:
I want to write a program that looks into a given folder, groups files that have a certain part of the filename in common and then copy those groups one at a time to another place, using the...
17
by: Stuart McGraw | last post by:
In the announcement for Python-2.3 http://groups.google.com/group/comp.lang.python/msg/287e94d9fe25388d?hl=en it says "raw_input(): can now return Unicode objects". But I didn't see anything...
7
by: Mike Kent | last post by:
It's often useful for debugging to print something to stderr, and to route the error output to a file using '2>filename' on the command line. However, when I try that with a python script, all...
8
by: Dox33 | last post by:
I ran into a very strange behaviour of raw_input(). I hope somebody can tell me how to fix this. (Or is this a problem in the python source?) I will explain the problem by using 3 examples....
0
by: lllomh | last post by:
Define the method first this.state = { buttonBackgroundColor: 'green', isBlinking: false, // A new status is added to identify whether the button is blinking or not } autoStart=()=>{
2
by: DJRhino | last post by:
Was curious if anyone else was having this same issue or not.... I was just Up/Down graded to windows 11 and now my access combo boxes are not acting right. With win 10 I could start typing...
2
isladogs
by: isladogs | last post by:
The next Access Europe meeting will be on Wednesday 4 Oct 2023 starting at 18:00 UK time (6PM UTC+1) and finishing at about 19:15 (7.15PM) The start time is equivalent to 19:00 (7PM) in Central...
0
by: Aliciasmith | last post by:
In an age dominated by smartphones, having a mobile app for your business is no longer an option; it's a necessity. Whether you're a startup or an established enterprise, finding the right mobile app...
0
tracyyun
by: tracyyun | last post by:
Hello everyone, I have a question and would like some advice on network connectivity. I have one computer connected to my router via WiFi, but I have two other computers that I want to be able to...
2
by: giovanniandrean | last post by:
The energy model is structured as follows and uses excel sheets to give input data: 1-Utility.py contains all the functions needed to calculate the variables and other minor things (mentions...
1
by: Teri B | last post by:
Hi, I have created a sub-form Roles. In my course form the user selects the roles assigned to the course. 0ne-to-many. One course many roles. Then I created a report based on the Course form and...
0
isladogs
by: isladogs | last post by:
The next online meeting of the Access Europe User Group will be on Wednesday 6 Dec 2023 starting at 18:00 UK time (6PM UTC) and finishing at about 19:15 (7.15PM). In this month's session, Mike...
4
by: GKJR | last post by:
Does anyone have a recommendation to build a standalone application to replace an Access database? I have my bookkeeping software I developed in Access that I would like to make available to other...

By using Bytes.com and it's services, you agree to our Privacy Policy and Terms of Use.

To disable or enable advertisements and analytics tracking please visit the manage ads & tracking page.