Foreign Character Problems In Python 2.5 and Tkinter

Hi,

I'm writing a small text editor type application with Python 2.5 and
Tkinter. I'm using the Tk text widget for input and output, and the
problem is that when I try to save its contents to a .txt file, any
Scandinavian letters such as "äöå ÄÖÅ" are saved incorrectly and show up
as a mess when I open the .txt file in Windows Notepad.

It seems that the characters will only get mixed if the user has typed
them into the widget, but if the program has outputted them, they are
saved correctly.

The function that is saving the file is as follows:

    try:
        file = open(self.currentSaveFile, 'w+')
        file.write(self.text.get(0.0, END))
    except IOError:
        tkMessageBox.showwarning('Save File',
            'An error occurred while trying to save \"' + self.currentSaveFile + '\"',
            parent=self.frame)
    finally:
        file.close()

Sometimes the output in the file is garbled instead of "äöå ÄÖÅ", and
sometimes it gives me the error: "UnicodeEncodeError: 'ascii' codec
can't encode characters in position 33-35: ordinal not in range(128)".
I have tried changing it to:

    try:
        file = codecs.open(savefilename, 'w+', 'utf-8', 'ignore')
        file.write(unicode(self.text.get(0.0, END), 'utf-8', 'ignore'))
        self.currentSaveFile = savefilename
    except IOError:
        tkMessageBox.showwarning('Save File',
            'An error occurred while trying to save \"' + self.currentSaveFile + '\"',
            parent=self.frame)
    finally:
        file.close()

Which does save the user-typed characters correctly, but loses any
newlines and "äöå" characters outputted by the program.

I have no idea how to solve this problem, and would appreciate any help.
Oct 13 '07 #1
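
The two symptoms described above can be reproduced without Tkinter. The following is a minimal sketch in Python 3 syntax (the thread itself uses Python 2.5, where the same encoding step happens implicitly when a unicode string is written to a byte-oriented file). It assumes the "mess" in Notepad comes from UTF-8 bytes being displayed through a single-byte Windows code page; `cp1252` is an assumption here, not something the poster confirmed.

```python
# -*- coding: utf-8 -*-
text = u"äöå ÄÖÅ"

# Symptom 1 (assumed cause): UTF-8 bytes interpreted as cp1252 text.
# Each non-ASCII letter becomes two unrelated characters.
utf8_bytes = text.encode("utf-8")
mojibake = utf8_bytes.decode("cp1252")

# Symptom 2: the 'ascii' codec cannot represent these letters at all.
try:
    text.encode("ascii")
    ascii_error = None
except UnicodeEncodeError as exc:
    ascii_error = exc

print(repr(mojibake))
print(ascii_error)
```

Either way, the fix is to pick one encoding explicitly and use it for both writing and reading, rather than relying on a default.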
Juha S. wrote:
> problem is that when I try to save its contents to a .txt file, any
> Scandinavian letters such as "äöå ÄÖÅ" are saved incorrectly and show up
> as a mess when I open the .txt file in Windows Notepad.
>
> It seems that the characters will only get mixed if the user has typed
> them into the widget, but if the program has outputted them, they are
> saved correctly.

Did you define the encoding for the source file and
put u (for unicode) in front of your strings? The
following piece produces proper UTF-8. I couldn't test it with
Notepad though, no Windows here.

Note that this message is also encoded in UTF-8, and so should
your editor be. I can't believe we are still messing with this
stuff in 2007. In the bad old days it was easy: you only had
to learn to read { as ä, | as ö, etc., and vice versa
with localized terminals -- C code looked rather exotic
with a-umlauts everywhere ;)

    #!/usr/bin/python
    # -*- coding: utf-8 -*-

    from Tkinter import *
    import codecs

    class Application(Frame):
        def save(self):
            # codecs.open encodes every unicode string written to the file as UTF-8.
            FILE = codecs.open("outfile.txt", "w", "utf-8")
            FILE.write(u"START - åäöÅÄÖ\n")
            FILE.write(self.text_field.get(0.0, END))
            FILE.write(u"END - åäöÅÄÖ\n")
            FILE.close()
            self.quit()

        def __init__(self, master=None):
            Frame.__init__(self, master)
            self.grid()

            # Text widget for user input.
            self.text_field = Text(self, width=40, height=10)
            self.text_field.grid()

            self.save_button = Button(self, text="save and exit", command=self.save)
            self.save_button.grid()

    if __name__ == "__main__":
        app = Application()
        app.mainloop()

Oct 13 '07 #2
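
The core of the example above is that every string is unicode and the file is written through a codec. A minimal round-trip sketch without the GUI follows; `typed_text` is a hypothetical stand-in for whatever `self.text_field.get(...)` returns, and the temporary file path is likewise only for illustration. It uses `codecs.open` because that is what the thread uses; recent Python 3 releases steer toward the built-in `open()` with an `encoding` argument instead.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

# Hypothetical stand-in for the text typed into the Tk widget.
typed_text = u"käyttäjän teksti\n"

path = os.path.join(tempfile.mkdtemp(), "outfile.txt")

# Write unicode strings through the UTF-8 codec.
out = codecs.open(path, "w", "utf-8")
try:
    out.write(u"START - åäöÅÄÖ\n")
    out.write(typed_text)
    out.write(u"END - åäöÅÄÖ\n")
finally:
    out.close()

# Read the file back with the same codec; the text should be unchanged.
inp = codecs.open(path, "r", "utf-8")
try:
    read_back = inp.read()
finally:
    inp.close()

print(read_back == u"START - åäöÅÄÖ\n" + typed_text + u"END - åäöÅÄÖ\n")
```

As long as the reader (another program, or Notepad) also assumes UTF-8, the letters survive the round trip.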
Thanks for the reply. I made changes to my code according to your
example. Now any Scandinavian characters that are outputted by the
program are missing in the Tk text box.

I'm using a loading function like this to load the data that is to be
outputted by the program:

    def loadWords(self, filename):
        ret = []

        try:
            file = codecs.open(filename, 'r', 'utf-8', 'ignore')
            for line in file:
                # Must skip blank lines (read only lines that contain text).
                if not line.isspace():
                    line = line.replace(u'\n', u'')
                    ret.append(line)
        except IOError:
            tkMessageBox.showwarning(u'Open File',
                u'An error occurred while trying to load \"' + filename + u'\"',
                parent=self.frame)
        finally:
            file.close()

        return ret

Also, the newlines are still lost when saving the text widget contents
to a file. I'm inserting the program generated text to the text widget
through "text.insert(END, txt + u'\n\n')".
Oct 13 '07 #3
Sat, 13 Oct 2007 16:13:21 +0300, Juha S. wrote:
> Thanks for the reply. I made changes to my code according to your
> example. Now any Scandinavian characters that are outputted by the
> program are missing in the Tk text box.
>
>     file = codecs.open(filename, 'r', 'utf-8', 'ignore')

Remove that 'ignore'. If you then get an error complaining
that the utf-8 codec can't handle the file, you've found the culprit.
The file might be in iso-8859-1.
JanneT

Oct 13 '07 #4
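
Why 'ignore' makes the letters vanish can be shown on a byte string directly. This is a small sketch in Python 3 syntax; the sample word is made up for illustration, and it assumes (as suggested above) that the data file is really stored as iso-8859-1.

```python
# -*- coding: utf-8 -*-
# Bytes as they would sit in an iso-8859-1 encoded file (hypothetical sample).
latin1_bytes = u"päivää\n".encode("iso-8859-1")

# With 'ignore', bytes that are not valid UTF-8 are dropped without any error.
silently_dropped = latin1_bytes.decode("utf-8", "ignore")

# Without 'ignore' (strict mode), the mismatch is reported instead of hidden.
try:
    latin1_bytes.decode("utf-8")
    strict_error = None
except UnicodeDecodeError as exc:
    strict_error = exc

# Decoding with the codec the file was actually written in restores the text.
correct = latin1_bytes.decode("iso-8859-1")

print(repr(silently_dropped))
print(strict_error)
print(repr(correct))
```

The strict error is the useful outcome here: it points at the wrong codec instead of quietly losing characters.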
Thanks! Opening and saving the file with the iso-8859-1 codec seems to
handle the characters correctly. Now the only problem left is the
missing newlines in the output file. I tried googling for the ISO code
for newline and entering it in a Python string as '\x0A', but it doesn't
work: the output file still loses the newlines.

Oct 13 '07 #5
On Oct 13, 5:22 pm, "Juha S." <jusa...@gmail.com> wrote:
> Thanks! Opening and saving the file with the iso-8859-1 codec seems to
> handle the characters correctly. Now the only problem left are the
> missing newlines in the output file. I tried googling for the iso code
> for newline and entering it in a Python string as '\x0A' but it doesn't
> work in the output file which still loses the newlines.

As a noob I've struggled a bit, but basically what I've come up with
is this: if the information is strings, especially strings stored in
any kind of list or dict, it takes a loop to write the lines to the
file as myfile[i] + '\n', so that each line stays separate for Python
I/O purposes. If you're done with the Python manipulation and want
Windows, Mac, or Unix programs to read the file, then you need to
consider the platform's newline sequence, either from the os module
(os.linesep) or coded in yourself, e.g. '\r\n'. The fact that you
are using the iso-latin-1 (iso-8859-1) codec doesn't change the '\n'
from Python's viewpoint -- that is, '\n' is still '\n'. When you do
I/O on binary, encoded data, it's all from Python's viewpoint.

Oct 13 '07 #6
Ah, it was so simple. I replaced any '\n' characters with 'os.linesep'
in the source as you suggested, and now everything works beautifully.
Thanks for the help, guys!
Oct 15 '07 #7
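
A likely reason the newlines went missing is that `codecs.open` always opens the underlying file in binary mode, so '\n' is written as a single LF byte and is never converted to the platform's line ending; older versions of Windows Notepad do not display a bare LF as a line break. The sketch below shows one way to make the conversion explicit. The helper name `save_text` and its parameters are hypothetical, not taken from the poster's program.

```python
# -*- coding: utf-8 -*-
import codecs
import os
import tempfile

def save_text(path, text, encoding="iso-8859-1", linesep=os.linesep):
    """Write text with explicit line endings (hypothetical helper)."""
    # Normalize first so existing '\r\n' pairs are not doubled up.
    normalized = text.replace(u"\r\n", u"\n").replace(u"\n", linesep)
    handle = codecs.open(path, "w", encoding)
    try:
        handle.write(normalized)
    finally:
        handle.close()

path = os.path.join(tempfile.mkdtemp(), "outfile.txt")

# Force Windows line endings here so the result is the same on every platform.
save_text(path, u"äöå\nÄÖÅ\n", linesep=u"\r\n")

with open(path, "rb") as raw:
    raw_bytes = raw.read()

print(repr(raw_bytes))
```

Leaving `linesep` at its default of `os.linesep` matches what the final post describes; passing it explicitly is only useful when the file is meant for a different platform than the one running the program.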
