Unicode troubles

Hi!
I'm finishing a multiplatform collaborative realtime text editor (something
like SubEthaEdit, but multiplatform and open source), developed using
Python+Twisted as a plugin for Leo.

Of course, as the software runs on different platforms in different places,
text encoding compatibility is an issue.
So the obvious choice was the Tk encoding for the client GUI, unicode for system
internals and utf-8 for web output.
But I'm having serious trouble using Tk with Unicode internals.

The system, being a text editor, uses string lengths and positions in the text
widget as parameters of most of its critical algorithms.
Unfortunately, I discovered recently that some encodings do not provide
an equivalence between
num_of_chars/length_of_string/position_in_text_widget. As a result, each time
someone presses a non-ASCII key, the references are lost and the other clients
receive a soup of letters.

I had read on the internet that Unicode was supposed to keep the relation
num_of_chars/string_length (and thus the relation
string_length/num_of_chars/position_in_text_widget). But this relation does
not hold on all my machines.

Sometimes I get len(u"eló") = 3 (the correct result) and other times
len(u"eló") = 4 (the wrong result). This seems independent of the OS.

Could someone explain this issue to me? How am I supposed to handle this
problem? Do I have to compile Python with special parameters so that each
Unicode char counts as one unit of length?

Thanks.
Rodrigo Benenson.
Jul 18 '05 #1
Rodrigo Benenson wrote:
Sometimes I get len(u"eló") = 3 (the correct result) and other times
len(u"eló") = 4 (the wrong result). This seems independent of the OS.


There are different ways to express "special" characters.
E.g. you can describe "ó" as a single character,
or as "o" followed by a combining accent.
What you want is the "canonical form".
Take a look at unicodedata.normalize (new in
Python 2.3).

http://www.python.org/doc/current/li...icodedata.html
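
A minimal sketch of the idea, assuming the length of 4 really does come from a decomposed "ó" (that is a guess; the thread does not confirm the cause). The variable names are just for illustration, and the strings are written with escapes so the result does not depend on the source file's encoding:

```python
import unicodedata

# "ó" as one precomposed code point (U+00F3)
composed = u"el\u00f3"
# "o" followed by COMBINING ACUTE ACCENT (U+0301)
decomposed = u"elo\u0301"

print(len(composed))    # 3
print(len(decomposed))  # 4

# NFC composes base character + combining mark into a single code point
# wherever a precomposed form exists.
nfc = unicodedata.normalize("NFC", decomposed)
print(len(nfc))         # 3
print(nfc == composed)  # True
```

If this is the cause, normalizing every string to NFC before computing lengths or widget positions, on every client, should keep the offsets consistent.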

Hope this helps,

Michael Radziej

Jul 18 '05 #2
