
Starting point for unicode conversion

I *foolishly* started a Python project (3 years ago) without considering
Unicode issues. Now, I want to head off future problems with international
versions of my software.

The key point here is Tkinter. I believe (from reading this list) that I
can expect that SOME returned text may be Unicode (depending on content and
Windows locale settings).

Would it be best to just (somehow) force all text into Unicode or would it
be "better" to handle specific instances?

I also have the problem of embedded text in data files I create that I have
to store as *something* that I can fully recover and convert back to
something reasonable even if the locale changes.

Any thoughts welcome .... this is something I am NOT looking forward to.

Thanks
Jul 18 '05 #1


Howard Lightstone <ho****@eegsoftware.com> writes:
> The key point here is Tkinter. I believe (from reading this list) that I
> can expect that SOME returned text may be Unicode (depending on content and
> Windows locale settings).

Yes, and no. Yes, some returned text may be Unicode, but no, it won't
depend on the locale settings. Instead, Tkinter will return a byte
string if the result contains only ASCII characters, and return a
Unicode string if there are non-ASCII characters.

> Would it be best to just (somehow) force all text into Unicode or would it
> be "better" to handle specific instances?

If you are prepared to deal with Unicode, it would be best to force
that throughout. I was contemplating making this an option in
_tkinter, but that has not been implemented - contributions are
welcome.

Meanwhile, you can use

s = unicode(s)

on all strings returned from Tkinter: if s is an ASCII string, the
default encoding should happily convert it to a Unicode object; if s
is a Unicode string, unicode(s) will be a no-op.
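
To avoid repeating that call everywhere, the conversion could be wrapped in one small helper. This is only a sketch: the name to_unicode and its encoding parameter are hypothetical, not part of Tkinter. The try/except is there solely so the same code also runs on Python 3, where unicode no longer exists and Tkinter returns str throughout.

```python
# Hypothetical helper (not part of Tkinter): normalise a value returned
# from a widget so the rest of the program only ever sees Unicode text.

try:
    text_type = unicode  # Python 2: the separate Unicode string type
except NameError:
    text_type = str      # Python 3: str is already Unicode


def to_unicode(s, encoding="ascii"):
    """Return s as a Unicode string.

    Unicode input is returned unchanged. Byte strings are decoded with
    `encoding`; the default "ascii" mirrors what unicode(s) does under
    Python 2's default encoding, so non-ASCII bytes raise
    UnicodeDecodeError instead of being silently misread.
    """
    if isinstance(s, text_type):
        return s
    return s.decode(encoding)
```

It would be applied at the point where text leaves a widget, for example to_unicode(entry.get()), where entry stands for any Tkinter Entry widget.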

> I also have the problem of embedded text in data files I create that I have
> to store as *something* that I can fully recover and convert back to
> something reasonable even if the locale changes.


Don't worry about the locale; it does not matter here.
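
One way to get that independence, sketched here under the assumption that the data files may use UTF-8 (the thread does not name an encoding), is to always encode on write and decode on read with an explicit codec. The function names save_text and load_text are hypothetical.

```python
import io


def save_text(path, text):
    """Write Unicode text to path as UTF-8, ignoring the current locale."""
    # newline="" disables newline translation, so the bytes on disk are
    # the same on every platform.
    with io.open(path, "w", encoding="utf-8", newline="") as f:
        f.write(text)


def load_text(path):
    """Read UTF-8 bytes from path and return them as Unicode text."""
    with io.open(path, "r", encoding="utf-8", newline="") as f:
        return f.read()
```

Because the encoding is named explicitly in both directions, a file written under one locale reads back identically under another.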

Regards,
Martin
Jul 18 '05 #2
