i18n hell

I just spent hours trying to figure out why, even after I set my SQL
table attributes to UTF-8, only garbage kept being inserted into the
database. Apparently you need to execute "SET NAMES 'utf8'" before
inserting into the tables.
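
A minimal Python 3 sketch of how this kind of garbage typically arises. It assumes the client sends UTF-8 bytes while the connection still interprets them as Latin-1, which is the sort of mismatch that SET NAMES is meant to remove. No database is involved, and the two encodings are assumptions chosen for illustration.

```python
# Assumption: the page submits UTF-8 bytes, but the connection charset
# is still Latin-1, so every multi-byte character is misread.
text = "naïve café"

sent_bytes = text.encode("utf-8")        # what the client actually sends
misread = sent_bytes.decode("latin-1")   # what a Latin-1 connection sees

print(misread)                           # mangled text, two characters per accent

# When both sides agree on UTF-8, the text survives the round trip.
agreed = sent_bytes.decode("utf-8")
print(agreed == text)
```

The stored bytes are the same in both cases; only the declared encoding of the connection decides whether they come back as text or as garbage.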

Does anyone have experience working with other languages using Django
or Turbogears? I just need to be able to retrieve and enter text to
the database from my page without it being mangled. I know these
frameworks employ an ORM so you don't need to write SQL, and that
worries me because I tried this on Rails and it wouldn't work.

Thanks.

Apr 24 '06 #1
fyleow wrote:
> I just spent hours trying to figure out why, even after I set my SQL
> table attributes to UTF-8, only garbage kept being inserted into the
> database. Apparently you need to execute "SET NAMES 'utf8'" before
> inserting into the tables.
>
> Does anyone have experience working with other languages using Django
> or Turbogears? I just need to be able to retrieve and enter text to
> the database from my page without it being mangled. I know these
> frameworks employ an ORM so you don't need to write SQL, and that
> worries me because I tried this on Rails and it wouldn't work.

A question frequently asked of people who are burning in i18n hell: are
you using unicode strings or byte strings? A unicode string means that
type(your_string) is unicode; it does not mean that you keep UTF-8
encoded text in Python byte strings.
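
The distinction can be sketched as follows. This assumes Python 3, where the names have shifted: `str` is the Unicode type (Python 2's `unicode`) and `bytes` is the byte string (Python 2's `str`). The helper `is_text` is a hypothetical name used only for illustration.

```python
# Python 3 naming: `str` holds Unicode text, `bytes` holds encoded data.
text = "Zażółć"                # Unicode string: a sequence of code points
data = text.encode("utf-8")    # byte string: the UTF-8 encoded form

print(type(text).__name__, len(text))   # length counted in characters
print(type(data).__name__, len(data))   # length counted in bytes

def is_text(value):
    """Return True only for real Unicode strings, not encoded bytes."""
    return isinstance(value, str)

print(is_text(text), is_text(data))
```

Holding UTF-8 bytes is not the same as holding text: the two objects have different types and different lengths even though they represent the same word.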

AFAIK Ruby has only byte strings, which have the same set of
issues/problems/traps as Python byte strings.

Apr 24 '06 #2
On 24 Apr 2006 00:38:42 -0700, Serge Orlov <Se*********@gmail.com> wrote:
> fyleow wrote:
> > I just spent hours trying to figure out why, even after I set my SQL
> > table attributes to UTF-8, only garbage kept being inserted into the
> > database. Apparently you need to execute "SET NAMES 'utf8'" before
> > inserting into the tables.
> >
> > Does anyone have experience working with other languages using Django
> > or Turbogears? I just need to be able to retrieve and enter text to
> > the database from my page without it being mangled. I know these
> > frameworks employ an ORM so you don't need to write SQL, and that
> > worries me because I tried this on Rails and it wouldn't work.
>
> A question frequently asked of people who are burning in i18n hell: are
> you using unicode strings or byte strings? A unicode string means that
> type(your_string) is unicode; it does not mean that you keep UTF-8
> encoded text in Python byte strings.

I used to live in i18n hell, a while ago, until I understood this:
every time you keep a reference to some kind of string object, ALWAYS
ALWAYS ALWAYS be AWARE of whether it is not encoded (a unicode object)
or an encoded string (a str object), and if so, which encoding it is
in. Then deal with the conversion between the two domains EXPLICITLY
(e.g. encode(), decode()). If you hold onto a str or unicode object
and you don't know which it is, you are inevitably bound to face
unicode hell at some point. You can use a prefix convention if that
makes it easier for you, but the point is that you CANNOT just "wing
it". Python makes it too easy to just "wing it", and that creates a
lot of surprises, especially since some methods hide the conversions,
e.g. str.join.
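
A rough sketch of that discipline, assuming Python 3 (where `str` is text and `bytes` is encoded data, and mixing them fails loudly instead of converting silently as Python 2 did). The helpers `to_text` and `to_bytes` and the `b_`/`u_` prefixes are hypothetical names for illustration only.

```python
def to_text(raw, encoding):
    """Decode bytes into text; refuse anything that is not bytes."""
    if not isinstance(raw, bytes):
        raise TypeError("expected bytes, got %s" % type(raw).__name__)
    return raw.decode(encoding)

def to_bytes(text, encoding):
    """Encode text into bytes; refuse anything that is not text."""
    if not isinstance(text, str):
        raise TypeError("expected str, got %s" % type(text).__name__)
    return text.encode(encoding)

# Prefix convention: b_ marks encoded bytes, u_ marks decoded text.
b_name = "Jürgen".encode("latin-1")   # bytes from some Latin-1 source
u_name = to_text(b_name, "latin-1")   # explicit step into the text domain
b_out = to_bytes(u_name, "utf-8")     # explicit step back out, as UTF-8

# Python 3's str.join no longer hides a conversion: mixing types raises.
try:
    "".join([u_name, b_out])
    mixed_ok = True
except TypeError:
    mixed_ok = False

print(u_name, b_out, mixed_ok)
```

The point of the strict helpers is that a value of the wrong kind is rejected at the boundary rather than surfacing later as mangled output.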

W.r.t. DB storage, that depends on the specific database and the DBAPI
module you're using. Read up on it, write a few tests against your
DBAPI module (simple tests, easy peasy), and know what kinds of strings
you're sending in and reading back. I often use PostgreSQL, and my
configuration always stores strings in UTF-8 in the database. I have a
lightweight mapping module that disambiguates and does the
encoding/decoding automatically in a consistent way (that decision
belongs in the client code for now, unfortunately, but is centralized
using my table declaration, which lists the desired conversions for
each column). See
http://furius.ca/antiorm/ for something simple that works well.
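
One way a per-column conversion declaration could look, as a minimal Python 3 sketch. This is not the antiorm API; `USERS_ENCODINGS`, `decode_row`, and `encode_row` are hypothetical names, and rows are modelled as plain dicts rather than real DBAPI results.

```python
# Hypothetical table declaration: column name -> encoding of the stored bytes.
# Columns that are not listed pass through unchanged.
USERS_ENCODINGS = {"name": "utf-8", "city": "utf-8"}

def decode_row(row, encodings):
    """Decode the declared columns of a row (dict) from bytes to text."""
    return {
        col: (value.decode(encodings[col])
              if col in encodings and isinstance(value, bytes) else value)
        for col, value in row.items()
    }

def encode_row(row, encodings):
    """Encode the declared columns of a row (dict) from text to bytes."""
    return {
        col: (value.encode(encodings[col])
              if col in encodings and isinstance(value, str) else value)
        for col, value in row.items()
    }

# A row as it might come back from a driver that returns raw bytes.
stored = {"id": 7, "name": "Zoë".encode("utf-8"), "city": "Kraków".encode("utf-8")}
loaded = decode_row(stored, USERS_ENCODINGS)
print(loaded)
```

Keeping the conversions in one declaration means the rest of the client code only ever sees text, and the choice of encoding lives in a single place.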

cheers,


--
Martin
Furius Python Training -- http://furius.ca/training/
Apr 24 '06 #3
Martin Blais wrote:
> See
> http://furius.ca/antiorm/ for something simple that works well.

I'd like to know what this module/library is good for *before* I start
downloading it. "Almost like ORM but not exactly" is a rather vague
term and can denote anything. Is it a dishwasher? Or a microwave oven?

BTW, I haven't had any problems with character encodings since I
started using unicode objects internally in my programs. A database is
the same kind of data source as regular files, sockets or ttys: you
have to know the client encoding before you start receiving data. Then
decode it to unicode and you are fine.
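
The "know the encoding, then decode at the boundary" idea can be sketched in Python 3 like this. The in-memory `io.BytesIO` streams are stand-ins for a file, socket, or database connection, and `read_text` is a hypothetical helper name.

```python
import io

def read_text(byte_source, encoding):
    """Wrap any binary stream and return its content decoded to text."""
    with io.TextIOWrapper(byte_source, encoding=encoding) as reader:
        return reader.read()

# Two sources delivering the same word in different encodings.
latin2_source = io.BytesIO("Łódź".encode("iso-8859-2"))
utf8_source = io.BytesIO("Łódź".encode("utf-8"))

from_latin2 = read_text(latin2_source, "iso-8859-2")
from_utf8 = read_text(utf8_source, "utf-8")

# Once decoded, the program no longer cares where the text came from.
print(from_latin2 == from_utf8)
```

After decoding, both values are the same text object, so the rest of the program can work without knowing which encoding each source used.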

--
Jarek Zgoda
http://jpa.berlios.de/
Apr 24 '06 #4
