
Unicode -> Python -> DBAPI -> PyPgSQL -> PostgreSQL

I can't seem to find any way to specify the character encoding with the DB
API implementation of PyPgSQL. There is no mention of encoding and Unicode
in the DB API v2.0 spec and the PyPgSQL README.

When I have Unicode strings in Python and store them in a PostgreSQL Unicode
database, will the data automatically be correctly encoded? Or do I need
to specify the UTF-8 client encoding on the database connection somehow?

I'm using the current packages of Debian stable (woody):
Python 2.2
PyPgSQL 2.0
PostgreSQL 7.2 (database created with UNICODE / UTF-8 encoding)

--
René Pijlman
Jul 18 '05 #1


Rene Pijlman wrote:
> I can't seem to find any way to specify the character encoding with the DB
> API implementation of PyPgSQL. There is no mention of encoding and Unicode
> in the DB API v2.0 spec and the PyPgSQL README. [...]


See section 2.2.5 in the pyPgSQL README:

pyPgSQL has a few extensions that make it possible to insert Unicode strings
into PostgreSQL and fetch unicode strings instead of byte strings from the
database.

The module-level connect() function has two Unicode-related parameters:

- client_encoding
- unicode_results

*client_encoding* accepts the same parameters as the encode method
of Unicode strings. If you also want to set a policy for encoding
errors, set client_encoding to a tuple, like ("koi8-r", "replace")

Note that you still must make sure that the PostgreSQL client is
using the same encoding as set with the client_encoding parameter.
This is typically done by issuing a "SET CLIENT_ENCODING TO ..."
SQL statement immediately after creating the connection.

If you also want to fetch Unicode strings from the database, set
*unicode_results* to 1.
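
To illustrate the tuple form of client_encoding described above, here is a minimal sketch in plain Python (written for Python 3, no database needed). It only shows what an (encoding, errors) pair means when handed to the encode method of a string. The helper name encode_for_client is hypothetical and is not part of pyPgSQL.

```python
def encode_for_client(text, client_encoding):
    """Encode text with "utf-8" or with a tuple like ("koi8-r", "replace").

    Hypothetical helper for illustration only; not part of pyPgSQL.
    """
    if isinstance(client_encoding, tuple):
        # Tuple form: (encoding, error policy), passed straight to str.encode
        return text.encode(*client_encoding)
    # Plain string form: encoding name only, default "strict" error policy
    return text.encode(client_encoding)


word = "\u00d6sterreich"  # capital O with diaeresis, then "sterreich"

# UTF-8 can represent every character, so nothing is lost.
utf8_bytes = encode_for_client(word, "utf-8")

# KOI8-R has no such letter; the "replace" policy substitutes "?".
replaced = encode_for_client(word, ("koi8-r", "replace"))

# Without an error policy, the same encoding raises UnicodeEncodeError.
try:
    encode_for_client(word, "koi8-r")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
```

With a UTF-8 database and client_encoding="utf-8", the error policy rarely matters, because UTF-8 can encode any Unicode string.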

For example, assuming a database created with *createdb mydb -E UNICODE* and a
table *TEST(V VARCHAR(50))*:

from pyPgSQL import PgSQL
cx = PgSQL.connect(database="mydb", client_encoding="utf-8", unicode_results=1)
cu = cx.cursor()
cu.execute("set client_encoding to unicode")
cu.execute("insert into test(v) values (%s)", (u'\x99sterreich',))
cu.execute("select v from test")
cu.fetchone()
[u'\x99sterreich']
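
The README's point that the Python-side encoding and the PostgreSQL client encoding must agree can be sketched without a database. The following Python 3 snippet is an assumption about the conceptual round trip (encode on the way in, decode on the way out), not a description of pyPgSQL's internals; the names to_wire and from_wire are hypothetical.

```python
CLIENT_ENCODING = "utf-8"  # assumed to match "SET CLIENT_ENCODING TO UNICODE"


def to_wire(value):
    """Turn a Unicode string into the bytes the server would receive."""
    return value.encode(CLIENT_ENCODING)


def from_wire(raw, unicode_results):
    """Return Unicode when unicode_results is true, else the raw bytes."""
    return raw.decode(CLIENT_ENCODING) if unicode_results else raw


original = "\u00d6sterreich"
wire = to_wire(original)

# Matching encodings on both sides: the string survives the round trip.
round_tripped = from_wire(wire, unicode_results=1)

# unicode_results=0: the caller gets byte strings back.
raw_result = from_wire(wire, unicode_results=0)

# Mismatch: the same bytes read as Latin-1 produce a different string,
# which is why the two sides have to be configured consistently.
mismatched = wire.decode("latin-1")
```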

-- Gerhard
Jul 18 '05 #2

Gerhard Häring:
> Rene Pijlman:
>> I can't seem to find any way to specify the character encoding with the DB
>> API implementation of PyPgSQL. There is no mention of encoding and Unicode
>> in the DB API v2.0 spec and the PyPgSQL README. [...]
>
> See section 2.2.5 in the pyPgSQL README:


Well, it's not in the README I found in the Debian package python2.2-pgsql
with pyPgSQL 2.0:
#ident "@(#) $Id: README,v 1.20 2001/11/05 01:18:12 ghaering Exp $"
pyPgSQL - v2.0: Python DB-API 2.0 Compliant Interface Module for
PostgreSQL.

But this tells me I probably need to upgrade to pyPgSQL 2.3 or 2.4:
"Q: I’ve heard of Unicode support for pyPgSQL. What’s the current status?
A: It’s integrated in pyPgSQL 2.3."
http://pypgsql.sourceforge.net/pypgsql-faq.pdf

Thanks a lot Gerhard, you've put me on the right track.

--
René Pijlman
Jul 18 '05 #3
