
UNICODE-encoded database does not accept umlaut-characters.

Created a new 7.4 database.
# create database foo with encoding = UNICODE;
Then tried to restore my dump from pg 7.2, which was SQL_ASCII or Latin1
encoded (can't tell which of the two; the dump is all I have left of the old
database after upgrading postgresql).
Succeeded in creating the objects, but no data was restored; instead
postgresql complained about illegal UNICODE characters. Exporting an
MS-Access database with pgAdmin 1.6 failed with the same errors.

Created a new database with encoding = Latin1. Everything worked fine.

This is a shame, as I chose UNICODE to avoid future problems with exotic
characters. But I can't get it working because of this very problem!

Is this a bug? Any workarounds?

Regards & TIA
Erwin

--
no z in my mail.
Nov 22 '05 #1
3 Replies


Erwin Brandstetter <Br***@gmzx.at> writes:
> Created a new 7.4 database.
> # create database foo with encoding = UNICODE;
> Then tried to restore my dump from pg 7.2, which was SQL_ASCII or Latin1
> encoded (can't tell which of the two; the dump is all I have left of the old
> database after upgrading postgresql).
> Succeeded in creating the objects, but no data was restored; instead
> postgresql complained about illegal UNICODE characters.


Yeah; by default PG will assume that you are sending it UNICODE data
if that's what the database encoding is.

You can arrange for conversion to occur by adding
set client_encoding = latin1;
at the top of the dump file.
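
A minimal sketch of that step in Python, assuming the dump is a plain-text SQL script. The function name `prepend_client_encoding` and the file names in the usage line are hypothetical, not part of PostgreSQL:

```python
from pathlib import Path


def prepend_client_encoding(dump_path, out_path, encoding="latin1"):
    """Write a copy of the dump with a SET client_encoding line on top."""
    # Work on raw bytes so the dump's own encoding is never reinterpreted.
    data = Path(dump_path).read_bytes()
    header = f"SET client_encoding = {encoding};\n".encode("ascii")
    Path(out_path).write_bytes(header + data)
```

For example, `prepend_client_encoding("my_dump_file", "my_dump_file.latin1")` leaves the original dump untouched and writes the modified copy next to it. Note that SET only applies to the current session, so a script that reconnects with \connect may need the line repeated after each reconnect.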

regards, tom lane


Nov 22 '05 #2

Hi Tom, hi NG!

Tom Lane wrote:
> Yeah; by default PG will assume that you are sending it UNICODE data
> if that's what the database encoding is.
>
> You can arrange for conversion to occur by adding
> set client_encoding = latin1;
> at the top of the dump file.


First, thanks for the hint! Tried it, but no success. Here is what I did:

I altered my dump-file, so that it looks like this now:

--
-- pg_dumpall (7.2.1)
--
set client_encoding = latin1;

\connect template1
(...)

Then I created a new db cluster with initdb:

# initdb --encoding=UNICODE

Then I recreated each database that existed before, but now with UNICODE
encoding.
Then I tried to restore:

# psql template1 < my_dump_file
I got UNICODE errors again:
ERROR: Unicode characters greater than or equal to 0x10000 are not supported
ERROR: invalid byte sequence for encoding "UNICODE": 0xe46e67
What am I doing wrong?
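
The byte sequence in the second error can be inspected with a short Python sketch. It assumes the dump really was Latin1 encoded, which the thread does not confirm; the variable names are illustrative only:

```python
raw = bytes.fromhex("e46e67")  # the bytes quoted in the error message

# Read as Latin-1, every byte is one character: 0xe4 is "ä".
as_latin1 = raw.decode("latin-1")

# Read as UTF-8, 0xe4 announces a three-byte sequence, but the next two
# bytes (0x6e, 0x67) are not continuation bytes, so decoding fails.
try:
    raw.decode("utf-8")
    valid_utf8 = True
except UnicodeDecodeError:
    valid_utf8 = False

# What a Latin-1 to UTF-8 conversion would send to the server instead.
converted = as_latin1.encode("utf-8")

print(ascii(as_latin1), valid_utf8, converted.hex())
```

Under that assumption the bytes spell an umlaut followed by "ng", which is valid Latin-1 but not valid UTF-8. That is consistent with the data reaching the server without any conversion taking place.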
I am using the postgresql 7.4 debian woody backport provided by Oliver
Elphick. That is, I was using it; I have messed it all up while trying to
reinstall - this goes into another post ..
Regards
Erwin Brandstetter

--
no z in my mail.
Nov 22 '05 #3

Erwin Brandstetter (myself) wrote in message
> First, thanks for the hint! Tried it, but no success. Here is what I did:
>
> (...)

After facing various problems with my installation of postgresql 7.4, I
decided to do a complete re-install (got strange errors when trying to
vacuum, for one).
Now that everything is finally up and running again, I tried it again
the way Tom pointed out to me, and .. voilà: it works. Goes off
without a hitch.

One hint, if you are in a similar situation: don't forget to set
client_encoding = Latin1 (or whatever is appropriate) for any
client that does not make use of UNICODE.

Thanks once more to Tom & regards!
Erwin Brandstetter
Nov 22 '05 #4
