bug?!: lower() breaks unicode

Hi there,

I have a database with encoding UNICODE. When I run lower() on a column
containing non-ASCII characters, the encoding of the "sharp s" (ß) gets
garbled. This happens with PostgreSQL 7.3.[234] under Linux (Debian woody and
SuSE 7.2). I have made sure that LC_CTYPE=C for the server:

create table t ( a text, b text );
-- my terminal is latin1
\encoding latin1
insert into t (a) values ('Fuß');
update t set b = lower(a);
select * from t;

ERROR: Could not convert UTF-8 to ISO8859-1

Apparently the bytes of the UTF-8 multibyte sequences get lowercased, as the
following selects yield different results:

-- show me everything, unconverted
\encoding unicode
select a from t;
select b from t;

Fuà // select a from t
fu // select b from t, should be "fuÃ"
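
A minimal sketch of one mechanism that would produce this kind of damage, assuming lower() works byte by byte with ISO-8859-1 case rules. That is an assumption made for illustration, not PostgreSQL's confirmed implementation, and the helper name latin1_bytewise_lower is hypothetical:

```python
# Sketch: lowercase UTF-8 bytes one at a time using Latin-1 case rules.
# Assumption: this mimics a byte-wise tolower() under an ISO-8859-1 locale;
# it is not PostgreSQL's actual code.

def latin1_bytewise_lower(data: bytes) -> bytes:
    """Treat every byte as one Latin-1 character and lowercase it."""
    return data.decode("latin-1").lower().encode("latin-1")

original = "Fu\u00df".encode("utf-8")        # "Fuß" as UTF-8: 46 75 c3 9f
lowered = latin1_bytewise_lower(original)    # 0xC3 ('Ã') becomes 0xE3 ('ã')

print(original.hex(" "))
print(lowered.hex(" "))

try:
    lowered.decode("utf-8")
    still_valid = True
except UnicodeDecodeError:
    # 0xE3 is the lead byte of a 3-byte sequence, but only one byte follows.
    still_valid = False

print("still valid UTF-8:", still_valid)
```

Under that assumption, the lead byte of the two-byte sequence for ß is rewritten while the continuation byte is left alone, so the value in column b is no longer valid UTF-8 and cannot be converted to ISO8859-1.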

The JDBC code breaks even more badly:

java.lang.ArrayIndexOutOfBoundsException: 9
at org.postgresql.core.Encoding.decodeUTF8(Encoding.java:254)
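
An out-of-bounds read is what one would expect from a decoder that trusts the lead byte of each sequence. The sketch below is illustrative only: naive_decode_utf8 is a hypothetical function, not the driver's decodeUTF8, and the byte strings are the intact and the presumably corrupted encodings of the test value.

```python
# Sketch of a UTF-8 decoder that trusts the lead byte and never checks bounds.
# Assumption: illustrative only; this is not the PostgreSQL JDBC driver's code.

def naive_decode_utf8(data: bytes) -> str:
    out = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                       # 1-byte sequence (ASCII)
            out.append(chr(lead))
            i += 1
        elif lead < 0xE0:                     # treated as a 2-byte sequence
            out.append(chr(((lead & 0x1F) << 6) | (data[i + 1] & 0x3F)))
            i += 2
        else:                                 # treated as a 3-byte sequence
            out.append(chr(((lead & 0x0F) << 12)
                           | ((data[i + 1] & 0x3F) << 6)
                           | (data[i + 2] & 0x3F)))
            i += 3
    return "".join(out)

# The intact value decodes normally.
assert naive_decode_utf8(b"Fu\xc3\x9f") == "Fu\u00df"

try:
    naive_decode_utf8(b"fu\xe3\x9f")          # 0xE3 announces two more bytes
    overran = False
except IndexError:
    overran = True                            # only one byte was left

print("read past the end of the buffer:", overran)
```

Python raises IndexError where Java would raise ArrayIndexOutOfBoundsException; the index reported in the trace above depends on the driver's real buffer layout, which this sketch does not model.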

From the release docs of 7.4, it does not seem that this issue has been
addressed. The database has been initialized with "de_DE@euro", but this
shouldn't matter, should it?

Mit freundlichem Gruß / With kind regards
Holger Klawitter
--
lists <at> klawitter <dot> de
Nov 12 '05 #1