
Sorting in Unicode not working

P: n/a
Hi,

Our database contains Hindi as well as English characters. We
specified the encoding as UNICODE in both the initdb and the
createdb commands.

Unfortunately, sorting of the Hindi fields is not working. For
example, we have a person table, and the query "SELECT * FROM PERSON
ORDER BY LASTNAME" returns all the rows, but the records are not
sorted by last name.

We tried this on PostgreSQL 7.3.2 as well as 7.4.1, but with no
luck. The operating systems we tried were Mandrake 9.1 and Fedora Core 1.

Do we need to do anything special to get Hindi/Unicode sorting
working in Postgresql?

Please help,

Hitesh

---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to ma*******@postgresql.org so that your
message can get through to the mailing list cleanly

Nov 23 '05 #1
8 Replies

P: n/a
Hitesh Bagadiya <ba******@yahoo.com> writes:
Our database contains Hindi as well as English characters. We
have specified the encoding to be unicode during initdb as well
as createdb commands.
Unfortunately sorting of the Hindi fields is not working.


You need to make sure you initdb with the right locale, not only
the right encoding. I dunno which locale you want ... but if
sort(1) sorts the way you want then Postgres should too.

regards, tom lane
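
The sort(1) comparison suggested above can be scripted. The following is a minimal sketch, assuming a Unix-like system with the sort command on the PATH; the helper name sort_lines and the sample words are hypothetical. Note that if the requested locale is not installed, sort may quietly fall back to plain byte order instead of reporting an error.

```python
import os
import subprocess
from typing import List


def sort_lines(lines: List[str], lc_all: str) -> List[str]:
    """Pipe the lines through sort(1) with LC_ALL set to lc_all."""
    # Copy the current environment and override only LC_ALL.
    env = dict(os.environ, LC_ALL=lc_all)
    result = subprocess.run(
        ["sort"],
        input="\n".join(lines) + "\n",
        capture_output=True,
        encoding="utf-8",
        env=env,
        check=True,
    )
    return result.stdout.splitlines()


if __name__ == "__main__":
    # Replace these with a few last names from the table being sorted.
    sample = ["banana", "Apple", "cherry"]
    print(sort_lines(sample, "C"))      # byte order
    print(sort_lines(sample, "hi_IN"))  # locale order, if hi_IN is installed
```

If the two outputs are identical for real Hindi data, the locale is probably missing or is not being applied.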


Nov 23 '05 #2

P: n/a
We did set the locale to hi_IN at initdb, but sorting is not
working. One thing is that the Linux system is running on the en_US
locale. Does this make any difference?

hitesh


Nov 23 '05 #3

P: n/a

You need to make sure you initdb with the right locale, not only
the right encoding.


So in other words, all databases inside postgres must have the same (or at
least a compatible) encoding+locale in order to allow proper sorting or other
locale-dependent things?

With kind regards
Holger Klawitter
--
lists <at> klawitter <dot> de

Nov 23 '05 #4

P: n/a
Hitesh Bagadiya <ba******@yahoo.com> writes:
We did set the locale to hi_IN at initdb but sorting is not
working.


You should check that you have selected a database encoding that matches
what the locale expects. Also double-check that you really do have that
locale selected (use pg_controldata, or in 7.4 just "show lc_collate").

regards, tom lane
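
One way to see which encoding a locale expects is to ask the C library directly. This is a minimal sketch, assuming a Unix-like system (locale.nl_langinfo is not available on Windows); the helper name codeset_for_locale is hypothetical, and the exact codeset string returned varies by platform.

```python
import locale
from typing import Optional


def codeset_for_locale(locale_name: str) -> Optional[str]:
    """Return the character encoding the locale expects, or None if the
    locale cannot be selected on this system."""
    saved = locale.setlocale(locale.LC_CTYPE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_CTYPE, locale_name)
    except locale.Error:
        return None
    try:
        return locale.nl_langinfo(locale.CODESET)
    finally:
        locale.setlocale(locale.LC_CTYPE, saved)  # always restore


if __name__ == "__main__":
    for name in ("C", "en_US", "hi_IN"):
        print(name, "->", codeset_for_locale(name))
```

The value reported for the initdb locale should correspond to the database encoding. In these PostgreSQL versions the UNICODE encoding means UTF-8, so a locale reporting something like ISO-8859-1 would be a mismatch.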


Nov 23 '05 #5

P: n/a
Holger Klawitter <li***@klawitter.de> writes:
So in other words, all databases inside postgres must have the same (or at
least a compatible) encoding+locale


Yup. strcoll()'s locale setting implicitly assumes a particular
encoding (at least on the platforms I'm familiar with), and so selecting
a database encoding that's incompatible with that will give you bizarre
sorting behavior. The apparent freedom to select a per-database
encoding is really illusory in the current PG system, at least if you
have specific ideas about what you want the sort order to be. You
pretty much have to get it right at initdb time.

There was a thread just a day or two back on pgsql-hackers about
generalizing our locale support, which would fix this problem among
others. I'm not sure how soon it will really happen though...

regards, tom lane
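
To illustrate why an encoding mismatch produces bizarre ordering, here is a minimal sketch of what a collation routine that assumes a single-byte encoding would see when handed UTF-8 data. It is only an analogy for the strcoll() behavior described above, not PostgreSQL code; the helper name misread_as_latin1 and the sample word are hypothetical.

```python
def misread_as_latin1(text: str) -> str:
    """Encode text as UTF-8, then decode the bytes as if they were Latin-1,
    the way a routine expecting a single-byte encoding would read them."""
    return text.encode("utf-8").decode("latin-1")


word = "कमल"  # three Devanagari code points
seen = misread_as_latin1(word)

# Each Devanagari character occupies three bytes in UTF-8, so the routine
# sees nine unrelated single-byte characters instead of three letters.
print(len(word), len(seen))  # 3 9
```

Any ordering computed on the misread characters has no relation to the Hindi alphabet, which matches the symptom of rows coming back in an apparently arbitrary order.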


Nov 23 '05 #6

P: n/a
Thanks for your help. The command shows that the locale is
en_US. Now I will try to figure out how to correctly set locale
to hi_IN.

hitesh



Nov 23 '05 #7

P: n/a
I set the locale to hi_IN during initdb. pg_controldata and show
lc_collate both show that the locale is hi_IN. But PostgreSQL is
still not returning sorted records.

What can I do next to get sorting working in PostgreSQL?



Nov 23 '05 #8

P: n/a
Hitesh Bagadiya <ba******@yahoo.com> writes:
I set the locale to hi_IN during initdb. pg_controldata and show
lc_collate both show that locale is hi_IN. But postgresql is not
returning sorted records.


There's still the other point about whether the database's character
set encoding matches what the locale setting requires.

For that matter, are you certain the locale itself works? Have you
checked that sort(1) produces the sort order you are expecting when
LC_ALL=hi_IN?

regards, tom lane
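
Whether the locale itself works can also be checked in-process, without involving the database. This is a minimal sketch using the C library's collation through Python's locale module; the helper name collate_with_locale and the sample words are hypothetical, and the exact locale name to pass (for example "hi_IN" versus "hi_IN.UTF-8") depends on how the system names it.

```python
import locale
from typing import List, Optional


def collate_with_locale(words: List[str], locale_name: str) -> Optional[List[str]]:
    """Sort words by the collation rules of locale_name.

    Returns None when the locale cannot be selected, which usually means
    it is not installed on this machine.
    """
    saved = locale.setlocale(locale.LC_COLLATE)  # remember the current setting
    try:
        locale.setlocale(locale.LC_COLLATE, locale_name)
    except locale.Error:
        return None
    try:
        # strxfrm turns each string into a key that compares like strcoll().
        return sorted(words, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)  # always restore


if __name__ == "__main__":
    # Replace these with a few last names from the table being sorted.
    sample = ["banana", "Apple", "cherry"]
    print(collate_with_locale(sample, "C"))
    print(collate_with_locale(sample, "hi_IN"))  # None if hi_IN is missing
```

A None result points at the operating system's locale installation rather than at PostgreSQL.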


Nov 23 '05 #9
