About upper() and lower() handling multibyte characters

Hi,

While upgrading to 8.0 (beta3) we ran into a problem.

We have a database whose encoding is UNICODE.
When we run a query like:
select upper('中文'); -- some multibyte characters
PostgreSQL responds:

ERROR: invalid multibyte character for locale

But when we run it in a database with SQL_ASCII encoding,
it works and returns the string unchanged, which is what we think is the correct result.

I've searched the archive and found that in 8.0 the upper()/lower()
functions were changed so that they can handle multibyte characters.
But what is the expected behavior of these two functions when dealing with
multibyte characters?

Another question: from the archive, I know that on systems that provide the
<wctype.h> toupper/tolower functions, PostgreSQL supports multibyte
upper/lower. My system (Slackware 10) has <wctype.h>, so why do I still
get the ERROR? How can I check whether my PostgreSQL installation
was built with multibyte upper/lower support?

This problem makes things very difficult for us when we use upper/lower on
columns that hold more than one kind of character, such as Chinese and
English characters in a Unicode database, because the transaction aborts
with the error above, and that breaks our application badly.

Thanks and any help would be appreciated

Laser
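
On the expected-behavior question above, here is a minimal sketch in Python (not PostgreSQL code) of what a multibyte-aware upper() is generally expected to do: characters that have an uppercase form are converted, and characters without one, such as Chinese, pass through unchanged instead of raising an error. The name `upper_multibyte` is hypothetical, and Python's Unicode-aware `str.upper()` stands in for the wide-character routines mentioned in the post.

```python
def upper_multibyte(text: str) -> str:
    """Uppercase characters that have an uppercase form; leave others as-is.

    Illustration only: str.upper() works on Unicode code points, so it does
    not depend on the process locale the way the C library routines do.
    """
    return text.upper()


if __name__ == "__main__":
    print(upper_multibyte("中文"))      # Chinese has no case, so it is unchanged
    print(upper_multibyte("中文 abc"))  # only the Latin letters change
```

This matches the result the poster expected for the pure Chinese string, while still upper-casing the English part of mixed text.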
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faqs/FAQ.html

Nov 23 '05 #1
Weiping <la***@qmail.zhengmai.net.cn> writes:
> We have a database whose encoding is UNICODE.
> When we run a query like:
> select upper('中文'); -- some multibyte characters
> PostgreSQL responds: ERROR: invalid multibyte character for locale


What locale did you initdb in? The most likely explanation for this
is that the LC_CTYPE setting is not unicode-compatible.

regards, tom lane
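
A rough analogy in Python for the explanation above, not PostgreSQL's actual code path: a character set that is not Unicode-compatible cannot interpret UTF-8 multibyte sequences, so the conversion step fails before any case mapping happens. Here the `ascii` codec stands in for a non-Unicode LC_CTYPE, and `try_decode` is a hypothetical helper.

```python
from typing import Optional


def try_decode(raw: bytes, codec: str) -> Optional[str]:
    """Return the decoded text, or None if the bytes are invalid for the codec."""
    try:
        return raw.decode(codec)
    except UnicodeDecodeError:
        return None


if __name__ == "__main__":
    stored = "中文".encode("utf-8")     # bytes as a UTF-8 database would hold them
    print(try_decode(stored, "ascii"))  # None: not valid in a 7-bit character set
    print(try_decode(stored, "utf-8"))  # decodes back to the original text
```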

---------------------------(end of broadcast)---------------------------
TIP 7: don't forget to increase your free space map settings

Nov 23 '05 #2
Tom Lane wrote:
> What locale did you initdb in? The most likely explanation for this
> is that the LC_CTYPE setting is not unicode-compatible.

Hmm, I ran initdb --no-locale, which means LC_CTYPE=C. But if I don't use it,
there are other issues with multibyte comparison (the = operator). I will try
again.

Thanks!

Laser

Nov 23 '05 #3
Weiping wrote:
> Tom Lane wrote:
>> What locale did you initdb in? The most likely explanation for this
>> is that the LC_CTYPE setting is not unicode-compatible.

I finally got it to work. When running initdb, the locale setting must
match the database encoding, like:

initdb --locale=zh_CN.utf8 -E UNICODE ...

Then everything is OK (on my platforms: Slackware 10 and RH9).
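
One way to sanity-check the pairing before running initdb is to look at the codeset suffix of the locale name. The sketch below is a hypothetical helper in Python, not part of PostgreSQL. It only inspects the name as written, and it assumes that a name without an explicit codeset (such as C or en_US) cannot be confirmed as UTF-8.

```python
def locale_codeset(locale_name: str) -> str:
    """Return the normalized codeset part of a POSIX locale name, or '' if none.

    Example shape: language_TERRITORY.codeset@modifier
    """
    _, _, rest = locale_name.partition(".")
    codeset = rest.partition("@")[0]
    # Treat "UTF-8", "utf8" and "UTF_8" as the same spelling.
    return codeset.replace("-", "").replace("_", "").lower()


def is_utf8_locale(locale_name: str) -> bool:
    """True only when the locale name explicitly names a UTF-8 codeset."""
    return locale_codeset(locale_name) == "utf8"


if __name__ == "__main__":
    for name in ("zh_CN.utf8", "zh_CN.UTF-8", "en_US", "C"):
        print(name, is_utf8_locale(name))
```

A name check like this is only a first filter; the locale must also be installed on the system for initdb to accept it.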

Hmm, I think it would be better to add a few words to our docs telling users
to do so. We always used --no-locale with initdb because the default locale
setting of many Linux distros (normally en_US) causes multibyte character
comparison to fail (for example, "select '一' = '二'", which is
"select 'one' = 'two'" in Chinese, returns true). We also use UNICODE as the
database encoding to store multi-language text (such as Japanese and Korean),
and I don't know whether the locale setting (zh_CN.utf8) would conflict with
that.

Any better suggestion?

Thanks

Laser
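
As a side check on the comparison example above, the two characters are different code points with different UTF-8 byte sequences, so a result of true would seem to come from the locale-aware comparison and not from the stored data. A small Python sketch of that check (illustrative only; it does not reproduce PostgreSQL's comparison, and `utf8_hex` is a hypothetical helper):

```python
def utf8_hex(text: str) -> str:
    """Return the UTF-8 encoding of a string as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


if __name__ == "__main__":
    for ch in ("一", "二"):
        print(f"U+{ord(ch):04X}", utf8_hex(ch))
    # A plain byte-wise comparison tells the two strings apart.
    print("一".encode("utf-8") == "二".encode("utf-8"))
```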


Nov 23 '05 #4
