choosing a server codeset

Are there advantages to choosing, say, IBM-1252 over UTF-8? If my PC
application uses code page 1252 will it perform better because no code page
translation is required? I assume so. What type of performance hit might I
expect when connecting to a UTF-8 database? What advantages would I get by
using a UTF-8 database? Obviously it can store the entire Unicode 'plane'
(or whatever that's called), but if my PC can't display it anyway what do I
really care? And I guess that storing XML data requires UTF-8? But I don't
think we plan on utilizing this.

What else should we know to make our decision?

Thanks,
Frank

Jan 16 '08 #1
Hi Frank!

If the database contains national characters other than A-Z and a-z, then in
UTF-8 a table column declared as CHAR(8) will have room for only 4 to 8
characters, since characters like ÅÄÖÉÜ take 2 bytes each in UTF-8. If you
don't work with multiple national languages, go for a character set that
suits your situation. If you need to work with XML data, put it in a
separate database.
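
To illustrate the byte arithmetic, here is a minimal Python sketch. It assumes Python's `cp1252` codec is a reasonable stand-in for the IBM-1252 codeset, and the helper name `byte_lengths` is made up for this example.

```python
def byte_lengths(text):
    """Return (character count, bytes in code page 1252, bytes in UTF-8)."""
    return len(text), len(text.encode("cp1252")), len(text.encode("utf-8"))


# Plain A-Z text costs one byte per character in both encodings.
print(byte_lengths("ABCDEFGH"))
# Each accented letter costs one byte in 1252 but two bytes in UTF-8.
print(byte_lengths("ÅÄÖÉÜ"))
```

Eight such accented letters would need 16 bytes in UTF-8, so a column limited to 8 bytes holds only four of them.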
/dg

Jan 16 '08 #2
Hi

Some characters that are single-byte in 1252 are multi-byte in UTF-8. With
a standard UK keyboard I think that there are 3 or 4 characters that are
multi-byte in UTF-8.

I like and prefer UTF-8, but the applications must be coded for UTF-8. E.g. if
you have an 8-byte character column and an 8-byte (1252) entry field and
fill the entry field using at least 1 of the UTF-8 multi-byte characters, you
will get a data truncation error. Also, you need to be careful about the
number of characters in a column, as the byte count is not necessarily the
character count.
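
The truncation scenario can be checked before the data ever reaches the database. The following is only a sketch: `fits_in_column` is a hypothetical helper, the column limit is assumed to be counted in bytes, and Python's `cp1252` codec stands in for code page 1252.

```python
def fits_in_column(text, max_bytes, encoding="utf-8"):
    """Return True if the encoded text fits in a column of max_bytes bytes."""
    return len(text.encode(encoding)) <= max_bytes


value = "£1000.00"  # 8 characters, as typed into an 8-character entry field

print(len(value))
print(fits_in_column(value, 8, "cp1252"))  # 8 bytes in code page 1252
print(fits_in_column(value, 8, "utf-8"))   # 9 bytes in UTF-8: the pound sign takes 2
```

The same 8 characters fit an 8-byte column in 1252 but not in UTF-8, which is the situation that would lead to the truncation error described above.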

Things are becoming much more global. I have moved to France but still have
some accounts and investments in the UK. I also purchase some things from
the UK, and my address contains accents.
Colin
Jan 16 '08 #3
I question your comment that the applications must be coded for UTF-8. I just
wrote an OpenCobol application with embedded DB2. No special "UTF-8"
coding, whatever that might mean. All it does is connect to the database,
retrieve the "string" and "hex" values of a set of VARCHAR(25) columns, and
display those values.

I run this against two databases:
TEST1 is a database defined as codeset IBM-1252.
UTFDB is a database defined as codeset UTF-8.

Here are the results:

CONNECT TO test1
5B544553545D
+0006: [TEST]
7C544553547C
+0006: |TEST|
A654455354A6
+0006: ¦TEST¦
80
+0001: €

CONNECT TO utfdb
5B544553545D
+0006: [TEST]
7C544553547C
+0006: |TEST|
C2A654455354C2A6
+0006: ¦TEST¦
E282AC
+0001: €

(+0001: € <== that actually shows as the euro symbol in Notepad.)

As you can see, for the UTF-8 database the euro symbol was stored as
x'E282AC'. But since my application used code page 1252, DB2 was smart
enough to translate it to x'80', which is the value for the euro in code
page 1252.

Now of course when there is a symbol that exists in UTF-8 and not in 1252
then there will be a problem.
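
The hex values in the listing can be reproduced outside DB2. This Python sketch is only an illustration: it assumes Python's `cp1252` and `utf-8` codecs match the two codesets, `hex_bytes` is a made-up helper, and it says nothing about how DB2 itself reports a character it cannot convert.

```python
def hex_bytes(text, encoding):
    """Return the encoded bytes of text as an uppercase hex string."""
    return text.encode(encoding).hex().upper()


# Same strings as in the listing: 1252 bytes first, then UTF-8 bytes.
for sample in ("[TEST]", "|TEST|", "¦TEST¦", "€"):
    print(sample, hex_bytes(sample, "cp1252"), hex_bytes(sample, "utf-8"))

# The translation step: bytes stored as UTF-8, handed to a 1252 client.
stored = bytes.fromhex("E282AC")
print(stored.decode("utf-8").encode("cp1252").hex().upper())

# A character that has no 1252 equivalent cannot be translated at all.
try:
    "Ω".encode("cp1252")
except UnicodeEncodeError as exc:
    print("cannot convert:", exc.reason)
```

Only the two non-ASCII samples differ between the encodings; the ASCII-only strings have identical bytes in both.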

I guess your point is, and it's a good one, that if a CHAR or VARCHAR column
is defined in a UTF-8 database then you, in a sense, have to "over-define"
the length to take into account the possibility of multi-byte characters?
For instance, a 1-character field that could possibly contain a multi-byte
UTF-8 character (such as the euro symbol) would have to be defined in the
database as, say, CHAR(3).
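
One way to estimate how far to "over-define" is to find the widest UTF-8 encoding of any character the client code page can produce. This is a sketch under assumptions: both function names are hypothetical, Python's `cp1252` codec stands in for the client code page, and it only covers data that originates in 1252 (other Unicode characters can take up to 4 bytes in UTF-8).

```python
def worst_case_utf8_bytes(codec="cp1252"):
    """Return the largest UTF-8 size of any character in a single-byte code page."""
    widest = 0
    for value in range(256):
        try:
            char = bytes([value]).decode(codec)
        except UnicodeDecodeError:
            continue  # this byte value is not assigned in the code page
        widest = max(widest, len(char.encode("utf-8")))
    return widest


def utf8_column_bytes(max_chars, codec="cp1252"):
    """Return a byte length that holds max_chars characters from the code page."""
    return max_chars * worst_case_utf8_bytes(codec)


print(worst_case_utf8_bytes())   # 3, reached by the euro symbol among others
print(utf8_column_bytes(1))      # 3, matching the CHAR(3) example
print(utf8_column_bytes(50))     # 150 for a 50-character address line
```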

This does bring to mind a question I have been pondering. Is there any harm
in defining 'string' fields to be much larger than the largest string length
that you would ever expect? Like an address line. It might be 50 or so
characters. Is there harm in defining it as VARCHAR(250) or even
VARCHAR(32000)? Does it waste space or any other resource?

Thanks for your help.

Frank
Jan 18 '08 #4
