Codepage Win1252

I have been missing this codepage for quite some time, but I
was able to patch another, unneeded mapping to suit my needs.
Unfortunately I wasn't able to add a completely new mapping.
Maybe one of you can do this better... :-)

I have attached some tiny scripts that at least generate the
mappings needed for the src/backend/utils/mb/Unicode/*.map files.

I hope this helps PostgreSQL support more codepages.

Jörg

jschulz@opal:~/programme/postgresql/pgmaps> cat README

Do a copy and paste from a codepage reference under
http://www.microsoft.com/globaldev/r...ce/cphome.mspx

For example win1252 was copied from
http://www.microsoft.com/globaldev/r.../sbcs/1252.htm

then type e.g. make_pgmaps win1252 ...

jschulz@opal:~/programme/postgresql/pgmaps> cat make_pgmaps
#!/bin/bash
# Generate both mapping files for every codepage table named on the command line.

for f in "$@"; do
    echo -e "${f}: ${f}_to_utf8.map...\c"
    ./codepage_to_utf8 "${f}" > "${f}_to_utf8.map"
    echo -e "ok utf8_to_${f}.map...\c"
    ./utf8_to_codepage "${f}" > "utf8_to_${f}.map"
    echo "ok"
done
jschulz@opal:~/programme/postgresql/pgmaps> cat codepage_to_utf8
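#!/bin/bash
# Print one "{codepage byte, UTF-8 bytes}," entry per line of the table file $1.

while read -r l;
do
    cp=$(echo "$l" | cut -c1-2)      # codepage byte, e.g. 80
    u16=$(echo "$l" | cut -c8-11)    # Unicode code point, e.g. 20AC
    u8=$(echo "0x$u16" | recode utf-16/x4..utf-8/x4)
    echo " {0x00$cp, $u8},"
done < "$1" | awk '{print tolower($0)}'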
jschulz@opal:~/programme/postgresql/pgmaps> cat utf8_to_codepage
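#!/bin/bash
# Print one "{UTF-8 bytes, codepage byte}," entry per line of the table file $1,
# sorted by the UTF-8 value.

while read -r l;
do
    cp=$(echo "$l" | cut -c1-2)      # codepage byte, e.g. 80
    u16=$(echo "$l" | cut -c8-11)    # Unicode code point, e.g. 20AC
    u8=$(echo "0x$u16" | recode utf-16/x4..utf-8/x4)
    echo " {$u8, 0x00$cp},"
done < "$1" | awk '{print tolower($0)}' | sort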
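
The recode call in the two scripts above turns a code point taken from the
table into its UTF-8 byte sequence. Where recode is not installed, that one
step could probably be done with shell arithmetic alone. The following is
only a sketch under assumptions: the function name ucs_to_utf8 is made up
here and is not part of the posted scripts, and it handles only code points
up to U+FFFF (no surrogates), which is enough for every entry in the win1252
table.

```shell
#!/bin/bash
# Hypothetical helper, not part of the original scripts: encode a code point
# given as four hex digits (e.g. 20AC) as a lowercase hex UTF-8 byte string.
ucs_to_utf8() {
    local cp=$((0x$1))
    if [ "$cp" -lt 128 ]; then
        # one byte: 0xxxxxxx
        printf '0x%02x\n' "$cp"
    elif [ "$cp" -lt 2048 ]; then
        # two bytes: 110xxxxx 10xxxxxx
        printf '0x%02x%02x\n' $(( 0xC0 | (cp >> 6) )) $(( 0x80 | (cp & 0x3F) ))
    else
        # three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        printf '0x%02x%02x%02x\n' $(( 0xE0 | (cp >> 12) )) \
            $(( 0x80 | ((cp >> 6) & 0x3F) )) $(( 0x80 | (cp & 0x3F) ))
    fi
}

ucs_to_utf8 20AC   # EURO SIGN
ucs_to_utf8 00E4   # LATIN SMALL LETTER A WITH DIAERESIS
```

The results can be compared against the corresponding entries in the
generated map files to see whether this sketch and recode agree.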
jschulz@opal:~/programme/postgresql/pgmaps> cat win1252
80 = U+20AC : EURO SIGN
82 = U+201A : SINGLE LOW-9 QUOTATION MARK
83 = U+0192 : LATIN SMALL LETTER F WITH HOOK
84 = U+201E : DOUBLE LOW-9 QUOTATION MARK
85 = U+2026 : HORIZONTAL ELLIPSIS
86 = U+2020 : DAGGER
87 = U+2021 : DOUBLE DAGGER
88 = U+02C6 : MODIFIER LETTER CIRCUMFLEX ACCENT
89 = U+2030 : PER MILLE SIGN
8A = U+0160 : LATIN CAPITAL LETTER S WITH CARON
8B = U+2039 : SINGLE LEFT-POINTING ANGLE QUOTATION MARK
8C = U+0152 : LATIN CAPITAL LIGATURE OE
8E = U+017D : LATIN CAPITAL LETTER Z WITH CARON
91 = U+2018 : LEFT SINGLE QUOTATION MARK
92 = U+2019 : RIGHT SINGLE QUOTATION MARK
93 = U+201C : LEFT DOUBLE QUOTATION MARK
94 = U+201D : RIGHT DOUBLE QUOTATION MARK
95 = U+2022 : BULLET
96 = U+2013 : EN DASH
97 = U+2014 : EM DASH
98 = U+02DC : SMALL TILDE
99 = U+2122 : TRADE MARK SIGN
9A = U+0161 : LATIN SMALL LETTER S WITH CARON
9B = U+203A : SINGLE RIGHT-POINTING ANGLE QUOTATION MARK
9C = U+0153 : LATIN SMALL LIGATURE OE
9E = U+017E : LATIN SMALL LETTER Z WITH CARON
9F = U+0178 : LATIN CAPITAL LETTER Y WITH DIAERESIS
A0 = U+00A0 : NO-BREAK SPACE
A1 = U+00A1 : INVERTED EXCLAMATION MARK
A2 = U+00A2 : CENT SIGN
A3 = U+00A3 : POUND SIGN
A4 = U+00A4 : CURRENCY SIGN
A5 = U+00A5 : YEN SIGN
A6 = U+00A6 : BROKEN BAR
A7 = U+00A7 : SECTION SIGN
A8 = U+00A8 : DIAERESIS
A9 = U+00A9 : COPYRIGHT SIGN
AA = U+00AA : FEMININE ORDINAL INDICATOR
AB = U+00AB : LEFT-POINTING DOUBLE ANGLE QUOTATION MARK
AC = U+00AC : NOT SIGN
AD = U+00AD : SOFT HYPHEN
AE = U+00AE : REGISTERED SIGN
AF = U+00AF : MACRON
B0 = U+00B0 : DEGREE SIGN
B1 = U+00B1 : PLUS-MINUS SIGN
B2 = U+00B2 : SUPERSCRIPT TWO
B3 = U+00B3 : SUPERSCRIPT THREE
B4 = U+00B4 : ACUTE ACCENT
B5 = U+00B5 : MICRO SIGN
B6 = U+00B6 : PILCROW SIGN
B7 = U+00B7 : MIDDLE DOT
B8 = U+00B8 : CEDILLA
B9 = U+00B9 : SUPERSCRIPT ONE
BA = U+00BA : MASCULINE ORDINAL INDICATOR
BB = U+00BB : RIGHT-POINTING DOUBLE ANGLE QUOTATION MARK
BC = U+00BC : VULGAR FRACTION ONE QUARTER
BD = U+00BD : VULGAR FRACTION ONE HALF
BE = U+00BE : VULGAR FRACTION THREE QUARTERS
BF = U+00BF : INVERTED QUESTION MARK
C0 = U+00C0 : LATIN CAPITAL LETTER A WITH GRAVE
C1 = U+00C1 : LATIN CAPITAL LETTER A WITH ACUTE
C2 = U+00C2 : LATIN CAPITAL LETTER A WITH CIRCUMFLEX
C3 = U+00C3 : LATIN CAPITAL LETTER A WITH TILDE
C4 = U+00C4 : LATIN CAPITAL LETTER A WITH DIAERESIS
C5 = U+00C5 : LATIN CAPITAL LETTER A WITH RING ABOVE
C6 = U+00C6 : LATIN CAPITAL LETTER AE
C7 = U+00C7 : LATIN CAPITAL LETTER C WITH CEDILLA
C8 = U+00C8 : LATIN CAPITAL LETTER E WITH GRAVE
C9 = U+00C9 : LATIN CAPITAL LETTER E WITH ACUTE
CA = U+00CA : LATIN CAPITAL LETTER E WITH CIRCUMFLEX
CB = U+00CB : LATIN CAPITAL LETTER E WITH DIAERESIS
CC = U+00CC : LATIN CAPITAL LETTER I WITH GRAVE
CD = U+00CD : LATIN CAPITAL LETTER I WITH ACUTE
CE = U+00CE : LATIN CAPITAL LETTER I WITH CIRCUMFLEX
CF = U+00CF : LATIN CAPITAL LETTER I WITH DIAERESIS
D0 = U+00D0 : LATIN CAPITAL LETTER ETH
D1 = U+00D1 : LATIN CAPITAL LETTER N WITH TILDE
D2 = U+00D2 : LATIN CAPITAL LETTER O WITH GRAVE
D3 = U+00D3 : LATIN CAPITAL LETTER O WITH ACUTE
D4 = U+00D4 : LATIN CAPITAL LETTER O WITH CIRCUMFLEX
D5 = U+00D5 : LATIN CAPITAL LETTER O WITH TILDE
D6 = U+00D6 : LATIN CAPITAL LETTER O WITH DIAERESIS
D7 = U+00D7 : MULTIPLICATION SIGN
D8 = U+00D8 : LATIN CAPITAL LETTER O WITH STROKE
D9 = U+00D9 : LATIN CAPITAL LETTER U WITH GRAVE
DA = U+00DA : LATIN CAPITAL LETTER U WITH ACUTE
DB = U+00DB : LATIN CAPITAL LETTER U WITH CIRCUMFLEX
DC = U+00DC : LATIN CAPITAL LETTER U WITH DIAERESIS
DD = U+00DD : LATIN CAPITAL LETTER Y WITH ACUTE
DE = U+00DE : LATIN CAPITAL LETTER THORN
DF = U+00DF : LATIN SMALL LETTER SHARP S
E0 = U+00E0 : LATIN SMALL LETTER A WITH GRAVE
E1 = U+00E1 : LATIN SMALL LETTER A WITH ACUTE
E2 = U+00E2 : LATIN SMALL LETTER A WITH CIRCUMFLEX
E3 = U+00E3 : LATIN SMALL LETTER A WITH TILDE
E4 = U+00E4 : LATIN SMALL LETTER A WITH DIAERESIS
E5 = U+00E5 : LATIN SMALL LETTER A WITH RING ABOVE
E6 = U+00E6 : LATIN SMALL LETTER AE
E7 = U+00E7 : LATIN SMALL LETTER C WITH CEDILLA
E8 = U+00E8 : LATIN SMALL LETTER E WITH GRAVE
E9 = U+00E9 : LATIN SMALL LETTER E WITH ACUTE
EA = U+00EA : LATIN SMALL LETTER E WITH CIRCUMFLEX
EB = U+00EB : LATIN SMALL LETTER E WITH DIAERESIS
EC = U+00EC : LATIN SMALL LETTER I WITH GRAVE
ED = U+00ED : LATIN SMALL LETTER I WITH ACUTE
EE = U+00EE : LATIN SMALL LETTER I WITH CIRCUMFLEX
EF = U+00EF : LATIN SMALL LETTER I WITH DIAERESIS
F0 = U+00F0 : LATIN SMALL LETTER ETH
F1 = U+00F1 : LATIN SMALL LETTER N WITH TILDE
F2 = U+00F2 : LATIN SMALL LETTER O WITH GRAVE
F3 = U+00F3 : LATIN SMALL LETTER O WITH ACUTE
F4 = U+00F4 : LATIN SMALL LETTER O WITH CIRCUMFLEX
F5 = U+00F5 : LATIN SMALL LETTER O WITH TILDE
F6 = U+00F6 : LATIN SMALL LETTER O WITH DIAERESIS
F7 = U+00F7 : DIVISION SIGN
F8 = U+00F8 : LATIN SMALL LETTER O WITH STROKE
F9 = U+00F9 : LATIN SMALL LETTER U WITH GRAVE
FA = U+00FA : LATIN SMALL LETTER U WITH ACUTE
FB = U+00FB : LATIN SMALL LETTER U WITH CIRCUMFLEX
FC = U+00FC : LATIN SMALL LETTER U WITH DIAERESIS
FD = U+00FD : LATIN SMALL LETTER Y WITH ACUTE
FE = U+00FE : LATIN SMALL LETTER THORN
FF = U+00FF : LATIN SMALL LETTER Y WITH DIAERESIS
jschulz@opal:~/programme/postgresql/pgmaps> cat utf8_to_win1252.map
{0xc2a0, 0x00a0},
{0xc2a1, 0x00a1},
{0xc2a2, 0x00a2},
{0xc2a3, 0x00a3},
{0xc2a4, 0x00a4},
{0xc2a5, 0x00a5},
{0xc2a6, 0x00a6},
{0xc2a7, 0x00a7},
{0xc2a8, 0x00a8},
{0xc2a9, 0x00a9},
{0xc2aa, 0x00aa},
{0xc2ab, 0x00ab},
{0xc2ac, 0x00ac},
{0xc2ad, 0x00ad},
{0xc2ae, 0x00ae},
{0xc2af, 0x00af},
{0xc2b0, 0x00b0},
{0xc2b1, 0x00b1},
{0xc2b2, 0x00b2},
{0xc2b3, 0x00b3},
{0xc2b4, 0x00b4},
{0xc2b5, 0x00b5},
{0xc2b6, 0x00b6},
{0xc2b7, 0x00b7},
{0xc2b8, 0x00b8},
{0xc2b9, 0x00b9},
{0xc2ba, 0x00ba},
{0xc2bb, 0x00bb},
{0xc2bc, 0x00bc},
{0xc2bd, 0x00bd},
{0xc2be, 0x00be},
{0xc2bf, 0x00bf},
{0xc380, 0x00c0},
{0xc381, 0x00c1},
{0xc382, 0x00c2},
{0xc383, 0x00c3},
{0xc384, 0x00c4},
{0xc385, 0x00c5},
{0xc386, 0x00c6},
{0xc387, 0x00c7},
{0xc388, 0x00c8},
{0xc389, 0x00c9},
{0xc38a, 0x00ca},
{0xc38b, 0x00cb},
{0xc38c, 0x00cc},
{0xc38d, 0x00cd},
{0xc38e, 0x00ce},
{0xc38f, 0x00cf},
{0xc390, 0x00d0},
{0xc391, 0x00d1},
{0xc392, 0x00d2},
{0xc393, 0x00d3},
{0xc394, 0x00d4},
{0xc395, 0x00d5},
{0xc396, 0x00d6},
{0xc397, 0x00d7},
{0xc398, 0x00d8},
{0xc399, 0x00d9},
{0xc39a, 0x00da},
{0xc39b, 0x00db},
{0xc39c, 0x00dc},
{0xc39d, 0x00dd},
{0xc39e, 0x00de},
{0xc39f, 0x00df},
{0xc3a0, 0x00e0},
{0xc3a1, 0x00e1},
{0xc3a2, 0x00e2},
{0xc3a3, 0x00e3},
{0xc3a4, 0x00e4},
{0xc3a5, 0x00e5},
{0xc3a6, 0x00e6},
{0xc3a7, 0x00e7},
{0xc3a8, 0x00e8},
{0xc3a9, 0x00e9},
{0xc3aa, 0x00ea},
{0xc3ab, 0x00eb},
{0xc3ac, 0x00ec},
{0xc3ad, 0x00ed},
{0xc3ae, 0x00ee},
{0xc3af, 0x00ef},
{0xc3b0, 0x00f0},
{0xc3b1, 0x00f1},
{0xc3b2, 0x00f2},
{0xc3b3, 0x00f3},
{0xc3b4, 0x00f4},
{0xc3b5, 0x00f5},
{0xc3b6, 0x00f6},
{0xc3b7, 0x00f7},
{0xc3b8, 0x00f8},
{0xc3b9, 0x00f9},
{0xc3ba, 0x00fa},
{0xc3bb, 0x00fb},
{0xc3bc, 0x00fc},
{0xc3bd, 0x00fd},
{0xc3be, 0x00fe},
{0xc3bf, 0x00ff},
{0xc592, 0x008c},
{0xc593, 0x009c},
{0xc5a0, 0x008a},
{0xc5a1, 0x009a},
{0xc5b8, 0x009f},
{0xc5bd, 0x008e},
{0xc5be, 0x009e},
{0xc692, 0x0083},
{0xcb86, 0x0088},
{0xcb9c, 0x0098},
{0xe28093, 0x0096},
{0xe28094, 0x0097},
{0xe28098, 0x0091},
{0xe28099, 0x0092},
{0xe2809a, 0x0082},
{0xe2809c, 0x0093},
{0xe2809d, 0x0094},
{0xe2809e, 0x0084},
{0xe280a0, 0x0086},
{0xe280a1, 0x0087},
{0xe280a2, 0x0095},
{0xe280a6, 0x0085},
{0xe280b0, 0x0089},
{0xe280b9, 0x008b},
{0xe280ba, 0x009b},
{0xe282ac, 0x0080},
{0xe284a2, 0x0099},
jschulz@opal:~/programme/postgresql/pgmaps> cat win1252_to_utf8.map
{0x0080, 0xe282ac},
{0x0082, 0xe2809a},
{0x0083, 0xc692},
{0x0084, 0xe2809e},
{0x0085, 0xe280a6},
{0x0086, 0xe280a0},
{0x0087, 0xe280a1},
{0x0088, 0xcb86},
{0x0089, 0xe280b0},
{0x008a, 0xc5a0},
{0x008b, 0xe280b9},
{0x008c, 0xc592},
{0x008e, 0xc5bd},
{0x0091, 0xe28098},
{0x0092, 0xe28099},
{0x0093, 0xe2809c},
{0x0094, 0xe2809d},
{0x0095, 0xe280a2},
{0x0096, 0xe28093},
{0x0097, 0xe28094},
{0x0098, 0xcb9c},
{0x0099, 0xe284a2},
{0x009a, 0xc5a1},
{0x009b, 0xe280ba},
{0x009c, 0xc593},
{0x009e, 0xc5be},
{0x009f, 0xc5b8},
{0x00a0, 0xc2a0},
{0x00a1, 0xc2a1},
{0x00a2, 0xc2a2},
{0x00a3, 0xc2a3},
{0x00a4, 0xc2a4},
{0x00a5, 0xc2a5},
{0x00a6, 0xc2a6},
{0x00a7, 0xc2a7},
{0x00a8, 0xc2a8},
{0x00a9, 0xc2a9},
{0x00aa, 0xc2aa},
{0x00ab, 0xc2ab},
{0x00ac, 0xc2ac},
{0x00ad, 0xc2ad},
{0x00ae, 0xc2ae},
{0x00af, 0xc2af},
{0x00b0, 0xc2b0},
{0x00b1, 0xc2b1},
{0x00b2, 0xc2b2},
{0x00b3, 0xc2b3},
{0x00b4, 0xc2b4},
{0x00b5, 0xc2b5},
{0x00b6, 0xc2b6},
{0x00b7, 0xc2b7},
{0x00b8, 0xc2b8},
{0x00b9, 0xc2b9},
{0x00ba, 0xc2ba},
{0x00bb, 0xc2bb},
{0x00bc, 0xc2bc},
{0x00bd, 0xc2bd},
{0x00be, 0xc2be},
{0x00bf, 0xc2bf},
{0x00c0, 0xc380},
{0x00c1, 0xc381},
{0x00c2, 0xc382},
{0x00c3, 0xc383},
{0x00c4, 0xc384},
{0x00c5, 0xc385},
{0x00c6, 0xc386},
{0x00c7, 0xc387},
{0x00c8, 0xc388},
{0x00c9, 0xc389},
{0x00ca, 0xc38a},
{0x00cb, 0xc38b},
{0x00cc, 0xc38c},
{0x00cd, 0xc38d},
{0x00ce, 0xc38e},
{0x00cf, 0xc38f},
{0x00d0, 0xc390},
{0x00d1, 0xc391},
{0x00d2, 0xc392},
{0x00d3, 0xc393},
{0x00d4, 0xc394},
{0x00d5, 0xc395},
{0x00d6, 0xc396},
{0x00d7, 0xc397},
{0x00d8, 0xc398},
{0x00d9, 0xc399},
{0x00da, 0xc39a},
{0x00db, 0xc39b},
{0x00dc, 0xc39c},
{0x00dd, 0xc39d},
{0x00de, 0xc39e},
{0x00df, 0xc39f},
{0x00e0, 0xc3a0},
{0x00e1, 0xc3a1},
{0x00e2, 0xc3a2},
{0x00e3, 0xc3a3},
{0x00e4, 0xc3a4},
{0x00e5, 0xc3a5},
{0x00e6, 0xc3a6},
{0x00e7, 0xc3a7},
{0x00e8, 0xc3a8},
{0x00e9, 0xc3a9},
{0x00ea, 0xc3aa},
{0x00eb, 0xc3ab},
{0x00ec, 0xc3ac},
{0x00ed, 0xc3ad},
{0x00ee, 0xc3ae},
{0x00ef, 0xc3af},
{0x00f0, 0xc3b0},
{0x00f1, 0xc3b1},
{0x00f2, 0xc3b2},
{0x00f3, 0xc3b3},
{0x00f4, 0xc3b4},
{0x00f5, 0xc3b5},
{0x00f6, 0xc3b6},
{0x00f7, 0xc3b7},
{0x00f8, 0xc3b8},
{0x00f9, 0xc3b9},
{0x00fa, 0xc3ba},
{0x00fb, 0xc3bb},
{0x00fc, 0xc3bc},
{0x00fd, 0xc3bd},
{0x00fe, 0xc3be},
{0x00ff, 0xc3bf},
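
Since both map files are generated from the same table, each should be the
other with its two columns swapped. A quick sanity check along those lines
might look like the sketch below. The function name maps_are_inverse is
hypothetical, and the check assumes the "{0x..., 0x...}," line format shown
in the listings above.

```shell
#!/bin/bash
# Hypothetical check, not part of the original scripts: succeed only if the
# second map file holds exactly the pairs of the first with columns swapped.
maps_are_inverse() {
    local swapped expected
    # Turn "{0x0080, 0xe282ac}," into "{0xe282ac, 0x0080}," and sort.
    swapped=$(sed -E 's/\{(0x[0-9a-f]+), (0x[0-9a-f]+)\},/{\2, \1},/' "$1" | LC_ALL=C sort)
    expected=$(LC_ALL=C sort "$2")
    [ "$swapped" = "$expected" ]
}

# Example call, assuming both files are in the current directory:
#   maps_are_inverse win1252_to_utf8.map utf8_to_win1252.map && echo "maps agree"
```

Sorting both sides first keeps the comparison independent of the order in
which the entries were written.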
Nov 11 '05 #1