Howto: utf2code()

  #1  
Old July 17th, 2005, 06:43 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a
I found a nice function for converting ordinal values (code points) to
UTF-8, code2utf() <http://in2.php.net/utf8_encode#34214>, but I couldn't
find its reverse. The uniord() function in the user notes
<http://in.php.net/ord#42778> comes close, but it uses mbstring.
Does anyone know how to do this without mbstring? TIA

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com
  #2  
Old July 17th, 2005, 06:44 AM
Asgeir Frimannsson
Guest
 
Posts: n/a

re: Howto: utf2code()


R. Rajesh Jeba Anbiah wrote:
> I have found a nice code to convert ordinal values to utf (code2utf()
> <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> reverse of that. The uniord() function found at usernotes
> <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> Anyone have any idea to do it without using mbstring? TIA

Here's a way of doing it in php:
http://www.randomchaos.com/document....hp_and_unicode

hope that's what you're looking for.

asgeir
  #3  
Old July 17th, 2005, 06:44 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a

re: Howto: utf2code()


Asgeir Frimannsson <a.frimannsson@student.qut.edu.au> wrote in message news:<40c82398$0$29804$5a62ac22@freenews.iinet.net.au>...
> R. Rajesh Jeba Anbiah wrote:
>
> > I have found a nice code to convert ordinal values to utf (code2utf()
> > <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> > reverse of that. The uniord() function found at usernotes
> > <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> > Anyone have any idea to do it without using mbstring? TIA
>
> Here's a way of doing it in php:
> http://www.randomchaos.com/document....hp_and_unicode

Sorry, that's not quite it. I'm looking for an ord()-like function that
accepts a UTF-8 string as its parameter and returns the ordinal value
as an integer.

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com
  #4  
Old July 17th, 2005, 07:47 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a

re: Howto: utf2code()


Polite cross-post to comp.programming. The original thread was in
comp.lang.php <http://groups.google.com/groups?threadm=abc4d8b8.0406090601.66a05a58%40posting.google.com>
I'm posting here to get ideas on the logic.

<Previous Post>
ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah) wrote in message news:<abc4d8b8.0406090601.66a05a58@posting.google.com>...
> I have found a nice code to convert ordinal values to utf (code2utf()
> <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> reverse of that. The uniord() function found at usernotes
> <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> Anyone have any idea to do it without using mbstring? TIA
</Previous Post>

So, I'm looking for a way to get the ordinal value (code point) of a
given Unicode character. For example, if we pass the character 'A' to
the function, it has to return 65; if we pass 'க' it has to return
0x0B95; and so on.

Any help with the logic or concept would be highly appreciated. TIA
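
To make the problem concrete, here is a small sketch of why the built-in ord() is not enough. PHP strings are byte sequences, so ord() only ever looks at the first byte. The variable name $ch is just for illustration, and the character is spelled out as escaped bytes so the snippet does not depend on the file's encoding.

```php
<?php
// 'க' (U+0B95) written out as its three UTF-8 bytes.
$ch = "\xE0\xAE\x95";

echo strlen($ch), "\n";   // 3: strlen() counts bytes, not characters
echo bin2hex($ch), "\n";  // e0ae95
echo ord($ch), "\n";      // 224: only the first byte (0xE0) is examined
echo ord('A'), "\n";      // 65: ASCII fits in one byte, so ord() is enough
```

So the wanted function has to combine all three bytes into the single value 0x0B95.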

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com
  #5  
Old July 17th, 2005, 07:48 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a

re: Howto: utf2code()


ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah) wrote in message news:<abc4d8b8.0406280310.1fa72985@posting.google.com>...
> Polite cross-post to comp.programming. Original thread was in
> comp.lang.php <http://groups.google.com/groups?threadm=abc4d8b8.0406090601.66a05a58%40posting.google.com>
> I post here to get some logic or idea.
>
> <Previous Post>
> ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah) wrote in message news:<abc4d8b8.0406090601.66a05a58@posting.google.com>...
> > I have found a nice code to convert ordinal values to utf (code2utf()
> > <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> > reverse of that. The uniord() function found at usernotes
> > <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> > Anyone have any idea to do it without using mbstring? TIA
> </Previous Post>
>
> So, I'm here looking for how to get ordinal value for given Unicode
> character. For example, if we pass integer 'A' to the function, it has
> to return 65; if we pass 'க' to the function it has to return
> 0x0B95; and so on.
>
> Any help on logic or concept is highly appreciated. TIA

Never mind. After scratching my head last night, I realized this is
nothing more than decoding UTF-8 to a Unicode code point. I found a nice
algorithm at <http://www1.tip.nl/~t876506/utf8tbl.html> and, based on
that, wrote some code that seems to work fine.

<?php
// Algorithm: http://www1.tip.nl/~t876506/utf8tbl.html
// Returns the ordinal value (code point) of the first character of the
// UTF-8 string $c, or false if the first byte cannot start a character.
// Logic: UTF-8 to Unicode conversion. z is the lead byte; y, x, w, v, u
// are the continuation bytes that follow it.
function uniord($c)
{
    $z = ord($c[0]); // $c[0] rather than $c{0}, which PHP 8 no longer accepts
    if ($z <= 127) {
        // 0 - 127: 1 byte z; ud = z
        return $z;
    }
    if ($z >= 192 && $z <= 223) {
        // 192 - 223: 2 bytes z y; ud = (z-192)*64 + (y-128)
        return ($z-192)*64 + (ord($c[1])-128);
    }
    if ($z >= 224 && $z <= 239) {
        // 224 - 239: 3 bytes z y x; ud = (z-224)*4096 + (y-128)*64 + (x-128)
        return ($z-224)*4096 + (ord($c[1])-128)*64 + (ord($c[2])-128);
    }
    if ($z >= 240 && $z <= 247) {
        // 240 - 247: 4 bytes z y x w;
        // ud = (z-240)*262144 + (y-128)*4096 + (x-128)*64 + (w-128)
        return ($z-240)*262144 + (ord($c[1])-128)*4096
            + (ord($c[2])-128)*64 + (ord($c[3])-128);
    }
    // The 5- and 6-byte forms below come from the original UTF-8 design;
    // RFC 3629 restricts UTF-8 to at most 4 bytes.
    if ($z >= 248 && $z <= 251) {
        // 248 - 251: 5 bytes z y x w v; ud = (z-248)*16777216
        // + (y-128)*262144 + (x-128)*4096 + (w-128)*64 + (v-128)
        return ($z-248)*16777216 + (ord($c[1])-128)*262144
            + (ord($c[2])-128)*4096 + (ord($c[3])-128)*64 + (ord($c[4])-128);
    }
    if ($z >= 252 && $z <= 253) {
        // 252 or 253: 6 bytes z y x w v u; ud = (z-252)*1073741824
        // + (y-128)*16777216 + (x-128)*262144 + (w-128)*4096
        // + (v-128)*64 + (u-128)
        return ($z-252)*1073741824 + (ord($c[1])-128)*16777216
            + (ord($c[2])-128)*262144 + (ord($c[3])-128)*4096
            + (ord($c[4])-128)*64 + (ord($c[5])-128);
    }
    // 128 - 191 is a continuation byte and 254 - 255 never occur in UTF-8,
    // so something is wrong with the input.
    return false;
}
?>
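
For comparison, the same table can be expressed with bit masks and shifts instead of subtraction and multiplication. What follows is only a sketch, not part of the algorithm page linked above: the name utf8_first_codepoint() is made up for illustration, it assumes PHP 7 or later (for the string type hint), and it deliberately stops at 4-byte sequences and rejects malformed input, which uniord() does not check for.

```php
<?php
// Hypothetical helper (not from the thread): decode the first UTF-8
// character of $s to its code point, or return false on malformed input.
function utf8_first_codepoint(string $s)
{
    if ($s === '') {
        return false;
    }
    $lead = ord($s[0]);
    if ($lead < 0x80) {                    // 0xxxxxxx: single byte (ASCII)
        return $lead;
    } elseif (($lead & 0xE0) === 0xC0) {   // 110xxxxx: 2 bytes
        $len = 2; $cp = $lead & 0x1F; $min = 0x80;
    } elseif (($lead & 0xF0) === 0xE0) {   // 1110xxxx: 3 bytes
        $len = 3; $cp = $lead & 0x0F; $min = 0x800;
    } elseif (($lead & 0xF8) === 0xF0) {   // 11110xxx: 4 bytes
        $len = 4; $cp = $lead & 0x07; $min = 0x10000;
    } else {
        return false;                      // continuation or invalid lead byte
    }
    if (strlen($s) < $len) {
        return false;                      // truncated sequence
    }
    for ($i = 1; $i < $len; $i++) {
        $b = ord($s[$i]);
        if (($b & 0xC0) !== 0x80) {
            return false;                  // not a 10xxxxxx continuation byte
        }
        $cp = ($cp << 6) | ($b & 0x3F);    // append the 6 payload bits
    }
    // Reject overlong forms, UTF-16 surrogates, and values past U+10FFFF.
    if ($cp < $min || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
        return false;
    }
    return $cp;
}

var_dump(utf8_first_codepoint('A'));             // int(65)
var_dump(utf8_first_codepoint("\xE0\xAE\x95"));  // int(2965), i.e. 0x0B95
```

The shift by 6 is the same as the multiplication by 64 in uniord(); the extra checks only matter if the input may not be well-formed UTF-8.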

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com