
Howto: utf2code()

 
  #1  
Old July 17th, 2005, 05:43 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a
Default Howto: utf2code()

I have found some nice code to convert ordinal values to UTF-8 (code2utf()
<http://in2.php.net/utf8_encode#34214>), but I couldn't find the
reverse of it. The uniord() function in the user notes
<http://in.php.net/ord#42778> comes close, but it uses mbstring.
Does anyone have an idea how to do it without mbstring? TIA

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com

  #2  
Old July 17th, 2005, 05:44 AM
Asgeir Frimannsson
Guest
 
Posts: n/a
Default Re: Howto: utf2code()

R. Rajesh Jeba Anbiah wrote:
> I have found a nice code to convert ordinal values to utf (code2utf()
> <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> reverse of that. The uniord() function found at usernotes
> <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> Anyone have any idea to do it without using mbstring? TIA

Here's a way of doing it in PHP:
http://www.randomchaos.com/document....hp_and_unicode

Hope that's what you're looking for.

asgeir
  #3  
Old July 17th, 2005, 05:44 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a
Default Re: Howto: utf2code()

Asgeir Frimannsson <a.frimannsson@student.qut.edu.au> wrote in message news:<40c82398$0$29804$5a62ac22@freenews.iinet.net.au>...
> R. Rajesh Jeba Anbiah wrote:
>
> > I have found a nice code to convert ordinal values to utf (code2utf()
> > <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> > reverse of that. The uniord() function found at usernotes
> > <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> > Anyone have any idea to do it without using mbstring? TIA
>
> Here's a way of doing it in php:
> http://www.randomchaos.com/document....hp_and_unicode

Sorry... I'm looking for an ord()-like function that accepts a UTF-8
string as its parameter and returns the ordinal value as an integer.
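
To make the gap concrete, here is a minimal sketch. The byte string is only an illustration, written with hex escapes so that it does not depend on the encoding of the source file. The built-in ord() reads just the first byte of its argument, so for a multi-byte character it gives back the lead byte rather than the code point:

```php
<?php
// "\xE0\xAE\x95" is the UTF-8 encoding of U+0B95 (Tamil letter KA).
$ka = "\xE0\xAE\x95";

$firstByte = ord($ka);     // ord() looks only at the first byte
$byteCount = strlen($ka);  // strlen() counts bytes, not characters
$hex       = bin2hex($ka); // the raw bytes in hexadecimal

echo $firstByte, "\n"; // 224 (0xE0), not 2965 (0x0B95)
echo $byteCount, "\n"; // 3
echo $hex, "\n";       // e0ae95
```

What is wanted is a function that combines those three bytes back into 0x0B95.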

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com
  #4  
Old July 17th, 2005, 06:47 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a
Default [Polite cross-post]Re: Howto: utf2code()

Polite cross-post to comp.programming. The original thread was in
comp.lang.php <http://groups.google.com/groups?threadm=abc4d8b8.0406090601.66a05a58%40posting.google.com>
I am posting here to get some ideas on the logic.

<Previous Post>
ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah) wrote in message news:<abc4d8b8.0406090601.66a05a58@posting.google.com>...
> I have found a nice code to convert ordinal values to utf (code2utf()
> <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> reverse of that. The uniord() function found at usernotes
> <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> Anyone have any idea to do it without using mbstring? TIA
</Previous Post>

So, I'm looking for a way to get the ordinal value of a given Unicode
character. For example, if we pass the character 'A' to the function, it
has to return 65; if we pass 'க' (Tamil letter KA, encoded in UTF-8), it
has to return 0x0B95; and so on.

Any help on logic or concept is highly appreciated. TIA

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com
  #5  
Old July 17th, 2005, 06:48 AM
R. Rajesh Jeba Anbiah
Guest
 
Posts: n/a
Default Re: [Polite cross-post]Re: Howto: utf2code()

ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah) wrote in message news:<abc4d8b8.0406280310.1fa72985@posting.google.com>...
> Polite cross-post to comp.programming. Original thread was in
> comp.lang.php <http://groups.google.com/groups?threadm=abc4d8b8.0406090601.66a05a58%40posting.google.com>
> I post here to get some logic or idea.
>
> <Previous Post>
> ng4rrjanbiah@rediffmail.com (R. Rajesh Jeba Anbiah) wrote in message news:<abc4d8b8.0406090601.66a05a58@posting.google.com>...
> > I have found a nice code to convert ordinal values to utf (code2utf()
> > <http://in2.php.net/utf8_encode#34214> ). But, couldn't find the
> > reverse of that. The uniord() function found at usernotes
> > <http://in.php.net/ord#42778> is bit close, but it uses mbstring.
> > Anyone have any idea to do it without using mbstring? TIA
> </Previous Post>
>
> So, I'm here looking for how to get ordinal value for given Unicode
> character. For example, if we pass integer 'A' to the function, it has
> to return 65; if we pass 'க' to the function it has to return
> 0x0B95; and so on.
>
> Any help on logic or concept is highly appreciated. TIA

Never mind. After scratching my head last night, I found that it is
nothing but a UTF-8 to Unicode conversion, and I found a nice algorithm
at <http://www1.tip.nl/~t876506/utf8tbl.html>. Based on that, I have
written some code that seems to work fine.

<?php
// http://www1.tip.nl/~t876506/utf8tbl.html
// Ordinal value of the first Unicode character in a UTF-8 string.
// Logic: UTF-8 to Unicode conversion. z is the lead byte, and
// y, x, w, v, u are the bytes that follow it; ud is the decimal
// Unicode value.
function uniord($c)
{
    if ($c === '') {
        return false; // nothing to convert
    }
    $z = ord($c[0]); // lead byte
    if ($z <= 127) {
        // 0 - 127: 1 byte z; ud = z
        return $z;
    }
    if ($z >= 192 && $z <= 223) {
        // 192 - 223: 2 bytes z y; ud = (z-192)*64 + (y-128)
        return ($z - 192) * 64 + (ord($c[1]) - 128);
    }
    if ($z >= 224 && $z <= 239) {
        // 224 - 239: 3 bytes z y x;
        // ud = (z-224)*4096 + (y-128)*64 + (x-128)
        return ($z - 224) * 4096 + (ord($c[1]) - 128) * 64
            + (ord($c[2]) - 128);
    }
    if ($z >= 240 && $z <= 247) {
        // 240 - 247: 4 bytes z y x w;
        // ud = (z-240)*262144 + (y-128)*4096 + (x-128)*64 + (w-128)
        return ($z - 240) * 262144 + (ord($c[1]) - 128) * 4096
            + (ord($c[2]) - 128) * 64 + (ord($c[3]) - 128);
    }
    // The 5 and 6 byte forms below come from the original UTF-8
    // definition; RFC 3629 restricts UTF-8 to at most 4 bytes.
    if ($z >= 248 && $z <= 251) {
        // 248 - 251: 5 bytes z y x w v;
        // ud = (z-248)*16777216 + (y-128)*262144 + (x-128)*4096
        //    + (w-128)*64 + (v-128)
        return ($z - 248) * 16777216 + (ord($c[1]) - 128) * 262144
            + (ord($c[2]) - 128) * 4096 + (ord($c[3]) - 128) * 64
            + (ord($c[4]) - 128);
    }
    if ($z >= 252 && $z <= 253) {
        // 252 or 253: 6 bytes z y x w v u;
        // ud = (z-252)*1073741824 + (y-128)*16777216 + (x-128)*262144
        //    + (w-128)*4096 + (v-128)*64 + (u-128)
        return ($z - 252) * 1073741824 + (ord($c[1]) - 128) * 16777216
            + (ord($c[2]) - 128) * 262144 + (ord($c[3]) - 128) * 4096
            + (ord($c[4]) - 128) * 64 + (ord($c[5]) - 128);
    }
    // 128 - 191 is a continuation byte and cannot start a character;
    // 254 and 255 never appear in UTF-8. Something is wrong.
    return false;
}
?>
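
As a worked check of the arithmetic: U+0B95 is stored as the three bytes 224, 174, 149, so ud = (224-224)*4096 + (174-128)*64 + (149-128) = 0 + 2944 + 21 = 2965 = 0x0B95. The same conversion can also be written with bit masks and shifts instead of subtraction and multiplication. The following is only a sketch under some assumptions: the name utf8_first_codepoint() is made up for this example, it handles only the 1 to 4 byte forms, and it checks that the following bytes look like continuation bytes but does not reject overlong encodings or surrogates.

```php
<?php
// Hypothetical helper (not from the table linked above): decode the first
// UTF-8 character of $s into its code point, or return false on bad input.
function utf8_first_codepoint(string $s)
{
    if ($s === '') {
        return false;
    }
    $lead = ord($s[0]);
    if ($lead < 0x80) {
        return $lead;                       // 0xxxxxxx: plain ASCII
    } elseif (($lead & 0xE0) === 0xC0) {
        $extra = 1; $cp = $lead & 0x1F;     // 110xxxxx: 2-byte form
    } elseif (($lead & 0xF0) === 0xE0) {
        $extra = 2; $cp = $lead & 0x0F;     // 1110xxxx: 3-byte form
    } elseif (($lead & 0xF8) === 0xF0) {
        $extra = 3; $cp = $lead & 0x07;     // 11110xxx: 4-byte form
    } else {
        return false;                       // continuation byte or invalid lead
    }
    if (strlen($s) < $extra + 1) {
        return false;                       // sequence is cut short
    }
    for ($i = 1; $i <= $extra; $i++) {
        $byte = ord($s[$i]);
        if (($byte & 0xC0) !== 0x80) {
            return false;                   // not a 10xxxxxx continuation byte
        }
        $cp = ($cp << 6) | ($byte & 0x3F);  // append the 6 payload bits
    }
    return $cp;
}

printf("%d\n", utf8_first_codepoint('A'));                // 65
printf("0x%04X\n", utf8_first_codepoint("\xE0\xAE\x95")); // 0x0B95
var_dump(utf8_first_codepoint("\x95"));                   // bool(false)
```

Subtracting 128 from a continuation byte and masking it with 0x3F give the same result for well-formed input; the mask version just makes it easier to test the top bits first.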

--
| Just another PHP saint |
Email: rrjanbiah-at-Y!com
 
