Character conversion

Hi all,

In a string from a form post, I sometimes get characters like &#337;
(õ) as raw data (not masked). ("ő", if your mail program
supports this.)

It appears that Mozilla automatically converts the input to &#337;, so I
end up with &#337; in my database (MSIE does not convert it).

My problem is that I need to export this data to a CSV file, where
&#337; is no good.

I've tried to use preg_replace_callback() with chr() to convert the
values back when exporting, but I get inconsistent results for this one
character (others seem to work):

------------ TEST ------------

<?php
function replaceChars( $matches )
{
return chr( $matches[1] );
}
?>
<form method="POST" action="#">
<input type="text" name="test">
</form>

<pre>
raw: <?=$_POST['test'];?>

htmlentities(): <?=htmlentities( $_POST['test'] );?>

chr: <?=preg_replace_callback( "/&#([0-9]+);/",
'replaceChars',
$_POST['test'] );
?>
&amp;#337;: &#337;

chr(337): <?=chr(337);?>
</pre>

------------ /TEST ------------

Am I making a dumb mistake? Is the value in &#(int); not the value for
the chr()-argument?

Thanks,
Rudi
Jul 17 '05 #1
6 Replies


Rudolf Horbas wrote:
> <form method="POST" action="#">


Try this:

<form method="POST" action="#" accept-charset="iso-8859-1">
More info: http://www.w3.org/TR/html4/interact/forms.html

--
USENET would be a better place if everybody read: : mail address :
http://www.catb.org/~esr/faqs/smart-questions.html : is valid for :
http://www.netmeister.org/news/learn2quote2.html : "text/plain" :
http://www.expita.com/nomime.html : to 10K bytes :
Jul 17 '05 #2

Pedro Graca wrote:
> Try this:
>
> <form method="POST" action="#" accept-charset="iso-8859-1">


Thanks Pedro -- this fixed my problem of the data getting in.

For the export part (and out of curiosity):

Why isn't this working as expected?:

<?php
function replaceChars( $matches )
{
return chr( $matches[1] );
}
echo preg_replace_callback( "/&#([0-9]+);/",
'replaceChars',
"&#337;" );
?>

.... which of course sums up in <?=chr(337)?>

<?=chr(123)?> == {

but

<?=chr(337)?> != ő

Rudi
Jul 17 '05 #3

Rudolf Horbas wrote:
> <?=chr(123)?> == {
>
> but
>
> <?=chr(337)?> != ő

chr(x) is the same as chr( x % 256 )

so chr(123) is chr(123) :)

but chr(337) is chr(81) :(
Maybe iconv functions would help you
http://www.php.net/iconv
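
The wrap-around described above can be checked with a minimal sketch like the one below. It assumes a PHP version where chr() still accepts out-of-range integers silently; newer releases may emit a deprecation notice for that call.

```php
<?php
// chr() only produces single bytes, so values outside 0..255 wrap around.
$wrapped = chr(337);   // 337 % 256 === 81
$direct  = chr(81);    // byte 0x51, the letter "Q"

var_dump($wrapped === $direct); // bool(true)
echo $wrapped, "\n";            // prints "Q", not "ő"
```

So the callback in the test script hands back "Q" for &#337;, which is why only code points above 255 come out wrong.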
Jul 17 '05 #4

Rudolf Horbas <rh*****@gmx.net> wrote:
>> <form method="POST" action="#" accept-charset="iso-8859-1">
> Thanks Pedro -- this fixed my problem of the data getting in.

In what character set are the pages being served? If the encoding sent by
the server is UTF-8, a client's response should be UTF-8.

> ... which of course sums up in <?=chr(337)?>
>
> <?=chr(123)?> == {
>
> but
>
> <?=chr(337)?> != ő


http://nl2.php.net/chr:
<q>
Description
string chr ( int ascii)

Returns a one-character string containing the character specified by
ascii.
</q>

ő is the Unicode character at code point 337 (U+0151). The first 256
characters of Unicode are compatible with ISO-8859-1 (which includes
ASCII), and chr() only covers that single-byte range.
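
If the export may be written as UTF-8, one way to turn the numeric references back into characters without chr() is PHP's html_entity_decode() with an explicit target charset. This is only a sketch: decodeNumericEntities() is a hypothetical helper name, and the call also decodes named entities such as &amp;, which may or may not be wanted.

```php
<?php
// Hypothetical helper: decode references like &#337; into UTF-8 bytes.
// Note: html_entity_decode() also decodes named entities (&amp;, &quot;, ...).
function decodeNumericEntities(string $text): string
{
    return html_entity_decode($text, ENT_QUOTES | ENT_HTML401, 'UTF-8');
}

$decoded = decodeNumericEntities('Erd&#337;s');

// U+0151 becomes the two UTF-8 bytes C5 91.
echo bin2hex($decoded), "\n"; // prints "457264c59173"
```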

--

Daniel Tryba

Jul 17 '05 #5

Pedro Graca wrote:
> chr(x) is the same as chr( x % 256 )
>
> so chr(123) is chr(123) :)
>
> but chr(337) is chr(81) :(

That was the dumb mistake I was afraid I was making :-)

I get it: chr() does not serve my purpose here ...

> Maybe iconv functions would help you
> http://www.php.net/iconv


No, that's _way_ too much hassle (to install on our server).

I only get a couple of Hungarians who sign up for a congress, and only a
few of them bring in these special chars (ő, Ő, ű, Ű) from the
ISO 8859-2 charset:
(http://en.wikipedia.org/wiki/Hungari...Writing_system)

I'm doing a dumb str_replace() on them to õ, Õ, û, Û, which are very
similar (mentioned in the Wikipedia article).
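
A rough sketch of that replacement, assuming the stored values are the numeric references &#337;, &#336;, &#369;, &#368; and that the export is ISO-8859-1, where the look-alikes õ, Õ, û, Û are the bytes 0xF5, 0xD5, 0xFB, 0xDB. The function name toLatin1Lookalikes() is made up for illustration, and strtr() stands in for a chain of str_replace() calls.

```php
<?php
// Hypothetical helper: swap the four Hungarian entities for
// similar-looking ISO-8859-1 letters (an approximation, not a conversion).
function toLatin1Lookalikes(string $text): string
{
    $map = [
        '&#337;' => "\xF5", // ő -> õ
        '&#336;' => "\xD5", // Ő -> Õ
        '&#369;' => "\xFB", // ű -> û
        '&#368;' => "\xDB", // Ű -> Û
    ];
    return strtr($text, $map);
}

// "f" is 0x66, followed by the Latin-1 byte for õ.
echo bin2hex(toLatin1Lookalikes('f&#337;')), "\n"; // prints "66f5"
```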

Thanks Pedro (and Daniel) -- another lesson learned.

Rudi
Jul 17 '05 #6

Rudolf Horbas <rh*****@gmx.net> wrote:
>> Maybe iconv functions would help you
>> http://www.php.net/iconv
>
> No, that's _way_ too much hassle (to install on our server).


Maybe you should take a look at PHP's multibyte string support,
http://nl2.php.net/mbstring

After a colleague found this beauty, we decided to use it to switch the
whole website to UTF-8. This solved some headaches for the application
(yet another webmail thingy) concerning encodings of the content. For
example, you can "mix" multiple character sets together, so we can now
show some nice Korean spam mail in the correct font and at the same time
display an ad for spam filters using the euro symbol :)
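
As a small illustration of what mbstring can do here (a sketch only, assuming the extension is loaded and that the legacy data is ISO-8859-2), mb_convert_encoding() turns the single-byte Hungarian letters into UTF-8:

```php
<?php
// Assumption: $latin2 holds ISO-8859-2 bytes; 0xF5 is ő and 0xFB is ű there.
$latin2 = "\xF5\xFB";

// Convert from ISO-8859-2 to UTF-8 (argument order: string, to, from).
$utf8 = mb_convert_encoding($latin2, 'UTF-8', 'ISO-8859-2');

echo bin2hex($utf8), "\n"; // prints "c591c5b1" (U+0151, U+0171)
```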

--

Daniel Tryba

Jul 17 '05 #7
