
No mbstring function for finding suitable encoding.

Hello,

I am building a site that uses UTF-8 encoding, and I send
mail from PHP. Where possible I want to send the mail in
ISO-8859-1 or KOI8-R (because some mailers still have
problems with UTF-8), and fall back to UTF-8 only when
that is not possible.

As far as I can see in the documentation, there is no function
that checks whether a UTF-8 string can be converted to another
encoding without loss of characters.

The functions mb_check_encoding and mb_detect_encoding
serve a different purpose.

Or am I missing something?

So, I would like a function:

bool mb_encoding_possible(string str, string to_encoding, string from_encoding)

which returns TRUE if mb_convert_encoding can convert the string without loss.

Regards,

Lucas Kruijswijk
Dec 11 '06 #1
2 Replies


"Lucas Kruijswijk" <L.************@inter.nl.net> wrote:
> So, I want a function:
>
> bool mb_encoding_possible(string str, string to_encoding, string
> from_encoding)
>
> which returns TRUE if mb_convert_encoding is possible, without loss.

/*. boolean .*/ function mb_encoding_possible(
    /*. string .*/ $original,
    /*. string .*/ $to_encoding,
    /*. string .*/ $from_encoding)
/*
    Returns TRUE if $original can be converted from $from_encoding
    to $to_encoding without losing any character.

    Very inefficient. It would be useful to return the $converted text
    back to the caller, so that it does not have to repeat the conversion.
*/
{
    // Convert to the target encoding, then back to the source encoding.
    $converted = mb_convert_encoding($original, $to_encoding, $from_encoding);
    $original2 = mb_convert_encoding($converted, $from_encoding, $to_encoding);
    // Any character that was lost or substituted breaks the round trip.
    return $original2 === $original;
}

Some optimizations require knowledge of the specific encodings used.
mb_substitute_character() might be useful to mark characters that cannot
be converted. For example, we can choose an ASCII control character
(most encodings do not use them in regular text) or any other character
that does not appear in the original string, and use it to mark
characters that cannot be converted. With only one conversion, the
presence of this character in the resulting text would mean we lost something.
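
The single-conversion idea above could look roughly like the sketch below. The helper name convert_if_lossless and the choice of 0x1A (ASCII SUB) as the marker are assumptions made for illustration, and the byte-level checks assume a UTF-8 source and a single-byte, ASCII-compatible target such as ISO-8859-1 or KOI8-R.

```php
<?php
// Hypothetical helper, not part of mbstring. Assumes a UTF-8 source and a
// single-byte, ASCII-compatible target (e.g. ISO-8859-1 or KOI8-R).
function convert_if_lossless(string $original, string $to_encoding, string $from_encoding = 'UTF-8'): ?string
{
    $marker = 0x1A; // ASCII SUB, assumed not to occur in normal mail text

    // If the marker is already in the input, finding it in the output
    // would prove nothing, so give up rather than guess.
    if (strpos($original, chr($marker)) !== false) {
        return null;
    }

    $previous = mb_substitute_character();  // remember the current setting
    mb_substitute_character($marker);       // mark unconvertible characters
    $converted = mb_convert_encoding($original, $to_encoding, $from_encoding);
    mb_substitute_character($previous);     // restore the setting

    if (!is_string($converted) || strpos($converted, chr($marker)) !== false) {
        return null; // conversion failed or at least one character was lost
    }
    return $converted;
}

// Usage: try the legacy encodings first, otherwise stay with UTF-8.
$text = "caf\u{E9}"; // "café" written with an escape so the file encoding does not matter
$body = $text;
$charset = 'UTF-8';
foreach (['ISO-8859-1', 'KOI8-R'] as $candidate) {
    $encoded = convert_if_lossless($text, $candidate);
    if ($encoded !== null) {
        $body = $encoded;
        $charset = $candidate;
        break;
    }
}
```

Returning the converted string (or NULL) instead of a boolean also means the caller does not have to repeat the conversion.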
Regards,
___
/_|_\ Umberto Salsi
\/_\/ www.icosaedro.it

Dec 11 '06 #2

Or add an additional parameter to 'mb_convert_encoding':

substitute: String that replaces any character that cannot be converted.
            NULL: the function returns NULL if there is at least one
            character that cannot be converted.

Default is the empty string.
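
Until such a parameter exists, the proposal can be approximated in userland. The following is only a sketch: the wrapper name convert_with_substitute is hypothetical, it accepts only a single UTF-8 character as the substitute (mb_substitute_character takes one code point), and its NULL mode detects loss by comparing character counts, which assumes valid input and a target where each character maps one-to-one.

```php
<?php
// Hypothetical userland wrapper; mb_convert_encoding has no such parameter.
function convert_with_substitute(
    string $str,
    string $to_encoding,
    string $from_encoding,
    ?string $substitute = ''
): ?string {
    $previous = mb_substitute_character(); // remember the current setting

    if ($substitute === null || $substitute === '') {
        // Drop unconvertible characters entirely.
        mb_substitute_character('none');
    } else {
        // Use the first character of $substitute (given in UTF-8) as replacement.
        mb_substitute_character(mb_ord($substitute, 'UTF-8'));
    }

    $converted = mb_convert_encoding($str, $to_encoding, $from_encoding);
    mb_substitute_character($previous);    // restore the setting

    if (!is_string($converted)) {
        return null; // conversion itself failed
    }
    if ($substitute === null
        && mb_strlen($converted, $to_encoding) !== mb_strlen($str, $from_encoding)) {
        return null; // NULL mode: at least one character was dropped
    }
    return $converted;
}
```

For example, convert_with_substitute($text, 'ISO-8859-1', 'UTF-8', null) would give either the converted string or NULL, which is the check asked for in the original question.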

Regards,

Lucas

Dec 11 '06 #3
