Non-ascii email subject and header encoding

Hi all,

I need to mail() emails with user input that contains non-ASCII
(umlauts, accents) and non-Latin (Cyrillic) characters in the
"Subject:" and "From:" headers. I understand that they are typically
encoded in UTF-8 like this:

=?UTF-8?B?w5Z0emkg0J/RgNC40LLQtdGC?=

but I cannot find a PHP function to encode the input string in this
way. utf8_encode gives me garbled char soup, so what do you use?

Thanks.

Mar 27 '07 #1
Ah well, bad thinking on my part.

Instead of utf8_encode I need base64_encode, of course (as the "...?B?..."
in the code tells me). So this:

$from = "From: =?UTF-8?B?" . base64_encode($_POST['name']) . "?= <" .
    $_POST['email'] . ">\n";

produces a correctly encoded header, provided the form data arrives as
UTF-8. Same for subject.
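
A slightly more defensive version of the line above might look like the following. It is only a sketch: the helper names encode_header_word and build_from_header are made up for illustration, the literal values stand in for $_POST['name'] and $_POST['email'], and the input is assumed to be UTF-8 already. It strips CR/LF from the name and validates the address, because raw form input placed in a header could otherwise inject extra header lines.

```php
<?php
// Hypothetical helper (not from the thread): wrap a UTF-8 string in a
// single RFC 2047 "B" (base64) encoded-word.
function encode_header_word(string $text): string
{
    // Strip CR and LF so user input cannot start a new header line.
    $text = str_replace(["\r", "\n"], '', $text);
    return '=?UTF-8?B?' . base64_encode($text) . '?=';
}

// Hypothetical helper: build a From header from untrusted form fields.
function build_from_header(string $name, string $email): string
{
    // Reject anything that is not a syntactically valid address.
    if (filter_var($email, FILTER_VALIDATE_EMAIL) === false) {
        throw new InvalidArgumentException('Invalid email address');
    }
    return 'From: ' . encode_header_word($name) . ' <' . $email . '>';
}

// Placeholder values standing in for $_POST['name'] and $_POST['email'].
// Assumes this file is saved as UTF-8.
$from = build_from_header('Ötzi Привет', 'otzi@example.com');
echo $from, "\n";
```

RFC 2047 limits each encoded-word to 75 characters, so a long name would have to be split into several encoded-words; this sketch does not attempt that.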

Hope it helps someone.

Mar 27 '07 #2
Ciuin wrote:
> Ah well, bad thinking on my part.
Was it?
Not really.

Encoding/charsets/headers/content-type/UTF/Unicode/etc. ALWAYS give me a
headache. :-/
Confusing stuff, especially when you have to consider a whole range of
receiving clients (different browsers, email clients, etc.).

Regards,
Erwin Moller

Mar 27 '07 #3
"Ciuin" <ci***@gmx.de> wrote:
> Instead of utf8_encode I need base64_encode, of course (as the "...?B?..."
> in the code tells me).

For completeness, allow me to point out that you can also use
quoted-printable encoding here (you'd use =?utf-8?Q? instead of ...?B?).
Quoted-printable encoding has the "advantage" that most ASCII characters
survive unchanged, so if there are ASCII words, they can be read even in
their encoded form.

On the other hand, strings with many non-ASCII characters grow more in
quoted-printable than in base64: every non-ASCII byte becomes three
characters (=XX), whereas base64 spends four characters on every three
bytes. Plus, there is no "quoted_printable_encode" in the standard library
before PHP 5.3.0, although sources are available.
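
To make the trade-off concrete, here is a rough sketch of the "Q" variant next to the "B" variant. The function name q_encode_word is hypothetical, the sample name is a placeholder, and the input is assumed to be UTF-8. The sketch is deliberately conservative: it keeps only letters, digits and a few punctuation characters literal, writes spaces as underscores, and hex-encodes every other byte (including CR and LF).

```php
<?php
// Hypothetical helper (not from the thread): encode a UTF-8 string as a
// single RFC 2047 "Q" encoded-word.
function q_encode_word(string $text): string
{
    // No /u flag, so the pattern matches byte by byte: each byte outside
    // the safe set becomes "=XX" with uppercase hex digits.
    $encoded = preg_replace_callback(
        '/[^A-Za-z0-9!*+\-\/ ]/',
        function (array $m): string {
            return sprintf('=%02X', ord($m[0]));
        },
        $text
    );
    // Inside an encoded-word a space may be written as an underscore.
    return '=?UTF-8?Q?' . str_replace(' ', '_', $encoded) . '?=';
}

// Placeholder name; assumes this file is saved as UTF-8.
$name = 'Ötzi Привет';

$q = q_encode_word($name);
$b = '=?UTF-8?B?' . base64_encode($name) . '?=';

// The ASCII part "tzi" stays readable in the Q form, but the Q form is
// longer because most of this string is non-ASCII.
printf("Q (%d chars): %s\n", strlen($q), $q);
printf("B (%d chars): %s\n", strlen($b), $b);
```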
--
Tim Roberts, ti**@probo.com
Providenza & Boekelheide, Inc.
Mar 28 '07 #4
Hello,

on 03/27/2007 06:55 AM Ciuin said the following:
> =?UTF-8?B?w5Z0emkg0J/RgNC40LLQtdGC?=
>
> but I cannot find a PHP function to encode the input string in this
> way. utf8_encode gives me garbled char soup, so what do you use?

That is the MIME encoded-word syntax for message headers, here with the
"B" (base64) encoding. The alternative "Q" encoding is not quoted-printable,
but it is similar. There is a whole RFC on that subject (RFC 2047).
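
If the iconv extension is available, PHP can also build such a header itself with iconv_mime_encode(). The snippet below is a sketch under that assumption; the subject text is a placeholder for user input, and mb_encode_mimeheader() from the mbstring extension would be another option.

```php
<?php
// Assumes the iconv extension is loaded and this file is saved as UTF-8.
$options = [
    'scheme'           => 'B',       // base64; 'Q' selects the other scheme
    'input-charset'    => 'UTF-8',
    'output-charset'   => 'UTF-8',
    'line-length'      => 76,
    'line-break-chars' => "\r\n",
];

// Placeholder subject standing in for user input.
$subject = iconv_mime_encode('Subject', 'Ötzi Привет', $options);
if ($subject === false) {
    throw new RuntimeException('Could not encode the Subject header');
}
echo $subject, "\n";

// Decoding the field again is a quick sanity check.
echo iconv_mime_decode($subject, 0, 'UTF-8'), "\n";
```

Note that iconv_mime_encode() returns the whole header line including the "Subject: " prefix, so the field name would need to be removed before passing the value as the subject argument of mail().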

You may want to try the MIME message class, which can be used to compose
and send messages and supports headers with non-ASCII characters encoded
as UTF-8 or any other encoding.

Take a look at the test_multibyte_message.php example script. It
explains how to send messages in Japanese with encoding ISO-2022-JP, but
you can change that to UTF-8 to support characters of all languages.

http://www.phpclasses.org/mimemessage
--

Regards,
Manuel Lemos

Metastorage - Data object relational mapping layer generator
http://www.metastorage.net/

PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/
Mar 29 '07 #5
