
mail() encoding problems

Hi all,

Another problem that has been bugging me for a while, and which I have swept
under the rug for too long, is a mail encoding problem at my (shared)
web host.

On different occasions, when I send the exact same mail (same e-mail
address, same name, same body content) through my site's contact form, it
produces the well-known strange characters instead of the intended
diacritical characters. The problem also shows up in the contact form of a
client of mine, whose site is hosted on the same web host. I've contacted my
web host a few times already, but they don't know what could be causing the
problem either.

This is the setup I'm using on my web host:

The host runs FastCGI PHP 5 with customizable php.ini files per
domain/website on Slackware Linux.

All my PHP files are saved as ANSI (Windows-1252). All my HTML output has
this meta tag:

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />

My guess is that the difference between these two encodings shouldn't be a
problem. But perhaps I'm wrong?
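
The guess can be checked byte by byte. Windows-1252 and ISO-8859-1 agree for the accented letters used in the test mail and differ only in the range 0x80 to 0x9F. The following is a minimal sketch, assuming the mbstring extension is available; `utf8_hex_of` is a hypothetical helper name, not part of the Mailer class.

```php
<?php
// Hypothetical helper: interpret one byte under a given charset and
// return the hex of its UTF-8 form, so both readings can be compared.
// Assumption: the mbstring extension is loaded.
function utf8_hex_of(string $byte, string $sourceCharset): string
{
    return bin2hex(mb_convert_encoding($byte, 'UTF-8', $sourceCharset));
}

// 0xEB, 0xE8, 0xEF are the letters from the test mail; 0x80 is the euro
// sign in Windows-1252 but a control character in ISO-8859-1.
foreach (["\xEB", "\xE8", "\xEF", "\x80"] as $byte) {
    printf(
        "byte 0x%s  windows-1252 -> %s  iso-8859-1 -> %s\n",
        strtoupper(bin2hex($byte)),
        utf8_hex_of($byte, 'Windows-1252'),
        utf8_hex_of($byte, 'ISO-8859-1')
    );
}
```

If the two columns match for every character a form can contain, the mismatch between file encoding and declared charset is unlikely to be the cause.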

And my Mailer class uses the following code to send mail:

public function send()
{
    // Only attempt to send when no errors were recorded earlier.
    if ( is_null( $this->getErrors() ) )
    {
        $cc  = implode( ', ', $this->cc );
        $bcc = implode( ', ', $this->bcc );
        $to  = implode( ', ', $this->to );

        // LF is not defined in this snippet; presumably a line-ending constant.
        $headers = 'From: ' . $this->from . LF;
        if ( !empty( $cc ) )  $headers .= 'Cc: ' . $cc . LF;
        if ( !empty( $bcc ) ) $headers .= 'Bcc: ' . $bcc . LF;
        $headers .= 'MIME-Version: 1.0' . LF;
        $headers .= 'X-Mailer: ' . $this->mailerName . LF;
        $headers .= 'Content-Type: text/plain; charset=iso-8859-1' . LF;
        $headers .= 'Content-Transfer-Encoding: 8bit' . LF;

        $subject = $this->subject;
        $body    = $this->body;

        if ( mail( $to, $subject, $body, $headers ) )
        {
            return true;
        }

        // mail() refused the message: record where it happened.
        $this->setError( 'general', 'error', basename( __FILE__, '.php' ), __LINE__ );
        return false;
    }
    else
    {
        return false;
    }
}

Sometimes, when I run into the problem, I change the following lines in my
send() function to UTF-8 encoding:

$headers .= 'Content-Type: text/plain; charset=utf-8' . LF;

$body = utf8_encode( $this->body );

Then it seems to work for a while. But then, all of a sudden, it shows the
same problem again, even though I send the same test mail.

The body content I send is:

test ëèï

Which shows up as:

test Ã«Ã¨Ã¯

Does anybody have any idea what might be causing this weird problem?

Thanks
Aug 19 '07 #1
10 Replies


amygdala wrote:
> Hi all,
> <snip>
>
> The body content I send is:
>
> test ëèï
>
> Which shows up as:
>
> test Ã«Ã¨Ã¯

Alright, I've done another test. I sent the exact same mail twice in a row
(only seconds apart). One ends up as 'weird', the other normal.

Using webmail (RoundCube) at my web host, I am able to view the whole
source of the mail. I saved both as text files and compared them with
UltraEdit's compare-file function.

No differences (apart from the usual message IDs, etc.). But the strange
part is that UltraEdit's UltraCompare determines one mail source to be
UTF-8 encoded, and the other to be ANSI encoded.

I am really stunned. What could be going on here?
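
An editor's encoding detection is a guess, so it can help to classify the saved bytes directly. This is a minimal sketch, assuming the mbstring extension; `classify_bytes` is a hypothetical helper and the file names in the comment are placeholders.

```php
<?php
// Hypothetical helper: report how the raw bytes of a saved message look.
// Assumption: the mbstring extension is loaded.
function classify_bytes(string $bytes): string
{
    // No byte above 0x7F: plain ASCII, identical in every charset involved.
    if (!preg_match('/[\x80-\xFF]/', $bytes)) {
        return 'ascii';
    }
    // High bytes present: either a valid UTF-8 sequence or a single-byte
    // encoding such as ISO-8859-1 or Windows-1252.
    return mb_check_encoding($bytes, 'UTF-8') ? 'utf-8' : '8-bit, not valid UTF-8';
}

// Usage with the two saved sources (placeholder file names):
// echo classify_bytes(file_get_contents('mail-good.txt')), "\n";
// echo classify_bytes(file_get_contents('mail-bad.txt')), "\n";
```

If the two bodies really differ at the byte level, a visual compare in a tool that normalizes encodings before diffing could report "no differences" even though the bytes are not the same.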
Aug 19 '07 #2

Hello,

on 08/19/2007 11:16 AM amygdala said the following:
> Sometimes, when I run into the problem, I change the following lines in my
> send() function to UTF-8 encoding:
>
> $headers .= 'Content-Type: text/plain; charset=utf-8' . LF;
>
> $body = utf8_encode( $this->body );
>
> Then it seems to work for a while. But then, all of a sudden, it shows the
> same problem again, even though I send the same test mail.
>
> The body content I send is:
>
> test ëèï
>
> Which shows up as:
>
> test Ã«Ã¨Ã¯
>
> Does anybody have any idea what might be causing this weird problem?

Those are the characters you typed, encoded as UTF-8. That seems correct
but pointless. If you have only Windows-1252 characters, there is no
need to convert them to UTF-8. It is not wrong, but it is useless.
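
The effect can be reproduced in a few lines. In UTF-8 each of these accented letters takes two bytes, and a reader that assumes ISO-8859-1 shows each of those bytes as its own character (ë becomes Ã«). A minimal sketch, assuming the mbstring extension; it uses mb_convert_encoding() in place of utf8_encode(), which does the same ISO-8859-1 to UTF-8 conversion.

```php
<?php
// "test ëèï" as ISO-8859-1 bytes: one byte per accented letter.
$latin1 = "test \xEB\xE8\xEF";

// The same text as UTF-8: two bytes per accented letter.
$utf8 = mb_convert_encoding($latin1, 'UTF-8', 'ISO-8859-1');

echo bin2hex($latin1), "\n"; // 7465737420ebe8ef
echo bin2hex($utf8), "\n";   // 7465737420c3abc3a8c3af

// Simulate a reader that wrongly treats the UTF-8 bytes as ISO-8859-1:
// every byte becomes a separate character, which is the garbled form.
$misread = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');
echo $misread, "\n"; // on a UTF-8 terminal: test Ã«Ã¨Ã¯
```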

The only thing really wrong is that you should not send 8-bit encoded
messages, as many mail gateways do not support them. Instead of 8bit you
should use quoted-printable encoding.

You can also have 8-bit characters in the headers, but they must be
encoded with Q-encoding to avoid the same problem as with the message body.
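
For header values, a rough sketch of an RFC 2047 encoded-word is shown here. Note the substitution: it uses the "B" (base64) form rather than the "Q" form mentioned above, because base64 needs no special-casing of spaces, "?" or "_". `encode_header_word` is a hypothetical helper; it assumes a short value, since a single encoded-word may not exceed 75 characters, and it does not fold long values.

```php
<?php
// Hypothetical helper: wrap a header value in an RFC 2047 encoded-word
// when it contains anything other than printable ASCII.
// Assumption: $text is already in $charset and is short enough to fit
// in one encoded-word (75 characters including the delimiters).
function encode_header_word(string $text, string $charset = 'ISO-8859-1'): string
{
    if (!preg_match('/[^\x20-\x7E]/', $text)) {
        return $text; // printable ASCII only: send as-is
    }
    return '=?' . $charset . '?B?' . base64_encode($text) . '?=';
}

// Example: a display name with an accented letter (ISO-8859-1 bytes).
echo 'From: ', encode_header_word("Ren\xE9e"), " <renee@example.com>\n";
```

Only the display name or subject text is encoded this way; the address part in angle brackets stays plain ASCII.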

This is a bit complicated to encode by hand. I use this MIME message
composing and sending class to take care of all that for me.

http://www.phpclasses.org/mimemessage
--

Regards,
Manuel Lemos

Metastorage - Data object relational mapping layer generator
http://www.metastorage.net/

PHP Classes - Free ready to use OOP components written in PHP
http://www.phpclasses.org/
Aug 21 '07 #3

"Manuel Lemos" <ml****@acm.org> wrote in message
news:fa**********@aioe.org...
> Hello,

Hi Manuel,

Thanks for the response.

<snip>
> The only thing really wrong is that you should not send 8-bit encoded
> messages, as many mail gateways do not support them. Instead of 8bit you
> should use quoted-printable encoding.

Alright, I didn't know this. That makes sense, because, as I said, even
without my encoding it as UTF-8, the problem occurred randomly. Perhaps the
apparently random appearance of strange characters and/or missing
diacritical characters could be explained by the fact that subsequent mail
messages can get sent through different mail gateways to their final
destination, just like TCP packets? Or is that not correct? It's not very
important for me to get this answered, but it would help me get a better
understanding of things. ;-)

> You can also have 8-bit characters in the headers, but they must be
> encoded with Q-encoding to avoid the same problem as with the message body.

I probably won't need that at this point. But that's good to know, too.

> This is a bit complicated to encode by hand. I use this MIME message
> composing and sending class to take care of all that for me.
>
> http://www.phpclasses.org/mimemessage

Although I only had a quick glance at your class, and it probably does the
job well, it looks like a bit of overkill for my purposes. So in conclusion,
is it fair to say that I only need to change:

$headers .= 'Content-Transfer-Encoding: 8bit' . LF;

to

$headers .= 'Content-Transfer-Encoding: quoted-printable' . LF;

and leave

$headers .= 'Content-Type: text/plain; charset=iso-8859-1' . LF;

as is?

Thank you in advance.

Aug 22 '07 #4

"amygdala" <no*****@noreply.com> wrote in message
news:46***********************@news.kpnplanet.nl...
>
> <snip>
> Although I only had a quick glance at your class, and it probably does the
> job well, it looks like a bit of overkill for my purposes. So in
> conclusion, is it fair to say that I only need to change:
>
> $headers .= 'Content-Transfer-Encoding: 8bit' . LF;
>
> to
>
> $headers .= 'Content-Transfer-Encoding: quoted-printable' . LF;
>
> and leave
>
> $headers .= 'Content-Type: text/plain; charset=iso-8859-1' . LF;
>
> as is?

To answer my own question: it doesn't suffice. Without charset=utf-8, my
test mail (sometimes) still shows up as

test Ã«Ã¨Ã¯
Aug 22 '07 #5

Hello,

on 08/22/2007 04:00 PM amygdala said the following:
> <snip>
>> The only thing really wrong is that you should not send 8-bit encoded
>> messages, as many mail gateways do not support them. Instead of 8bit you
>> should use quoted-printable encoding.
>
> Alright, I didn't know this. That makes sense, because, as I said, even
> without my encoding it as UTF-8, the problem occurred randomly. Perhaps the
> apparently random appearance of strange characters and/or missing
> diacritical characters could be explained by the fact that subsequent mail
> messages can get sent through different mail gateways to their final
> destination, just like TCP packets? Or is that not correct? It's not very
> important for me to get this answered, but it would help me get a better
> understanding of things. ;-)

No, those characters appear when you transform your text to UTF-8. UTF-8
still uses 8-bit bytes, and more than one of them for each accented
character. If you read the mail message that is sent, you see those
characters because whatever console or text display program you are using
does not decode UTF-8 and show the correct characters.

>> You can also have 8-bit characters in the headers, but they must be
>> encoded with Q-encoding to avoid the same problem as with the message body.
>
> I probably won't need that at this point. But that's good to know, too.
>
>> This is a bit complicated to encode by hand. I use this MIME message
>> composing and sending class to take care of all that for me.
>>
>> http://www.phpclasses.org/mimemessage
>
> Although I only had a quick glance at your class, and it probably does the
> job well, it looks like a bit of overkill for my purposes. So in conclusion,
> is it fair to say that I only need to change:
>
> $headers .= 'Content-Transfer-Encoding: 8bit' . LF;
>
> to
>
> $headers .= 'Content-Transfer-Encoding: quoted-printable' . LF;
>
> and leave
>
> $headers .= 'Content-Type: text/plain; charset=iso-8859-1' . LF;
>
> as is?

No, you actually need to encode your body data using quoted-printable. Just
changing the headers does not do it. With quoted-printable, 8-bit and
non-printable characters are transformed into escape sequences of ASCII
(7-bit) characters.
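
A minimal sketch of what "actually encode the body" could look like with PHP's built-in quoted_printable_encode() (available since PHP 5.3). The variable names and the CRLF line ending are assumptions; the thread's own LF constant is not shown, so it is not used here.

```php
<?php
// Assumption: the body is ISO-8859-1 text, as in the original send().
$body = "test \xEB\xE8\xEF";

// Encode the body itself; the header alone does not change any bytes.
$encodedBody = quoted_printable_encode($body);
echo $encodedBody, "\n"; // test =EB=E8=EF

// The header must describe the encoding that was really applied.
$eol      = "\r\n"; // assumption: CRLF between additional headers
$headers  = 'MIME-Version: 1.0' . $eol;
$headers .= 'Content-Type: text/plain; charset=iso-8859-1' . $eol;
$headers .= 'Content-Transfer-Encoding: quoted-printable' . $eol;

// The message would then be sent with the encoded body, for example:
// mail($to, $subject, $encodedBody, $headers);
```

Because the encoded body contains only 7-bit ASCII, a gateway that does not handle 8-bit data has nothing to alter, and the receiving client decodes =EB back to the byte 0xEB under the declared charset.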
--

Regards,
Manuel Lemos
Aug 22 '07 #6

"Manuel Lemos" <ml****@acm.org> wrote in message
news:fa**********@aioe.org...
> Hello,
>
> on 08/22/2007 04:00 PM amygdala said the following:
>> <snip>
>>> The only thing really wrong is that you should not send 8-bit encoded
>>> messages, as many mail gateways do not support them. Instead of 8bit
>>> you should use quoted-printable encoding.
>>
>> Alright, I didn't know this. That makes sense, because, as I said, even
>> without my encoding it as UTF-8, the problem occurred randomly. Perhaps
>> the apparently random appearance of strange characters and/or missing
>> diacritical characters could be explained by the fact that subsequent
>> mail messages can get sent through different mail gateways to their
>> final destination, just like TCP packets? Or is that not correct? It's
>> not very important for me to get this answered, but it would help me get
>> a better understanding of things. ;-)
>
> No, those characters appear when you transform your text to UTF-8. UTF-8
> still uses 8-bit bytes, and more than one of them for each accented
> character. If you read the mail message that is sent, you see those
> characters because whatever console or text display program you are using
> does not decode UTF-8 and show the correct characters.

But what could be causing my characters to randomly be transformed to UTF-8?
Do you have any idea? I don't do it anywhere in my application (except for
the test I described in my first post). I've changed everything back to how
it was, though, and now it still shows

test Ã«Ã¨Ã¯

and sometimes

test ëèï

without me changing anything in the program's code. That is weird, isn't it?

>>> You can also have 8-bit characters in the headers, but they must be
>>> encoded with Q-encoding to avoid the same problem as with the message
>>> body.
>>
>> I probably won't need that at this point. But that's good to know, too.
>>
>>> This is a bit complicated to encode by hand. I use this MIME message
>>> composing and sending class to take care of all that for me.
>>>
>>> http://www.phpclasses.org/mimemessage
>>
>> Although I only had a quick glance at your class, and it probably does
>> the job well, it looks like a bit of overkill for my purposes. So in
>> conclusion, is it fair to say that I only need to change:
>>
>> $headers .= 'Content-Transfer-Encoding: 8bit' . LF;
>>
>> to
>>
>> $headers .= 'Content-Transfer-Encoding: quoted-printable' . LF;
>>
>> and leave
>>
>> $headers .= 'Content-Type: text/plain; charset=iso-8859-1' . LF;
>>
>> as is?
>
> No, you actually need to encode your body data using quoted-printable.
> Just changing the headers does not do it. With quoted-printable, 8-bit and
> non-printable characters are transformed into escape sequences of ASCII
> (7-bit) characters.

Alright, I'll give that a try and report back. Thanks.
Aug 23 '07 #7

Hello,

on 08/23/2007 04:44 AM amygdala said the following:
>>>> The only thing really wrong is that you should not send 8-bit encoded
>>>> messages, as many mail gateways do not support them. Instead of 8bit
>>>> you should use quoted-printable encoding.
>>>
>>> Alright, I didn't know this. That makes sense, because, as I said, even
>>> without my encoding it as UTF-8, the problem occurred randomly. Perhaps
>>> the apparently random appearance of strange characters and/or missing
>>> diacritical characters could be explained by the fact that subsequent
>>> mail messages can get sent through different mail gateways to their
>>> final destination, just like TCP packets? Or is that not correct? It's
>>> not very important for me to get this answered, but it would help me
>>> get a better understanding of things. ;-)
>>
>> No, those characters appear when you transform your text to UTF-8. UTF-8
>> still uses 8-bit bytes, and more than one of them for each accented
>> character. If you read the mail message that is sent, you see those
>> characters because whatever console or text display program you are
>> using does not decode UTF-8 and show the correct characters.
>
> But what could be causing my characters to randomly be transformed to
> UTF-8? Do you have any idea? I don't do it anywhere in my application
> (except for the test I described in my first post). I've changed
> everything back to how it was, though, and now it still shows
>
> test Ã«Ã¨Ã¯
>
> and sometimes
>
> test ëèï
>
> without me changing anything in the program's code. That is weird, isn't
> it?

If you use utf8_encode(), you transform ISO-8859-1 text into UTF-8.

Since all your text is in a single 8-bit encoding, UTF-8 is not useful
for you.
--

Regards,
Manuel Lemos
Aug 23 '07 #8

Manuel Lemos wrote:
> Hello,
> <snip>
>>
>> But what could be causing my characters to randomly be transformed
>> to UTF-8? Do you have any idea? I don't do it anywhere in my
>> application (except for the test I described in my first post). I've
>> changed everything back to how it was, though, and now it still shows
>>
>> test Ã«Ã¨Ã¯
>>
>> and sometimes
>>
>> test ëèï
>>
>> without me changing anything in the program's code. That is weird,
>> isn't it?
>
> If you use utf8_encode(), you transform ISO-8859-1 text into UTF-8.
>
> Since all your text is in a single 8-bit encoding, UTF-8 is not useful
> for you.

I understand this. But the point I am trying to get across, and most likely
I wasn't clear enough about it, is that I don't use UTF-8 encoding anywhere
anymore. I only used it once or twice, to see if that would solve my
problem. To be very clear about it: I removed all utf8_encode() calls, and
my code is therefore back to its usual state. But my problem remains the
same:

Sometimes the unintended

Ã«Ã¨Ã¯

sometimes the intended

ëèï

It seems to happen pretty randomly.

Therefore I am still stunned as to what could be causing this problem. It
looks as if something (maybe some mail gateway?) is transforming my e-mails
to UTF-8.

Do you have any other idea of what might be going on here? Your insights are
very welcome.
Aug 23 '07 #9

amygdala wrote:
> Manuel Lemos wrote:
>> Hello,
>>
>> <snip>
>>>
>>> But what could be causing my characters to randomly be transformed
>>> to UTF-8? Do you have any idea? I don't do it anywhere in my
>>> application (except for the test I described in my first post). I've
>>> changed everything back to how it was, though, and now it still shows
>>>
>>> test Ã«Ã¨Ã¯
>>>
>>> and sometimes
>>>
>>> test ëèï
>>>
>>> without me changing anything in the program's code. That is weird,
>>> isn't it?
>>
>> If you use utf8_encode(), you transform ISO-8859-1 text into UTF-8.
>>
>> Since all your text is in a single 8-bit encoding, UTF-8 is not useful
>> for you.
>
> I understand this. But the point I am trying to get across, and most
> likely I wasn't clear enough about it, is that I don't use UTF-8 encoding
> anywhere anymore. I only used it once or twice, to see if that would
> solve my problem. To be very clear about it: I removed all
> utf8_encode() calls, and my code is therefore back to its usual
> state. But my problem remains the same:
>
> Sometimes the unintended
>
> Ã«Ã¨Ã¯
>
> sometimes the intended
>
> ëèï
>
> It seems to happen pretty randomly.
>
> Therefore I am still stunned as to what could be causing this problem.
> It looks as if something (maybe some mail gateway?) is transforming
> my e-mails to UTF-8.
>
> Do you have any other idea of what might be going on here? Your
> insights are very welcome.

To add to this: the malformed e-mails have an earlier timestamp than the
correct e-mails, even though I sent the malformed e-mail later than the
correct one. This leads me to believe that the incorrect e-mails do indeed
get sent by another mail server, or through a different mail-gateway
'route', if such a thing exists. I'll contact my web host once more and
tell them about this new finding. Perhaps, with this new information, they
will suddenly know what might be causing the problem.
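
One way to test the "different route" idea is to compare the Received: headers of a good and a bad message, since each server that handles a mail adds one. This is a rough sketch, not a full header parser; `received_headers` is a hypothetical helper and the sample message in the usage note is invented.

```php
<?php
// Hypothetical helper: return the Received: header fields of a raw
// message, including folded continuation lines (lines that start with
// a space or tab). Only the header block before the first blank line
// is searched.
function received_headers(string $rawMessage): array
{
    $parts = preg_split('/\r?\n\r?\n/', $rawMessage, 2);
    $headerBlock = $parts[0];

    preg_match_all(
        '/^Received:[^\r\n]*(?:\r?\n[ \t][^\r\n]*)*/mi',
        $headerBlock,
        $matches
    );
    return $matches[0];
}
```

Running it over both saved sources and comparing the host names and timestamps in the results would show whether the two mails passed through the same servers.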

Cheers
Aug 23 '07 #10

Hello,

on 08/23/2007 04:27 PM amygdala said the following:
> I understand this. But the point I am trying to get across, and most likely
> I wasn't clear enough about it, is that I don't use UTF-8 encoding anywhere
> anymore. I only used it once or twice, to see if that would solve my
> problem. To be very clear about it: I removed all utf8_encode() calls, and
> my code is therefore back to its usual state. But my problem remains the
> same:
>
> Sometimes the unintended
>
> Ã«Ã¨Ã¯
>
> sometimes the intended
>
> ëèï
>
> It seems to happen pretty randomly.
>
> Therefore I am still stunned as to what could be causing this problem. It
> looks as if something (maybe some mail gateway?) is transforming my e-mails
> to UTF-8.
>
> Do you have any other idea of what might be going on here? Your insights are
> very welcome.

Could you be taking the text for the message from user-submitted forms?

If so, make sure you set the encoding of the page that displays the forms
to an explicit value. If you do not do that, keep in mind that different
browsers assume different default character encodings. That could
explain why sometimes you get the encoding right and other times you don't.
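
As a sketch of that idea: declare the charset in the HTTP header of the form page, and additionally normalize whatever arrives before mailing it. The helper below is hypothetical and assumes the mbstring extension; it treats any input that is valid UTF-8 as UTF-8, which is a heuristic, and characters with no ISO-8859-1 equivalent are replaced by mbstring's substitute character.

```php
<?php
// On the page that shows the form, the charset can be declared
// explicitly before any output, for example:
//   header('Content-Type: text/html; charset=ISO-8859-1');
// and on the form itself: <form accept-charset="ISO-8859-1" ...>

// Hypothetical helper: bring submitted text to ISO-8859-1 so that it
// matches the charset announced in the mail headers.
function to_latin1(string $input): string
{
    // Text with accented letters in ISO-8859-1 is almost never valid
    // UTF-8, so valid UTF-8 input is assumed to really be UTF-8.
    if (mb_check_encoding($input, 'UTF-8')) {
        return mb_convert_encoding($input, 'ISO-8859-1', 'UTF-8');
    }
    return $input; // already single-byte text: leave untouched
}
```

With this in place, the same form submission yields the same bytes in the mail body regardless of which encoding the browser chose to send.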

If that is not the problem, consider using the class that I recommended
and see if you still have the problem.

http://www.phpclasses.org/mimemessage
--

Regards,
Manuel Lemos
Aug 24 '07 #11
