PHP - using mail() and unicode text - text gets disturbed

I have the following problem. On a website there's a (simple) feedback
form. This is used also by Polish visitors who (of course) type Polish
text using special characters.

However, when I receive the text in my mailbox, all special characters
have been turned into mess......

For example: "współprace" is turned into "wspóÅ,prace".

It seems PHP handles the UTF-8 strings quite well (when I
'echo' the strings on the site, I see the text correctly), until the
point where they are sent using mail().

Is this a server configuration issue? Or something else?

How can I get my text to remain in Unicode?

I have this problem both on my test server (Apache 1.3.28, PHP 4.3.2 on
Windows XP) and on my provider's server (Apache under Linux).
Hope somebody can help.

Many thanks,
Edo.
Jul 17 '05 #1
For example: "współprace" is turned into "wspóÅ,prace".

It seems PHP is handling the UTF-8 strings quite well


Are you setting the headers of the email to state something such as:

Content-Type: text/html; charset=iso-8859-15
Jul 17 '05 #2
It's an encoding issue. One way to deal with this is to escape the UTF-8
text as quoted-printable using imap_8bit() and set the charset in the email
header to UTF-8. Many email clients don't handle this correctly, though. I
would recommend sending multipart mails. In the plaintext part, remove the
accent marks (solidarność -> solidarnosc). In the HTML part, encode the
special characters as HTML entities (dokąd => dok&#261;d). This will ensure
that everyone sees something that's readable. The same strategy is used by
Outlook Express. It'll be helpful if you send yourself a test email and look
at the source.

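Here are a couple of functions that do what I suggested:

// UTF-8 byte sequences of the lowercase Polish letters => plain ASCII letters.
// Note: ó ("\xC3\xB3") and the uppercase letters are not covered.
$pl_markless_tr = array(
    "\xC4\x85" => "a",  // ą
    "\xC4\x87" => "c",  // ć
    "\xC4\x99" => "e",  // ę
    "\xC5\x82" => "l",  // ł
    "\xC5\x84" => "n",  // ń
    "\xC5\x9b" => "s",  // ś
    "\xC5\xba" => "z",  // ź
    "\xC5\xbc" => "z"); // ż

// The same letters => numeric HTML entities (decimal Unicode code points).
$pl_uni_entities_tr = array(
    "\xC4\x85" => "&#261;",
    "\xC4\x87" => "&#263;",
    "\xC4\x99" => "&#281;",
    "\xC5\x82" => "&#322;",
    "\xC5\x84" => "&#324;",
    "\xC5\x9b" => "&#347;",
    "\xC5\xba" => "&#378;",
    "\xC5\xbc" => "&#380;");

// For the plaintext part: strip the accent marks.
function remove_polish_marks($s) {
    global $pl_markless_tr;
    return strtr($s, $pl_markless_tr);
}

// For the HTML part: replace the letters with HTML entities.
function escape_polish_marks($s) {
    global $pl_uni_entities_tr;
    return strtr($s, $pl_uni_entities_tr);
}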
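
As a rough sketch of how the two lookups above could feed a multipart/alternative message: the helper names pl_markless_table(), pl_entities_table() and build_alternative_mail(), as well as the boundary string, are made up for illustration, and the tables are repeated (wrapped in functions) only so that the snippet runs on its own. Both parts are labelled charset=utf-8, which is an assumption on my side: after the replacement they are normally plain ASCII, and ASCII is a subset of UTF-8, so any letter the tables miss still arrives correctly labelled.

```php
<?php
// Lowercase Polish letters (UTF-8 bytes) => ASCII, as in the post above.
function pl_markless_table()
{
    return array(
        "\xC4\x85" => "a", "\xC4\x87" => "c", "\xC4\x99" => "e", "\xC5\x82" => "l",
        "\xC5\x84" => "n", "\xC5\x9b" => "s", "\xC5\xba" => "z", "\xC5\xbc" => "z");
}

// The same letters => numeric HTML entities.
function pl_entities_table()
{
    return array(
        "\xC4\x85" => "&#261;", "\xC4\x87" => "&#263;", "\xC4\x99" => "&#281;",
        "\xC5\x82" => "&#322;", "\xC5\x84" => "&#324;", "\xC5\x9b" => "&#347;",
        "\xC5\xba" => "&#378;", "\xC5\xbc" => "&#380;");
}

// Hypothetical helper: returns array($headers, $body) ready to pass to mail().
function build_alternative_mail($text, $boundary)
{
    // Plaintext part: accent marks removed.
    $plain = strtr($text, pl_markless_table());

    // HTML part: escape HTML metacharacters first, then use entities.
    $html = '<html><body><p>'
          . strtr(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'), pl_entities_table())
          . '</p></body></html>';

    $headers = "MIME-Version: 1.0\r\n"
             . 'Content-Type: multipart/alternative; boundary="' . $boundary . '"';

    // The plain part comes first; clients show the last part they can render.
    $body = "--$boundary\r\n"
          . "Content-Type: text/plain; charset=utf-8\r\n"
          . "Content-Transfer-Encoding: 8bit\r\n\r\n"
          . $plain . "\r\n"
          . "--$boundary\r\n"
          . "Content-Type: text/html; charset=utf-8\r\n"
          . "Content-Transfer-Encoding: 8bit\r\n\r\n"
          . $html . "\r\n"
          . "--$boundary--\r\n";

    return array($headers, $body);
}

// Hypothetical usage (address, subject and boundary are placeholders):
// list($headers, $body) = build_alternative_mail($message, 'feedback-boundary-1');
// mail('webmaster@example.com', 'Feedback form', $body, $headers);
```

The boundary must be a string that does not occur in the message text itself, so in practice it would probably be generated per message rather than hard-coded.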
Jul 17 '05 #3
On Sun, 1 Feb 2004 15:33:30 -0000, "Filth" <p.*********@blueyonder.co.uk>
wrote:
For example: "współprace" is turned into "wspóÅ,prace".

It seems PHP is handling the UTF-8 strings quite well


are you setting up the headers of the email to state something such as

Content-Type: text/html;charset=iso-8859-15


Content-Type: text/plain;charset=utf-8

... sounds like the more appropriate header to send in this case.
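
A minimal sketch of passing such a header to mail(), assuming the form text is already UTF-8. The function name utf8_mail_headers(), the 8bit transfer encoding, and the address and subject in the usage comment are illustrative assumptions, not something taken from the original form script.

```php
<?php
// Build the additional headers that tell the mail client the body is UTF-8.
// $contentType would be 'text/plain' or 'text/html', depending on the body.
function utf8_mail_headers($contentType = 'text/plain')
{
    $headers = array(
        'MIME-Version: 1.0',
        'Content-Type: ' . $contentType . '; charset=utf-8',
        'Content-Transfer-Encoding: 8bit',
    );
    // mail() expects additional headers to be separated by CRLF.
    return implode("\r\n", $headers);
}

// Hypothetical usage (address and subject are placeholders):
// mail('webmaster@example.com', 'Feedback form', $message, utf8_mail_headers());
```

Without a charset in the Content-Type header, the receiving client has to guess the encoding, which is the likely reason the UTF-8 bytes showed up as separate Latin characters.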

--
Andy Hassall <an**@andyh.co.uk> / Space: disk usage analysis tool
<http://www.andyh.co.uk> / <http://www.andyhsoftware.co.uk/space>
Jul 17 '05 #4
On Sun, 1 Feb 2004 15:33:30 -0000, "Filth"
<p.*********@blueyonder.co.uk> wrote:
For example: "współprace" is turned into "wspóÅ,prace".

It seems PHP is handling the UTF-8 strings quite well


are you setting up the headers of the email to state something such as

Content-Type: text/html;charset=iso-8859-15

Thanks, this did the trick, except the header should contain:

"Content-Type: text/html; charset=UNICODE-1-1-UTF-8"

Cheers,
Edo.
Jul 17 '05 #5
On Sun, 1 Feb 2004 12:06:26 -0500, "Chung Leong"
<ch***********@hotmail.com> wrote:
It's an encoding issue. One way to deal with this is to escape the UTF-8
text as quoted-printable using imap_8bit() and set the charset in the email
header to UTF-8. Many email clients don't handle this correctly, though. I
would recommend sending multipart mails. In the plaintext part, remove the
accent marks (solidarność -> solidarnosc). In the HTML part, encode the
special characters as HTML entities (dokąd => dok&#261;d). This will ensure
that everyone sees something that's readable. The same strategy is used by
Outlook Express. It'll be helpful if you send yourself a test email and look
at the source.

Here are a couple functions that do what I suggested:

$pl_markless_tr = array(
"\xC4\x85" => "a",
"\xC4\x87" => "c",
"\xC4\x99" => "e",
"\xC5\x82" => "l",
"\xC5\x84" => "n",
"\xC5\x9b" => "s",
"\xC5\xba" => "z",
"\xC5\xbc" => "z");

$pl_uni_entities_tr = array(
"\xC4\x85" => "&#261;",
"\xC4\x87" => "&#263;",
"\xC4\x99" => "&#281;",
"\xC5\x82" => "&#322;",
"\xC5\x84" => "&#324;",
"\xC5\x9b" => "&#347;",
"\xC5\xba" => "&#378;",
"\xC5\xbc" => "&#380;");

function remove_polish_marks($s) {
global $pl_markless_tr;
return strtr($s, $pl_markless_tr);
}

function escape_polish_marks($s) {
global $pl_uni_entities_tr;
return strtr($s, $pl_uni_entities_tr);
}


Thanks, very interesting method. For the time being, the email client
used by the receiver of the web forms is capable of handling the
Unicode text, so I'll stick to just using a header which enables
Unicode text.

However, I'll definitely save and check your method; it might be very
useful in the future.

Dziękuję i do widzenia (thank you and goodbye) :-)
Edo.
Jul 17 '05 #6
On Sun, 01 Feb 2004 18:20:19 +0000, Andy Hassall <an**@andyh.co.uk>
wrote:

Content-Type: text/plain;charset=utf-8

... sounds like the more appropriate header to send in this case.


Thx, found that out myself, but appreciate your input.

Edo.
Jul 17 '05 #7
