CDONTS or CDOSYS UTF-8 Email

Jed
I have a form that needs to handle international characters within the UTF-8
character set. I have tried all the recommended strategies for getting utf-8
characters from form input to email message and I cannot get it to work. I
need to stay with classic asp for this.

Here are some things I tried:

'CDONTS
Call msg.SetLocaleIDs(65001)

'CDOSYS
msg.HTMLBodyPart.Charset = "utf-8"

I included the following meta tag in the email HTML:
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

I also tried modifying the CharSet and CodePage of all involved Request and
Responses.

I was able to Response.Write the form content on post back to the screen and
it was properly rendered. However, none of my efforts can get the email to
render with the correct encoding. I have tried opening the email in Outlook
and Thunderbird. Neither one picks up on the UTF-8 charset meta tag.

Any help or link to tutorial would help so much.

Thanks.

Nov 8 '06 #1

Anthony Jones
Mixing charsets in a message is a real minefield. Try this before writing
any content to the message:-

oMsg.BodyPart.charset = "UTF-8"

Where oMsg is a CDOSYS message object (CDONTS is deprecated; don't write new
code against it).

That will make all text parts use UTF-8 encoding.

Anthony.

Nov 8 '06 #2
Jed
Hey, Anthony,

Thanks for the suggestion. I was optimistic about its potential, but it
doesn't seem to make a difference.

Here is my code:
msg.BodyFormat = 0 'Set body text to HTML=0 TEXT=1
msg.MailFormat = 0 'Set format to MIME=0 TEXT=1
'Call msg.SetLocaleIDs(65001)
msg.Body = Message
msg.Send
'Try writing the contents to the browser to see if the string is bad
Response.Clear
'Response.CodePage = 65001
Response.CharSet = "utf-8"
Response.Write Message
Response.End

Basically, if I post a character like ú [u with an accent mark] it will
render fine in the browser, but in the email it will appear as Ãº [A with
tilde over it, followed by a superscript o]. I think that is the ANSI
equivalent or something.

I have read "The Absolute Minimum Every Software Developer Absolutely,
Positively Must Know About Unicode and Character Sets (No Excuses!)"
[http://www.joelonsoftware.com/articles/Unicode.html] but it doesn't seem
to shed any light on why this isn't working.

Hmm..

Nov 8 '06 #3
Jed
Actually, this is the CDOSYS code I tried.

msg.BodyPart.Charset = "utf-8"
msg.HTMLBody = Message
msg.HTMLBodyPart.Charset = "utf-8"
msg.Send

I accidentally copied the CDONTS code in the last post.

Nov 8 '06 #4
Hello,

Is the CDOSYS code executed in an ASP application? You may try sending a plain
text email instead of the HTML email, like:

msg.BodyPart.Charset = "UTF-8"
msg.TextBody = Message
msg.TextBodyPart.Charset = "UTF-8"
msg.Send

Can you receive the correct characters in the email in plain text format?

Sincerely,

Luke Zhang

Microsoft Online Community Support

Nov 9 '06 #5

Anthony Jones
Try this in a VBScript file:-

Option Explicit

' CDO configuration field names
Const cdoSendUsingMethod = "http://schemas.microsoft.com/cdo/configuration/sendusing"
Const cdoFlushBuffersOnWrite = "http://schemas.microsoft.com/cdo/configuration/flushbuffersonwrite"
Const cdoSMTPServerPickupDirectory = "http://schemas.microsoft.com/cdo/configuration/smtpserverpickupdirectory"
Const cdoSendUsingPickup = 1 ' write the message to a pickup folder

Dim oMsg : Set oMsg = CreateObject("CDO.Message")

Set oMsg.Configuration = CreateObject("CDO.Configuration")

With oMsg.Configuration.Fields
    .Item(cdoSendUsingMethod) = cdoSendUsingPickup
    .Item(cdoFlushBuffersOnWrite) = True
    .Item(cdoSMTPServerPickupDirectory) = "G:\temp\pickup" '*** change this
    .Update
End With

' Set the charset on the root body part before writing any content
oMsg.BodyPart.charset = "UTF-8"

oMsg.From = "Du**@somewhere.com"
oMsg.To = "Bl***@elsewhere.com"
oMsg.Subject = "Testing"
oMsg.HTMLBody = "<html><body>£</body></html>"

oMsg.Send

MsgBox "Done"

Change the pickup folder to a temp folder on your machine.

Once it has run, open the resulting .eml file in Outlook Express (double-click
it). Does the £ appear correctly, without other strange characters?

Open the .eml file in Notepad; you should see something like:-

X-Receiver: Bl***@elsewhere.com
X-Sender: Du**@somewhere.com
From: <Du**@somewhere.com>
To: <Bl***@elsewhere.com>
Subject: Testing
Date: Sun, 12 Nov 2006 19:46:27 -0000
MIME-Version: 1.0
Content-Type: multipart/alternative;
    boundary="----=_NextPart_000_0001_01C70693.3DE9F350"
Content-Class: urn:content-classes:message

This is a multi-part message in MIME format.

------=_NextPart_000_0001_01C70693.3DE9F350
Content-Type: text/plain;
    charset="UTF-8"
Content-Transfer-Encoding: base64

wqPigqzFkg0K

------=_NextPart_000_0001_01C70693.3DE9F350
Content-Type: text/html;
    charset="UTF-8"
Content-Transfer-Encoding: 8bit

<html><body>£</body></html>
------=_NextPart_000_0001_01C70693.3DE9F350--

I deleted some headers for clarity. However, you can see that specifying
UTF-8 on the main message body part before writing anything to the message
has caused the UTF-8 encoding to cascade to the alternative parts.
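
As a cross-check outside CDO, the base64 payload of the text/plain part in the sample above can be decoded by hand. The following is a minimal sketch in Python (used here only for illustration; the thread itself is VBScript). The payload string is copied from the sample; the variable names are made up for this example.

```python
import base64

# Payload of the text/plain part, copied from the sample .eml above.
payload = "wqPigqzFkg0K"

raw = base64.b64decode(payload)   # the bytes stored in the text part
text = raw.decode("utf-8")        # interpret them using the declared charset

print(raw.hex(" "))   # c2 a3 e2 82 ac c5 92 0d 0a
print(repr(text))
```

The bytes are valid UTF-8 and decode to a pound sign, a euro sign and a capital OE ligature followed by CRLF, so the part matches its declared charset. (That is more than the single £ visible in the HTML part as reproduced here.)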

What happens if you change the code so that the configuration sends using port
25 to your SMTP server and you specify your real email address as the
receiver? Does the email look OK when it arrives in Outlook/Thunderbird?


Nov 12 '06 #6
Jed
Thanks for the input Anthony,

I wrote out the email as you indicated and indeed the headers are UTF-8 but
the text is wrong:

This:
msg.BodyPart.Charset = "UTF-8"
msg.TextBody = Message
msg.TextBodyPart.Charset = "UTF-8"
msg.HTMLBody = Message
msg.HTMLBodyPart.Charset = "UTF-8"

Yields this:

------=_NextPart_000_0001_01C70806.B7C0CB80
Content-Type: text/plain;
    charset="UTF-8"
Content-Transfer-Encoding: 8bit

------=_NextPart_000_0001_01C70806.B7C0CB80
Content-Type: text/html;
    charset="UTF-8"
Content-Transfer-Encoding: quoted-printable

When I just set the BodyPart.Charset = "UTF-8" and set the HTMLBody =
Message then I get the following text in the text version of the email.

=C3=83=C2=BA

Using Notepad++
Start in ANSI
Convert the chars above from Hex to Text
(Plugins TextFX Convert Hex to Text)
Then switch to UTF-8
You get the chars in the email

Then cut the characters
Switch to ANSI mode
Paste the characters
Switch to UTF-8
And you get the character that is supposed to be there.
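
The Notepad++ steps above can be reproduced in code. This is a sketch in Python (chosen only for illustration), assuming the "ANSI" code page on the machine is Windows-1252; the thread never confirms which code page was in use.

```python
import quopri

# Quoted-printable payload copied from the text part of the email.
qp = b"=C3=83=C2=BA"

raw = quopri.decodestring(qp)       # four bytes: c3 83 c2 ba
as_seen = raw.decode("utf-8")       # the two characters the mail client shows
# Assumption: the ANSI code page is Windows-1252 (cp1252).
inner = as_seen.encode("cp1252")    # back to two bytes: c3 ba
intended = inner.decode("utf-8")    # the character that was typed

print(raw.hex(" "), "->", as_seen)
print(inner.hex(" "), "->", intended)
```

Two UTF-8 decoding passes are needed to get back to ú, which is consistent with the character having been UTF-8 encoded twice somewhere between the form and the message.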

I don't get it. Any ideas what I am doing wrong?

Nov 14 '06 #7

Anthony Jones

"Jed" wrote:
> Thanks for the input Anthony,
>
> I wrote out the email as you indicated and indeed the headers are UTF-8
> but the text is wrong:

Before we go any further, did you paste my code verbatim into a VBS file?
(Because what you posted below isn't what I posted.)
Did you then open it in Outlook Express, and did it look right?

> This:
> msg.BodyPart.Charset = "UTF-8"

Don't do this:-

> msg.TextBody = Message
> msg.TextBodyPart.Charset = "UTF-8"
> msg.HTMLBody = Message

Don't do this either:-

> msg.HTMLBodyPart.Charset = "UTF-8"
>
> When I just set the BodyPart.Charset = "UTF-8" and set the HTMLBody =
> Message then I get the following text in the text version of the email.
>
> =C3=83=C2=BA
>
> I don't get it. Any ideas what I am doing wrong?

It would help if I knew what character this is supposed to be. ú?
What ANSI code page are you using, and what are the char codes for these
characters in that code page?
Are you certain the character isn't already corrupted?
The fact that 4 octets have appeared in the output suggests to me that the
character is going through the UTF-8 encoding twice.
Is this in ASP?
Are you posting from a UTF-8 encoded HTML form?

Nov 14 '06 #8
Jed
Hi Anthony,

I have a good feeling that you will be able to help me get to the bottom of
this.

Let me answer your questions.

"Anthony Jones" wrote:
> Before we go any further did you paste my code verbatim into a VBS? (Cos
> what you posted below isn't what I posted)

Yes. I tried it exactly as you recommended, then I tried some other things.

> Did you then open it in outlook express and did it look right?

Yes. I opened the eml in Outlook and it did not look right.

> It would help if I knew what character this is supposed to be? ú ?

Yes. You are correct about the character code. I would have pasted it in
my message but I was not confident that it would come out right in the post.

> What ANSI codepage are you using and what are the char codes for these
> characters in that code page?

I don't know what ANSI code page Notepad++ uses. I am guessing the default
for my localization settings in Windows.

> Are you certain the character isn't already corrupted?

I don't know, but when I write the results out to the web page using
Response.Write(Message) I get the correct characters.

Response.Clear
'I have heard you need the following, but it seems to
' render fine in the browser without it
'Response.CodePage = 65001
Response.CharSet = "utf-8"
Response.Write Message
Response.End

> The fact that 4 octets have appeared in the output suggests to me that the
> character is going through the UTF-8 encoding twice?

This is possible, I guess. I don't know.

> Is this in ASP?

Yes. This is a classic ASP page handling the request using the standard ASP
ISAPI DLL in IIS 6.

> Are you posting from a UTF-8 encoded HTML form?

I believe so. I put the following in the HTML of the form page:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Does that make any sense?

Nov 14 '06 #9

Anthony Jones

"Jed" wrote:
>> Before we go any further did you paste my code verbatim into a VBS?
>> (Cos what you posted below isn't what I posted)
>
> Yes. I tried it exactly as you recommended then I tried some other
> things.
>
>> Did you then open it in outlook express and did it look right?
>
> Yes. I opened the eml in outlook and it did not look right.

Was that after you 'tried some things' or before?

Didn't it contain simply a British pound sign (£)?
How did the contents of the eml file created by my original code differ from
the contents I posted along with the code?

>> What ANSI codepage are you using and what are the char codes for these
>> characters in that code page?
>
> I don't know what ANSI code page Notepad++ uses. I am guessing the
> default for my localization settings in windows.

Yes, it uses the localization settings.

>> Are you certain the character isn't already corrupted?
>
> I don't know, but when I write the results out to the web page using
> Response.Write(Message) I get the correct characters.
>
> Response.Clear
> 'I have heard you need the following, but it seems to
> ' render fine in the browser without it
> 'Response.CodePage = 65001
> Response.CharSet = "utf-8"
> Response.Write Message
> Response.End
I believe your problem hinges on a couple of little-understood facts.

First, Response.CodePage affects the way posted characters received in the
Request are converted to Unicode. In other words, if the response code page
is set to a standard ANSI character set, then any characters received in a
form post are assumed to be in that same ANSI character set.

Second, a browser encodes characters into a form post according to the
charset of the page. Hence a Content-Type specifying a charset of UTF-8
causes characters in the form fields to be encoded as UTF-8 when posted.

Combining these facts: if a UTF-8 page posts characters to an ASP target
that reads the form fields whilst Response.CodePage is set to an ANSI code
page, each byte of a multibyte UTF-8 character is treated as an individual
character.

The code above hides this problem because Response.Write assumes it is
sending ANSI, turning each of those characters back into the single byte it
came from, while the CharSet tells the browser it is getting UTF-8, which
reverses the problem.
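
The effect described above can be modelled outside ASP. The sketch below is Python, used purely for illustration; `read_form`, `response_write` and `CODEPAGES` are hypothetical stand-ins for what ASP does internally, not real ASP or CDO APIs, and Windows-1252 is assumed as the ANSI code page.

```python
# Hypothetical model of the ASP behaviour described above; not a real API.
CODEPAGES = {1252: "cp1252", 65001: "utf-8"}  # code page number -> Python codec


def read_form(posted: bytes, codepage: int) -> str:
    """Turn posted bytes into a Unicode string, as reading a form field
    would under the given Response.CodePage."""
    return posted.decode(CODEPAGES[codepage])


def response_write(text: str, codepage: int) -> bytes:
    """Turn a Unicode string into bytes, as Response.Write would under
    the given Response.CodePage."""
    return text.encode(CODEPAGES[codepage])


posted = "\u00fa".encode("utf-8")       # ú posted from a UTF-8 page: c3 ba

# Code page left at ANSI (assumed 1252): each byte becomes one character.
wrong = read_form(posted, 1252)
browser = response_write(wrong, 1252)   # c3 ba again, so the browser shows ú
email = wrong.encode("utf-8")           # message charset is UTF-8: c3 83 c2 ba

# Code page set to 65001 before reading the form: the string is correct.
right = read_form(posted, 65001)
fixed_email = right.encode("utf-8")     # c3 ba

print(browser.hex(" "), "|", email.hex(" "), "|", fixed_email.hex(" "))
```

In this model the browser round trip cancels the wrong code page, while the UTF-8 encoding applied to the message does not, which matches the four octets seen in the email.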

>> The fact that 4 octets have appeared in the output suggests to me that
>> the character is going through the UTF-8 encoding twice?
>
> This is possible, I guess. I don't know.
>
>> Is this in ASP?
>
> Yes. This is a classic asp page handling the request using the standard
> asp ISAPI dll in IIS 6.
>
>> Are you posting from a UTF-8 encoded HTML form?
>
> I believe so. I put the following in the HTML of the form page:
>
> <meta http-equiv="Content-Type" content="text/html; charset=utf-8">

Yeah, don't do that. Use the Charset and ContentType properties of the
Response object.

> Does that make any sense?

Yes. When receiving a form post from a UTF-8 page, make sure your
Response.CodePage is set to 65001 before you attempt to read any form
fields.

Anthony.
Nov 15 '06 #10
Jed
You're a rock star Anthony! I assumed that the Response.Codepage only
affected the Response stream, but the fact that it also determines how the
Request items are read is good to know.

I run into encoding problems with XML too. One of these days I am going to
figure this out.

Thanks again.

Nov 16 '06 #11
