Hello, everyone...
I'm trying to send an email to people with non-ASCII characters in their
names. A recipient's address may look like:
"Jörg Nørgens" <joerg@nowhere>
My example code:
=================================
import smtplib
from email.header import Header
from email.mime.text import MIMEText

def sendmail(sender, recipient, body, subject):
    message = MIMEText(body)
    # Header() encodes the entire value it is given, address included
    message['Subject'] = Header(subject, 'iso-8859-1')
    message['From'] = Header(sender, 'iso-8859-1')
    message['To'] = Header(recipient, 'iso-8859-1')
    s = smtplib.SMTP()
    s.connect()
    s.sendmail(sender, recipient, message.as_string())
    s.close()
=================================
However, the Header() method encodes the whole expression, address included, as ISO-8859-1:
=?iso-8859-1?q?=22J=C3=B6rg_N=C3=B8rgens=22_=3Cjoerg=40nowhere=3E?=
I had expected something like this instead:
"=?utf-8?q?J=C3=B6rg?= =?utf-8?q?_N=C3=B8rgens?=" <joerg@nowhere>
Of course my mail transfer agent is not happy with the first string
although I see that Header() is just doing its job. I'm looking for a way
though to encode just the non-ASCII parts like any mail client does. Does
anyone have a recipe on how to do that? Or is there a method in
the "email" module of the standard library that does what I need? Or
should I split by regular expression to extract the email address
beforehand? Or a list comprehension to just look for non-ASCII characters
and Header() them? Sounds dirty.
Hints welcome.
Regards
Christoph
Christoph Haas wrote:
> Of course my mail transfer agent is not happy with the first string

Why "of course"? It seems that you are passing the Header object a
UTF-8 encoded string, not a Latin-1 encoded one.
You are telling the header the encoding, not asking it to encode.
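
To illustrate the difference between declaring an encoding and performing one, here is a minimal Python 3 sketch (the thread itself uses Python 2, and the variable names are made up for this example). It encodes the same name both ways and shows what a reader would see when UTF-8 bytes are labelled as Latin-1:

```python
name = "Jörg Nørgens"

# The same text produces different bytes in each encoding.
utf8_bytes = name.encode("utf-8")      # two bytes per accented letter
latin1_bytes = name.encode("latin-1")  # one byte per accented letter

# Labelling UTF-8 bytes as Latin-1 does not convert them;
# a reader that trusts the label simply decodes them wrongly.
mislabelled = utf8_bytes.decode("latin-1")

print(utf8_bytes)    # b'J\xc3\xb6rg N\xc3\xb8rgens'
print(latin1_bytes)  # b'J\xf6rg N\xf8rgens'
print(mislabelled)   # JÃ¶rg NÃ¸rgens
```

The =C3=B6 sequences in the header from the original post are the UTF-8 bytes of "ö" sitting inside a value labelled iso-8859-1.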
--
hilsen/regards Max M, Denmark http://www.mxm.dk/
IT's Mad Science
On Thursday 23 November 2006 16:31, Max M wrote:
> Christoph Haas wrote:
>> Of course my mail transfer agent is not happy with the first string
> Why "of course"?

Because my MTA doesn't care about MIME. It just transports the email. And
it expects an email address in <...> but doesn't decode =?iso...? strings.

> But it seems that you are passing the Header object a
> UTF-8 encoded string, not a Latin-1 encoded one.
> You are telling the header the encoding, not asking it to encode.

Uhm, okay. Let's see:
u'"Jörg Nørgens" <joerg@nowhere>'.encode('latin-1')
='"J\xc3\xb6rg N\xc3\xb8rgens" <joerg@nowhere>'
So far so good. Now run Header() on it:
='=?utf-8?b?IkrDtnJnIE7DuHJnZW5zIiA8am9lcmdAbm93aGVyZT4=?='
Still nothing like <...> in it and my MTA is unhappy again. What am I
missing? Doesn't anyone know how mail clients handle that encoding?
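
One way to see why no <...> shows up is to decode the header value again. The following is a small Python 3 sketch (an assumption, since the thread uses Python 2) using decode_header from the standard email.header module; the variable names are illustrative only:

```python
from email.header import decode_header

# The base64 encoded word from the post above
encoded = "=?utf-8?b?IkrDtnJnIE7DuHJnZW5zIiA8am9lcmdAbm93aGVyZT4=?="

# decode_header() returns a list of (bytes, charset) pairs
raw, charset = decode_header(encoded)[0]
text = raw.decode(charset)

print(charset)  # utf-8
print(text)     # "Jörg Nørgens" <joerg@nowhere>
```

The angle-bracketed address is still there, but it sits inside the encoded word, so software that does not decode MIME headers cannot see it.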
Desperately,
Christoph
Christoph Haas wrote:
> Of course my mail transfer agent is not happy with the first string
> although I see that Header() is just doing its job. I'm looking for a way
> though to encode just the non-ASCII parts like any mail client does. Does
> anyone have a recipe on how to do that? Or is there a method in
> the "email" module of the standard library that does what I need? Or
> should I split by regular expression to extract the email address
> beforehand? Or a list comprehension to just look for non-ASCII characters
> and Header() them? Sounds dirty.

Why dirty?
from email.Header import Header
from itertools import groupby

h = Header()
addr = u'"Jörg Nørgens" <joerg@nowhere>'

def is_ascii(char):
    return ord(char) < 128

for ascii, group in groupby(addr, is_ascii):
    h.append(''.join(group), "latin-1")

print h
=>
"J =?iso-8859-1?q?=F6?= rg N =?iso-8859-1?q?=F8?= rgens"
<joerg@nowhere>
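
For readers on Python 3, the grouping step of this recipe can be looked at in isolation. The sketch below only shows how groupby splits the address into ASCII and non-ASCII runs; it does not build the header, and split_ascii_runs is a name invented here for illustration:

```python
from itertools import groupby

def split_ascii_runs(text):
    """Return (is_ascii, run) pairs for consecutive runs of characters."""
    return [
        (is_ascii, "".join(chars))
        for is_ascii, chars in groupby(text, key=lambda c: ord(c) < 128)
    ]

runs = split_ascii_runs('"Jörg Nørgens" <joerg@nowhere>')
for is_ascii, run in runs:
    print(is_ascii, repr(run))
# True '"J'
# False 'ö'
# True 'rg N'
# False 'ø'
# True 'rgens" <joerg@nowhere>'
```

Because every run becomes its own chunk, a name such as Jörg is split inside the word, which is where the spaces around each encoded word in the output above come from.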
-- Leo
Christoph Haas wrote:
> Uhm, okay. Let's see:
> u'"Jörg Nørgens" <joerg@nowhere>'.encode('latin-1')
> ='"J\xc3\xb6rg N\xc3\xb8rgens" <joerg@nowhere>'
> So far so good. Now run Header() on it:
> ='=?utf-8?b?IkrDtnJnIE7DuHJnZW5zIiA8am9lcmdAbm93aGVyZT4=?='
> Still nothing like <...> in it and my MTA is unhappy again. What am I
> missing? Doesn't anyone know how mail clients handle that encoding?

>>> address = u'"Jörg Nørgens" <joerg@nowhere>'.encode('latin-1')
>>> address
'"J\xf6rg N\xf8rgens" <joerg@nowhere>'
>>> from email.Header import Header
>>> hdr = str(Header(address, 'latin-1'))
>>> hdr
'=?iso-8859-1?q?=22J=F6rg_N=F8rgens=22_=3Cjoerg=40nowhere=3E?='
Is this not correct?
At least roundtripping works:
>>> from email.Header import decode_header
>>> encoded, coding = decode_header(hdr)[0]
>>> encoded, coding
('"J\xf6rg N\xf8rgens" <joerg@nowhere>', 'iso-8859-1')
>>> encoded.decode(coding)
u'"J\xf6rg N\xf8rgens" <joerg@nowhere>'
And parsing the address works too.
>>> from email.Utils import parseaddr
>>> parseaddr(encoded.decode(coding))
(u'J\xf6rg N\xf8rgens', u'joerg@nowhere')
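
Combining the parseaddr step with the original question, one possible approach is to split the address, encode only the display name, and put the two parts back together. This is a minimal sketch, assuming Python 3.3 or later, where email.utils.formataddr accepts a charset argument; encode_address is a hypothetical helper name and not something from this thread:

```python
from email.utils import formataddr, parseaddr

def encode_address(address, charset="utf-8"):
    """Encode only the display name; leave the <addr> part readable."""
    name, addr = parseaddr(address)
    # formataddr() applies RFC 2047 encoding to the name only when
    # it contains non-ASCII characters; addr is passed through as-is.
    return formataddr((name, addr), charset=charset)

header_value = encode_address('"Jörg Nørgens" <joerg@nowhere>')
print(header_value)

# A bare address without a display name comes back unchanged.
print(encode_address("joerg@nowhere"))  # joerg@nowhere
```

The second element returned by parseaddr is the bare address, which is probably also the safer thing to hand to smtplib as the envelope recipient.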
--
hilsen/regards Max M, Denmark http://www.mxm.dk/
IT's Mad Science