MIME encoding change in Python 2.4.3 (or 2.4.2? 2.4.1?) - problem and solution

I have an application that processes MIME messages. It reads a message from a file,
looks for the text/html and text/plain parts in it, performs some processing on these
parts, and outputs the new message.

Ever since I upgraded my Python to 2.4.3, the output messages have been coming out
garbled, as a block of junk characters.

I traced the problem back to a few lines that were removed from the email package:
The new Python no longer encodes the payload when converting the MIME message to a
string with msg.as_string().

Since my program must work on several computers, each with a different version of
Python, I had to find a way to make it work correctly whether or not msg.as_string()
encodes the payload.

Here is a piece of code that demonstrates how to work around this problem:

................... code start ................
import email
import email.MIMEText
import email.Charset

def do_some_processing(s):
    """Return the input text or HTML string after processing it in some way."""
    # For the sake of this example, we only do some trivial processing.
    return s.replace('foo', 'bar')

msg = email.message_from_string(open('input_mime_msg', 'r').read())
utf8 = email.Charset.Charset('UTF-8')
for part in msg.walk():
    if part.is_multipart():
        # Multipart containers have no text payload of their own; skip them.
        continue
    if part.get_content_type() in ('text/plain', 'text/html'):
        # True means decode the payload, which is normally base64-encoded.
        s = part.get_payload(None, True)
        # s is now a string containing just the text or HTML of the part,
        # not encoded in any way.

        s = do_some_processing(s)

        # Starting with Python 2.4.3 or so, msg.as_string() no longer encodes the payload
        # according to the charset, so we have to do it ourselves here.
        # The trick is to create a message part with 'x' as payload and see if it got
        # encoded or not.
        should_encode = (email.MIMEText.MIMEText('x', 'html', 'UTF-8').get_payload() != 'x')
        if should_encode:
            s = utf8.body_encode(s)

        part.set_payload(s, utf8)
        # The next two lines may be necessary if the original input message uses a
        # different encoding than the one used in the email package. In that case we
        # have to replace the Content-Transfer-Encoding header to indicate the new encoding.
        del part['Content-Transfer-Encoding']
        part['Content-Transfer-Encoding'] = utf8.get_body_encoding()

................... code end ................
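
For readers who want to try the detection trick on its own, here is a minimal sketch of just the probe from the code above. It is not part of the original program: it assumes a current Python where the module is named email.mime.text (in Python 2.4 it is email.MIMEText), and payload_is_encoded_eagerly is a hypothetical helper name chosen for illustration.

```python
# Sketch only: uses the newer module name email.mime.text.
# The original post targets Python 2.4, where the module is email.MIMEText.
from email.mime.text import MIMEText


def payload_is_encoded_eagerly():
    """Return True if MIMEText stores the body already transfer-encoded.

    Hypothetical helper, not part of the email package. It repeats the
    probe from the post: build a part whose body is 'x' and check whether
    the stored payload is still the literal 'x'.
    """
    probe = MIMEText('x', 'html', 'UTF-8')
    return probe.get_payload() != 'x'


probe = MIMEText('x', 'html', 'UTF-8')
raw = probe.get_payload()                 # payload exactly as stored in the part
decoded = probe.get_payload(decode=True)  # payload with the transfer encoding undone

if payload_is_encoded_eagerly():
    print('Payload is stored encoded: %r' % raw)
else:
    print('Payload is stored as plain text: %r' % raw)
```

The result of the probe is what the main loop stores in should_encode, so the check could also be done once before the loop instead of once per part.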

Hope this helps someone out there.
(Permission is hereby granted for anybody to use this piece of code for any purpose whatsoever)
Nov 28 '06 #1