Question about email-handling modules

Hello,

I'm new to Python but have lots of programming experience in C, C++ and
Perl. Browsing through the docs, the email handling modules caught my eye
because I'd always wanted to write a script to handle my huge, ancient, and
partially corrupted email archives.

Of course I know that this kind of project shouldn't be tackled by a
beginner in a language, but I still thought I'd give it a spin.

So I wrote the stuff at the bottom. It lists senders, subjects and
addressees of all messages in an mbox.

Things that I don't understand:

1. Why can I get the 'subject' and 'from' header fields using the []
notation, but not 'to'? When I print message.keys() I get a list of all header
fields of the message, including 'to'. What's the difference between
message['to'] and message.get('to')?

2. Why can't I call the get_payload() method on the message? What I get is
this cryptic error: "AttributeError: Message instance has no attribute
'get_payload'". I'm trying to call a method here, not an attribute. It makes
no difference if I put parentheses after get_payload or not. I looked into
the email/Message module and found get_payload defined there.

I don't want to waste your time by requesting that you pick apart my silly
example. But maybe you can give me a pointer in the right direction. This is
python 2.4 on a Debian box.

---------------------------

#!/usr/bin/python
import mailbox
import email # doesn't make a difference
from email import Message # neither does this

mbox = file("mail.old/friends")

for message in mailbox.UnixMailbox(mbox):
    subject = message['subject']
    frm = message['from']
    # to = message['to'] # this throws a "Key Error"
    to = message.get('to'); # ...but this works
    print frm, "writes about", subject, "to", to
    # print message.get_payload() # this doesn't work

--------------------------

robert
Dec 20 '07 #1
Robert Latest wrote:
> 1. Why can I get the 'subject' and 'from' header fields using the []
> notation, but not 'to'? When I print message.keys() I get a list of all header
> fields of the message, including 'to'. What's the difference between
> message['to'] and message.get('to')?
On dicts, and presumably on Messages too, .get returns a default value
(None, or another value that you specify with .get("key", "default")) if
the key doesn't exist.

I can't say why ['to'] doesn't work when it's in the list of keys, though.
> 2. Why can't I call the get_payload() method on the message? What I get is
> this cryptic error: "AttributeError: Message instance has no attribute
> 'get_payload'". I'm trying to call a method here, not an attribute. It makes
> no difference if I put parentheses after get_payload or not. I looked into
> the email/Message module and found get_payload defined there.
Methods are attributes. When you do "obj.method()", "obj.method" and
"()" are really two separate things: It gets the "method" attribute of
"obj", and then calls it.
(Oops, I wrote this like half an hour ago, but I never sent it.)
--
Dec 20 '07 #2
On Thu, 20 Dec 2007 09:31:10 +0000, Robert Latest wrote:
> 1. Why can I get the 'subject' and 'from' header fields using the []
> notation, but not 'to'? When I print message.keys() I get a list of all
> header fields of the message, including 'to'. What's the difference
> between message['to'] and message.get('to')?
message['to'] looks up the key 'to', raising an exception if it doesn't
exist. message.get('to') looks up the key and returns a default value if
it doesn't exist.

See help(message.get) for more detail.
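
The difference can be seen on a plain dict. The following is only a minimal sketch in current Python 3 (not the 2.4 from the question), and the header values are made up for illustration. One thing worth knowing: the Message class from the email package, unlike the rfc822 one the script ends up with, returns None for a missing header instead of raising.

```python
from email.message import Message

# Made-up headers for illustration; note there is no 'to' key.
headers = {"subject": "hello", "from": "robert@example.invalid"}

# Subscript access raises KeyError when the key is missing.
try:
    headers["to"]
    subscript_result = "found"
except KeyError:
    subscript_result = "KeyError"

# .get() returns a default instead (None unless you pass one).
get_result = headers.get("to")
get_with_default = headers.get("to", "(no recipient)")

# email.message.Message is more forgiving than a dict:
# msg['to'] returns None for a missing header rather than raising.
msg = Message()
msg["Subject"] = "hello"
missing_header = msg["to"]
subject_any_case = msg["subject"]  # header lookup is case-insensitive
```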

> 2. Why can't I call the get_payload() method on the message? What I get
> is this cryptic error: "AttributeError: Message instance has no
> attribute 'get_payload'". I'm trying to call a method here, not an
> attribute. It makes no difference if I put parentheses after get_payload
> or not. I looked into the email/Message module and found get_payload
> defined there.
All methods are attributes (although the opposite is not the case), so if
a method doesn't exist, you will get an AttributeError.

The email.Message.Message class has a get_payload, but you're not using
that class. You're using mailbox.UnixMailbox, which returns an instance
of rfc822.Message which *doesn't* have a get_payload method.
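
To see why a missing method shows up as an AttributeError, here is a small sketch in current Python 3. The class Plain is hypothetical and merely stands in for a message class that has no get_payload.

```python
class Plain:
    """Hypothetical stand-in for a class without get_payload."""
    def greet(self):
        return "hi"

obj = Plain()

# "obj.greet()" is two steps: look up the attribute, then call it.
bound = getattr(obj, "greet")   # same as obj.greet
greeting = bound()

# Looking up a method that the class does not define fails at the
# lookup step, before any call happens, hence AttributeError.
try:
    obj.get_payload
    lookup = "found"
except AttributeError as exc:
    lookup = "AttributeError"
    error_text = str(exc)

# hasattr() performs the same lookup and reports whether it succeeded.
has_payload = hasattr(obj, "get_payload")
```

This is also why adding or dropping the parentheses made no difference: the lookup fails before the call is ever attempted.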

Damned if I can work out how to actually *use* the email module to read
an mbox mail box. I might have to RTFM :(

http://docs.python.org/lib/module-email.html
http://docs.python.org/lib/module-mailbox.html
*later*

Ah! The Fine Manual is some help after all. Try this:

# copied from http://docs.python.org/lib/mailbox-deprecated.html
import email
import email.Errors
import mailbox
def msgfactory(fp):
    try:
        return email.message_from_file(fp)
    except email.Errors.MessageParseError:
        # Don't return None since that will
        # stop the mailbox iterator
        return ''

fp = file('mymailbox', 'rb')
mbox = mailbox.UnixMailbox(fp, msgfactory)
for message in mbox:
    print message.get_payload()

But note that message.get_payload() will return either a string (for
single part emails) or a list of Messages (for multi-part messages).
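
One way to cope with both cases is to check is_multipart() and recurse. This is only a sketch in current Python 3; the helper name text_parts and the sample messages are made up for illustration, and it ignores transfer encodings and non-text parts.

```python
import email
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

def text_parts(message):
    """Return the raw payloads of all leaf parts as a flat list of strings."""
    if message.is_multipart():
        # get_payload() is a list of sub-messages here; recurse into each.
        parts = []
        for sub in message.get_payload():
            parts.extend(text_parts(sub))
        return parts
    # Single-part message: get_payload() is a string.
    return [message.get_payload()]

# A single-part message parsed from text.
single = email.message_from_string("Subject: one\n\njust text\n")

# A two-part message built in memory.
multi = MIMEMultipart()
multi["Subject"] = "two"
multi.attach(MIMEText("first part", "plain", "us-ascii"))
multi.attach(MIMEText("second part", "plain", "us-ascii"))

single_parts = text_parts(single)
multi_parts = text_parts(multi)
```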
--
Steven
Dec 20 '07 #3
Steven D'Aprano <st***@remove-this-cybersource.com.au> wrote:
> On Thu, 20 Dec 2007 09:31:10 +0000, Robert Latest wrote:
[snip most of question and helpful answer]
> But note that message.get_payload() will return either a string (for
> single part emails) or a list of Messages (for multi-part messages).
Note also that the mailbox module in python 2.5 is quite unlike the
mailbox module in python 2.4 so code written for the 2.4 mailbox will
be most unlikely to work under 2.5 without at least some changes.

At least that's my experience/understanding.

Also, from the way things currently work in the 2.5 version I think
there will (hopefully) be some more quite significant changes.
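
For comparison, the newer interface (the mailbox.mbox class, added in 2.5) looks roughly like this. The sketch below is written for current Python 3, builds a throwaway mbox in a temporary directory, and uses made-up addresses and a hypothetical helper make_message, so treat it as an illustration and not as a drop-in replacement for the 2.4 script.

```python
import mailbox
import os
import tempfile
from email.message import Message

def make_message(sender, subject, to=None):
    """Hypothetical helper: build a small message, optionally without To."""
    msg = Message()
    msg["From"] = sender
    msg["Subject"] = subject
    if to is not None:
        msg["To"] = to
    msg.set_payload("body of " + subject + "\n")
    return msg

tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "friends.mbox")

# mailbox.mbox creates the file if it does not exist yet.
box = mailbox.mbox(path)
box.add(make_message("a@example.invalid", "first", to="b@example.invalid"))
box.add(make_message("c@example.invalid", "second"))  # no To header
box.flush()

# Iterating yields email-style messages, so get_payload() is available
# and a missing header can be given a default with .get().
summary = []
for message in box:
    summary.append(
        (message["from"], message["subject"], message.get("to", "(nobody)"))
    )
box.close()
```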

--
Chris Green
Dec 20 '07 #4
Steven D'Aprano wrote:
> message['to'] looks up the key 'to', raising an exception if it doesn't
> exist. message.get('to') looks up the key and returns a default value if
> it doesn't exist.
Ah, so the [] notation got hung up on some message right at the beginning
and didn't even let the script continue. Makes sense.
> All methods are attributes (although the opposite is not the case), so if
> a method doesn't exist, you will get an AttributeError.
I see. I've already gathered that Python likes to use different words for
common things (attribute instead of member or method).
> Damned if I can work out how to actually *use* the email module to read
> an mbox mail box. I might have to RTFM :(
Yeah, I think I haven't picked the right module to get started with.
> But note that message.get_payload() will return either a string (for
> single part emails) or a list of Messages (for multi-part messages).
Yes, I did note that.

Thanks for the tips (also to the others who have answered).

Python looks like fun though. Maybe I should try to tackle some other
problem first.

robert
Dec 20 '07 #5
On Dec 20, 4:15 pm, Robert Latest <boblat...@yahoo.com> wrote:
> Steven D'Aprano wrote:
> [...]
> > All methods are attributes (although the opposite is not the case), so if
> > a method doesn't exist, you will get an AttributeError.
>
> I see. I've already gathered that Python likes to use different words for
> common things (attribute instead of member or method).
...we were hoping it would make you feel comfy when coming from
Perl ;-)

On a more serious note, Python is actually quite semantically regular:
a dot (.) always means the same thing, as do parens. It might not be
immediately obvious exactly _what_ it means if you're coming from
languages that confuse issues with syntactic sweetness.

When you see code that says

foo.bar(baz)

there are two distinct operations happening, namely

tmp = foo.bar # attribute lookup (the dot-operator)
tmp(baz) # function call (the paren-operator)

This gives you insight into one of the first optimizations you can try
if a loop is a bottleneck:

for i in range(100000):
    foo.bar(i) # profiling says this is a bottleneck

attribute lookup hoisting optimization:

tmp = foo.bar # move attribute lookup outside the loop
for i in range(100000):
    tmp(i)
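
A self-contained version of the same idea, as a sketch in current Python 3 with a hypothetical class Foo. It only shows that both loops do the same work; whether the hoisted form is measurably faster depends on the interpreter version, so profile before relying on it.

```python
class Foo:
    """Hypothetical class whose method records what it was called with."""
    def __init__(self):
        self.seen = []

    def bar(self, value):
        self.seen.append(value)

plain = Foo()
for i in range(1000):
    plain.bar(i)          # attribute lookup happens on every iteration

hoisted = Foo()
bar = hoisted.bar         # bound method looked up once, outside the loop
for i in range(1000):
    bar(i)                # only the call remains inside the loop
```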

in the interest of full disclosure, I should probably mention that I'm
of course lying to you ;-) You can override both attribute lookup and
function call in your own classes, but (a) that shouldn't be important
to you at this point *wink*, and (b) at that level Python is quite
semantically regular.
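
For the curious, overriding both operations might look like the sketch below (current Python 3; the class Recorder is made up for illustration). __getattr__ hooks into the dot when normal lookup fails, and __call__ hooks into the parens.

```python
class Recorder:
    """Hypothetical class that customises both the dot and the parens."""
    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        # Invoked only when normal attribute lookup fails.
        def stub(*args):
            self.calls.append((name, args))
            return name
        return stub

    def __call__(self, *args):
        # Invoked when the instance itself is called like a function.
        self.calls.append(("__call__", args))
        return len(args)

rec = Recorder()
dotted = rec.anything(1, 2)   # lookup goes through __getattr__, then a call
called = rec("a", "b", "c")   # the call goes through __call__
```

Even here the two steps stay separate: rec.anything is looked up first, and only then is the result called.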

-- bjorn
Dec 21 '07 #6
