simple script to read and parse mailbox

Hi, I'm trying to parse a specific user's mailbox (testwwws) and output
the body of the messages to a file. That file will then be loaded into a
PostgreSQL DB at some point.

I have read the email posts and been advised to use the email module
and the mailbox module.

Below is the blurb from a member of this list. I'm not at work at the moment, so
I can't test this out, but could someone take a look and check that
I'm on the right track? This Monday I need to show my boss and get the
body data out of the user's mailbox.

**Blurb from a member who's directed me**

Start with the mailbox and email modules. Mailbox lets you iterate over a
mailbox, yielding individual messages of type email. The email object lets
you parse and operate on the message components. From there you should be
able to extract your data.


#!/usr/bin/env python
## The email message is read as flat text from a file or other source;
## the text is parsed to produce the object structure of the email message.
import os
import email
import email.Parser
import mailbox

# email package for managing email messages
# Open the user's mailbox
#mbox = mailbox.UnixMailbox(open("/var/spool/mail/chucka"))

def main():
    # The directory that will contain the survey results
    dir = "/tmp/SurveyResults/"

    # The web survey user's inbox
    maildir = "/home/testwwws/Mail/inbox"

    for file in os.listdir(maildir):
        print os.path.join(maildir, file)

        # Parse one message file into a message object
        fp = open(os.path.join(maildir, file), "rb")
        p = email.Parser.Parser()
        msg = p.parse(fp)
        fp.close()
        #print msg.get("From")
        #print msg.get("Content-Type")

        counter = 1
        for part in msg.walk():
            # Skip multipart containers; they only hold other parts
            if part.get_main_type() == 'multipart':
                continue

            filename = part.get_param("name")
            if filename is None:
                filename = "part-%i" % counter
            counter += 1
            fp = open(os.path.join(dir, filename), 'wb')
            print os.path.join(dir, filename)
            fp.write(part.get_payload(decode=1))
            fp.close()

if __name__ == '__main__':
    main()
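
For comparison, here is a minimal Python 3 sketch of the same part-walking idea. It assumes the modern `email` API rather than the Python 2 calls used in the script above; the helper name `leaf_payloads`, the address, and the sample message are all made up for illustration.

```python
import email
from email.message import EmailMessage

def leaf_payloads(msg):
    """Return (content type, decoded bytes) for every non-multipart part."""
    parts = []
    for part in msg.walk():
        # Multipart containers only hold other parts, so skip them.
        if part.get_content_maintype() == "multipart":
            continue
        parts.append((part.get_content_type(), part.get_payload(decode=True)))
    return parts

# Hypothetical sample: a text body plus one small binary attachment.
sample = EmailMessage()
sample["From"] = "survey@example.com"
sample["Subject"] = "Survey result"
sample.set_content("answer=42\n")
sample.add_attachment(b"\x00\x01", maintype="application",
                      subtype="octet-stream", filename="data.bin")

# Round-trip through bytes, as if the message had been read from a file.
parsed = email.message_from_bytes(sample.as_bytes())
for content_type, payload in leaf_payloads(parsed):
    print(content_type, len(payload))
```

If only the text body is wanted, the attachment branch is unnecessary and a single `get_payload()` call on a non-multipart message is enough.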

Cheers all this list has been very helpful.
Jul 18 '05 #1
On Sat, 05 Jun 2004 15:27:36 +0100, chuck amadi
<ch*********@ntlworld.com> wrote:
> Hi, I'm trying to parse a specific user's mailbox (testwwws) and output
> the body of the messages to a file. That file will then be loaded into a
> PostgreSQL DB at some point.
>
> I have read the email posts and been advised to use the email module
> and the mailbox module.
>
> Below is the blurb from a member of this list. I'm not at work at the moment, so
> I can't test this out, but could someone take a look and check that
> I'm on the right track? This Monday I need to show my boss and get the
> body data out of the user's mailbox.
>
> **Blurb from a member who's directed me**
>
> Start with the mailbox and email modules. Mailbox lets you iterate over a
> mailbox, yielding individual messages of type email. The email object lets
> you parse and operate on the message components. From there you should be
> able to extract your data.

Hi again Chuck,

I've been reading a few of your posts and I'm wondering: do the
emails that you're parsing have binary attachments, like pictures and
stuff, or are you just trying to get the text of the body?

Or is it a little of both? It looks like you're expecting emails with
multiple binary attachments.

Other than that it looks good. You can access the header fields
directly, like:

print msg['From']

Saves you a little typing.
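
A small sketch of that header access, written for Python 3 rather than the Python 2 of this thread. The message text and the `X-Missing` header name are invented for the example.

```python
import email

raw = (
    "From: survey@example.com\n"
    "Subject: Survey result\n"
    "\n"
    "answer=42\n"
)
msg = email.message_from_string(raw)

# Item access and get() both return the header value, or None if absent.
print(msg["From"])
print(msg.get("Subject"))
print(msg["X-Missing"])

# get() can also supply a fallback for a header that is not there.
print(msg.get("X-Missing", "not set"))
```

Item access never raises for a missing header, so a `None` result is worth checking before using the value.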
<{{{*>


Jul 18 '05 #2
fishboy wrote:
> Hi again Chuck,
>
> I've been reading a few of your posts and I'm wondering: do the
> emails that you're parsing have binary attachments, like pictures and
> stuff, or are you just trying to get the text of the body?
>
> Or is it a little of both? It looks like you're expecting emails with
> multiple binary attachments.
>
> Other than that it looks good. You can access the header fields
> directly, like:
>
> print msg['From']
>
> Saves you a little typing.
> <{{{*>


Just trying to get the body of the messages.

I have built a DTML Zope web form that encapsulates the
survey data.
The testwwws user's mailbox will therefore have the results, which I
must parse and write to a file that I can then use to populate a
database.

Cheers, Chuck
Jul 18 '05 #3
fishboy wrote:
> Hi again Chuck,
>
> I've been reading a few of your posts and I'm wondering: do the
> emails that you're parsing have binary attachments, like pictures and
> stuff, or are you just trying to get the text of the body?
>
> Or is it a little of both? It looks like you're expecting emails with
> multiple binary attachments.
>
> Other than that it looks good. You can access the header fields
> directly, like:
>
> print msg['From']
>
> Saves you a little typing.
> <{{{*>


Well, I did hack most of the code. I was trying to use the mboxutils
module, but I could only get the headers. I assume from this script I
can get the text of the body. The reason I haven't tested it is that while at
work I started to write (oops, hack) the script and then emailed it home.
Because I use a POP3 account at home, I only have /var/spool/mail/Chucka, not
/home/User/Mail/inbox as at work, which I usually scan to view data in the inbox.

So please confirm that my hacked script will be able to parse the text
of the body (no attachments or binaries will exist within the email
messages).

Cheers for your help.

print msg['Body']

I just need the text of the body. But from your psi I can -
Jul 18 '05 #4
On Sun, 06 Jun 2004 11:15:22 +0100, chuck amadi
<ch*********@ntlworld.com> wrote:
> Well, I did hack most of the code. I was trying to use the mboxutils
> module, but I could only get the headers. I assume from this script I
> can get the text of the body. The reason I haven't tested it is that while at
> work I started to write (oops, hack) the script and then emailed it home.
> Because I use a POP3 account at home, I only have /var/spool/mail/Chucka, not
> /home/User/Mail/inbox as at work, which I usually scan to view data in the inbox.
>
> So please confirm that my hacked script will be able to parse the text
> of the body (no attachments or binaries will exist within the email
> messages).
>
> Cheers for your help.
>
> print msg['Body']
>
> I just need the text of the body. But from your psi I can -


Ah, the problem is far too simple for our complicated minds.
Just do:

body = msg.get_payload()

That will give you the plain-text message body of an email.

get_payload(decode=True) also undoes the Content-Transfer-Encoding
(base64 or quoted-printable), which matters mostly for binary stuff.
All that get_content_type(), get_param() stuff can be ignored if you're
just doing plain text.
The script you are adapting is for multiple binary attachments (like
pictures).
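
To make the difference concrete, here is a hedged Python 3 sketch (the thread itself is Python 2) using an invented base64-encoded message. It shows that `Body` is not a header, so `msg['Body']` finds nothing, while `get_payload()` returns the body text.

```python
import email

raw = (
    "From: survey@example.com\n"
    "Subject: Survey result\n"
    "Content-Type: text/plain; charset=us-ascii\n"
    "Content-Transfer-Encoding: base64\n"
    "\n"
    "YW5zd2VyPTQyCg==\n"
)
msg = email.message_from_string(raw)

# "Body" is not a header, so item access finds nothing.
print(msg["Body"])

# Without decode, the payload is the text exactly as transmitted.
print(repr(msg.get_payload()))

# With decode=True, the transfer encoding is undone and bytes come back.
print(msg.get_payload(decode=True))
```

For mail that is plain 7-bit text with no transfer encoding, the two `get_payload()` forms differ only in returning a string versus bytes.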

So, looking at the doc page for mailbox there's an interesting code
fragment:

import email
import mailbox
mbox = mailbox.UnixMailbox(fp, email.message_from_file)

So if your emails are all text/plain you could just write:

import email
import mailbox
fp = open("/var/spool/mail/chucka")
mbox = mailbox.UnixMailbox(fp, email.message_from_file)
bodies = []
for msg in mbox:
    body = msg.get_payload()
    bodies.append(body)

Which will leave you with a list of strings, each one a message body.
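
`UnixMailbox` belongs to Python 2. As a rough Python 3 counterpart, the sketch below uses `mailbox.mbox` instead. The helper name `read_bodies` and the throwaway mailbox it builds are assumptions made so the example can run anywhere; on a real system the path would be the user's spool file.

```python
import mailbox
import os
import tempfile

def read_bodies(path):
    """Collect the payload of every message in an mbox file."""
    mbox = mailbox.mbox(path, create=False)
    try:
        # For non-multipart messages get_payload() returns the body as a string.
        return [msg.get_payload() for msg in mbox]
    finally:
        mbox.close()

# Build a throwaway mbox so the sketch runs anywhere (hypothetical data).
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "inbox")
box = mailbox.mbox(path)
for n in (1, 2):
    m = mailbox.mboxMessage()
    m["From"] = "survey@example.com"
    m["Subject"] = "Survey %d" % n
    m.set_payload("answer=%d\n" % n)
    box.add(m)
box.flush()
box.close()

bodies = read_bodies(path)
print(bodies)
```

Passing `create=False` makes a wrong path fail loudly instead of silently creating an empty mailbox, which helps when an unexpectedly empty list comes back.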

msg = email.message_from_file(fileobj) does the same thing as

p = email.Parser.Parser()
msg = p.parse(fileobj)

It's just a shortcut, as is passing email.message_from_file to UnixMailbox as the handler.
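
The same equivalence can be checked in Python 3, where the parser class lives at `email.parser.Parser`. This is only a sketch with an invented message held in a `StringIO` object instead of a real file.

```python
import email
import io
from email.parser import Parser

raw = "From: survey@example.com\nSubject: Survey result\n\nanswer=42\n"

# The shortcut: one call on a file-like object.
short = email.message_from_file(io.StringIO(raw))

# The long form: build a Parser, then parse the same kind of object.
long_form = Parser().parse(io.StringIO(raw))

print(short["Subject"] == long_form["Subject"])
print(short.get_payload() == long_form.get_payload())
```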

You could also do:

fp = open("/var/spool/mail/chucka")
mbox = mailbox.UnixMailbox(fp)  # no handler
for mail in mbox:
    msg = email.message_from_file(mail)  # handle here
    body = msg.get_payload()

Hth,
<{{{*>


Jul 18 '05 #5

Hi all, especially fishboy. Here's the script; I'm just waiting to get
confirmation of where I'm going to run it from.

I have added output =('/tmp/SurveyResults','w+a'), which I believe will
write the message body data to this file for future work, i.e. database
loading.

Also, as I understand it, 'a' opens the file for appending, so any data written
to the file is automatically added to the end. Is this logical? I have tried to
add comments to aid my learning process, so bear with me.
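
One thing worth noting: a parenthesised pair such as ('/tmp/SurveyResults','w+a') is just a tuple of two strings, and nothing is opened until open() is called. In Python 3, "w+a" is also rejected as a mode. A minimal Python 3 sketch of appending bodies to a results file follows; the temporary path and sample bodies are stand-ins for the real ones.

```python
import os
import tempfile

# Hypothetical stand-in for /tmp/SurveyResults so the sketch is safe to run.
results_path = os.path.join(tempfile.mkdtemp(), "SurveyResults")

bodies = ["answer=1\n", "answer=2\n"]

# Mode "a" creates the file if needed and always writes at the end.
with open(results_path, "a") as output:
    for body in bodies:
        output.write(body)

# A second run appends rather than overwriting the earlier results.
with open(results_path, "a") as output:
    output.write("answer=3\n")

with open(results_path) as check:
    print(check.read())
```

Using `with` closes the file automatically, so no separate close() call is needed.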
chuck@sevenofnine:~/pythonScript> cat getSurveyMail.py
###############################################################
## This script will open and parse email messages body content.
## This Python script will reside on Mail Server on ds9:
## Emails are all text/plain you could just write the following
## Which will leave a list of strings, each one a message body.
## The Survey User is testwws and the .procmailrc file folder is
## Survey, i.e. /home/testwws/Mail/inbox/Survey .
###############################################################
## file:getSurveyMail.py Created : 06/06/04 Amended date: 07/06/04
###############################################################

#The following line makes it run itself(executable script on UN*X)
#!/usr/bin/env python

import sys
import os
import email
import mailbox

# Open the testwws user mailbox (tmp user chuck)
# fp denotes factory parameter

output =('/tmp/SurveyResults','w+a')
fp = open("/var/spool/mail/chuck")

#fp = open("/var/spool/mail/testwws")

# message_from_file returns a message object struct tree from an
# open file object.

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
# list of body messages.
bodies = []

# for loop iterates through the msg in the mbox(mailbox).
# Subparts of messages can be accessed via the -
# get_payload() method will return a string object.
# If it is multipart, use the "walk" method to iterate through each part and
# then get the payload. In our case it's not multipart, so ignore.
# for part in msg.walk():
#     msg = part.get_payload()
#     # do something(print)

for msg in mbox:
    body = msg.get_payload()
    bodies.append(body)
# Print to screen for testing purposes.
# print the bodies list of the messages.
print bodies
chuck@sevenofnine:~/pythonScript> vi getSurveyMail.py
chuck@sevenofnine:~/pythonScript> python getSurveyMail.py
[]

I assume the last line would list all the message bodies within the bodies list [].

Cheers for all your help list.
Jul 18 '05 #6
Sorry to bovver you again (again); here's the script.

I still can't see why get_payload() doesn't produce
the plain-text message bodies of the emails in the testwwws user's mailbox.
As you can see, I have tried a few things but no joy. What am I missing?

Cheers

Chuck

ds9:[pythonScriptMail] % cat getSurveyMail.py
###############################################################
## This script will open and parse email messages body content.
## This Python script will reside on Mail Server on ds9:
## Emails are all text/plain you could just write the following
## Which will leave a list of strings, each one a message body.
## The Survey User is testwws and the .procmailrc file folder is
## Survey, i.e. /home/testwws/Mail/inbox/Survey .
###############################################################
## file:getSurveyMail.py Created : 06/06/04 Amended date: 07/06/04
###############################################################

#The following line makes it run itself(executable script on UN*X)
#!/usr/bin/env python

import sys
import os
import email
import mailbox

# Open the testwws user mailbox (tmp user chuck)
# fp denotes factory parameter
# mode can be 'r' when the file will only be read, 'w' for only writing
# (an existing file with the same name will be erased), and 'a' opens the file
# for appending; any data written to the file is automatically added to the
# end.
# 'r+' opens the file for both reading and writing. The mode.
output =("/tmp/SurveyResults", "w+a")
#output =('/tmp/SurveyResults','w')

# open() returns a file object, and is most commonly used with two arguments:
# "open(filename, mode)".
# /home/testwwws/Mail/work
#
# fp: the file or file-like object passed at instantiation time. This can be
# used to read the message content.
fp = open("/var/spool/mail/testwwws")

#fp = open("/home/testwwws/Mail/work")

# message_from_file returns a message object struct tree from an
# open file object.

mbox = mailbox.UnixMailbox(fp, email.message_from_file)
# list of body messages.
bodies = []

msg = email.message_from_file(fp)
# for loop iterates through the msg in the mbox(mailbox).
# Subparts of messages can be accessed via the -
# get_payload() method will return a string object.
# If it is multipart, use the "walk" method to iterate through each part and
# then get the payload. In our case it's not multipart, so ignore.
# for part in msg.walk():
#     msg = part.get_payload()
#     # do something(print)

for msg in mbox:
    body = msg.get_payload()
    bodies.append(body)
# output.close() to close it and free up any system resources taken up by the
# open file.
# After calling output.close(), attempts to use the file object will
# automatically fail.
#print bodies
print fp
print msg
print msg['body']
# print body - NameError: name 'msg' is not defined
#
#print >> output,bodies
#output.close()
#print the bodies list of the messages.
print bodies

Jul 18 '05 #7
