Extract zip file from email attachment

Hi all,

I'm trying to extract a zip file (containing an XML file) from an email
so I can process it, but I keep running into brick walls.
I've been googling and reading all afternoon and can't seem to figure
it out.

Here is what I have so far.

from poplib import POP3
import email

p = POP3("mail.server.com")
print p.getwelcome()
# authentication, etc.
print p.user("USER")
print p.pass_("PASS")
print "This mailbox has %d messages, totaling %d bytes." % p.stat()
msg_list = p.list()
print msg_list
if not msg_list[0].startswith('+OK'):
    # Handle error
    exit(1)

for msg in msg_list[1]:
    msg_num, _ = msg.split()
    resp = p.retr(msg_num)
    if resp[0].startswith('+OK'):
        #print resp, '=======================\n'
        #extract message body and attachment.
        parsed_msg = email.message_from_string('\n'.join(resp[1]))
        payload = parsed_msg.get_payload(decode=True)
        print payload  # doesn't seem to work
    else:
        pass  # Deal with error retrieving message.

How do I:
a) retrieve the body of the email into a string so I can do some
processing? (I can get at the header attributes without any trouble)
b) retrieve the zip file attachment, and unzip into a string for xml
processing?

Thanks so much for your help!
Erik

Apr 5 '07 #1
Hi,

some weeks ago I wrote some code to extract attachments from emails.
It's not that long, so maybe it could be of help for you:

-------------------------------------------

#!/usr/bin/env python

import poplib
import email
import os
import sys
import string

#
# attsave.py
# Check emails at PROVIDER for attachments and save them to SAVEDIR.
#

PROVIDER = "pop.YourMailProvider.de"
USER = "YourUserName"
PASSWORD = "YourPassword"

SAVEDIR = "/home/YourUserDirectory"

def saveAttachment(mstring):

    filenames = []
    attachedcontents = []

    msg = email.message_from_string(mstring)

    for part in msg.walk():

        fn = part.get_filename()

        if fn is not None:
            filenames.append(fn)
            # Note: without decode=True the payload is still
            # transfer-encoded (e.g. base64 text), not raw bytes.
            attachedcontents.append(part.get_payload())

    for i in range(len(filenames)):
        fp = file(SAVEDIR + "/" + filenames[i], "wb")
        fp.write(attachedcontents[i])
        print 'Found and saved attachment "' + filenames[i] + '".'
        fp.close()

try:
    client = poplib.POP3(PROVIDER)
except:
    print "Error: Provider not found."
    sys.exit(1)

client.user(USER)
client.pass_(PASSWORD)

# anzahl_mails = number of mails in the mailbox
anzahl_mails = len(client.list()[1])

for i in range(anzahl_mails):
    lines = client.retr(i + 1)[1]
    mailstring = string.join(lines, "\n")
    saveAttachment(mailstring)

client.quit()

-------------------------------------------

See you

H.
Apr 6 '07 #2
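
Editor's sketch for question (a), getting the message body into a string, which the script above does not cover. This is a minimal Python 3 example, not code from any poster. It assumes the body is the first text/plain part that has no filename. The function name first_text_body and the sample message are hypothetical and are built locally so that no mail server is needed.

```python
import email
from email.message import EmailMessage


def first_text_body(raw_message):
    """Return the first text/plain part that is not an attachment, or None."""
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue  # skips multipart containers and non-text parts
        if part.get_filename() is not None:
            continue  # a text file sent as an attachment, not the body
        raw = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or "utf-8"  # assumed fallback
        return raw.decode(charset, errors="replace")
    return None


# Hypothetical sample message standing in for one retrieved via POP3.
sample = EmailMessage()
sample["Subject"] = "report"
sample.set_content("See the attached file.")
sample.add_attachment(b"dummy bytes", maintype="application",
                      subtype="octet-stream", filename="data.bin")

body = first_text_body(sample.as_string())
print(body)
```

Calling get_payload(decode=True) on the top-level message of a multipart mail returns None, which is probably why the print in the original question showed nothing useful. Walking the parts avoids that.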
Thanks H!

I'm now able to get the name of the zip file, and the contents (is it
still encoded?).

I now need to be able to unzip the zip file into a string and get the
body of the email into a string.

Here is my updated code:
from poplib import POP3
import email
import StringIO
import zipfile

p = POP3("mail.**********.com")
print p.getwelcome()
# authentication, etc.
print p.user("USER")
print p.pass_("PASS")
print "This mailbox has %d messages, totaling %d bytes." % p.stat()
msg_list = p.list()
print msg_list
if not msg_list[0].startswith('+OK'):
    # Handle error in listings
    exit(1)

for msg in msg_list[1]:
    msg_num, _ = msg.split()
    resp = p.retr(msg_num)
    if resp[0].startswith('+OK'):
        #print resp, '=======================\n'
        parsed_msg = email.message_from_string('\n'.join(resp[1]))
        for part in parsed_msg.walk():
            fn = part.get_filename()
            if fn is not None:
                fileObj = StringIO.StringIO()
                fileObj.write( part.get_payload() )
                #attachment = zlib.decompress(part.get_payload())
                #print zipfile.is_zipfile(fileObj)
                attachment = zipfile.ZipFile(fileObj)
                print fn, '\n', attachment
        payload = parsed_msg.get_payload(decode=True)
        print payload

    else:
        pass  # Deal with error retrieving message.

I get this error:

Traceback (most recent call last):
  File "wa.py", line 208, in <module>
    attachment = zipfile.ZipFile(fileObj)
  File "/usr/lib/python2.5/zipfile.py", line 346, in __init__
    self._GetContents()
  File "/usr/lib/python2.5/zipfile.py", line 366, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.5/zipfile.py", line 378, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file

Is the zip file still encoded? Or am I passing in the wrong arguments
to the zipfile module?

Thanks for your help!
Erik

Apr 6 '07 #3

erikcw wrote:

> resp = p.retr(msg_num)
> if resp[0].startswith('+OK'):

You don't have to check this; errors are transformed into exceptions.

> fileObj = StringIO.StringIO()

cStringIO is faster.

> fileObj.write( part.get_payload() )

You have to reset the file pointer to the beginning with fileObj.seek(0);
otherwise ZipFile will not be able to read the contents.

--
Gabriel Genellina

Apr 6 '07 #4
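
Editor's sketch of the file-pointer behaviour described above, shown in isolation. It uses Python 3, where io.BytesIO plays the role that StringIO/cStringIO played for binary data in Python 2. The bytes written are made-up placeholder data, not a real archive.

```python
import io

buf = io.BytesIO()
buf.write(b"PK\x03\x04 placeholder bytes, not a real archive")

# After write() the position sits at the end of the buffer,
# so a plain read() finds nothing left to return.
at_end = buf.read()

# Rewinding to offset 0 makes the whole content readable again.
buf.seek(0)
from_start = buf.read()

print(repr(at_end))
print(len(from_start))
```

On the first remark in the post: poplib reports -ERR responses by raising poplib.error_proto, so wrapping the calls in try/except is the usual alternative to testing for '+OK'.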
Hi Gabriel,

I added fileObj.seek(0) on the line directly after
fileObj.write( part.get_payload() ) and I'm still getting the
following error.

Traceback (most recent call last):
  File "wa.py", line 209, in <module>
    attachment = zipfile.ZipFile(fileObj)
  File "/usr/lib/python2.5/zipfile.py", line 346, in __init__
    self._GetContents()
  File "/usr/lib/python2.5/zipfile.py", line 366, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.5/zipfile.py", line 378, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file

Could the file-like object still be encoded in MIME or something?

Thanks!
Erik

Apr 6 '07 #5
> Could the file-like object still be encoded in MIME or something?

Yes it is. You don't need to seek(0).
Try this:

decoded = email.base64mime.decode(part.get_payload())
fileObj.write(decoded)

-Basilisk96

Apr 7 '07 #6
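
Editor's sketch of why the undecoded payload is rejected as "not a zip file". It is written for Python 3 and uses the standard base64 module in place of email.base64mime. The archive, its member name data.xml, and the XML content are hypothetical test data built in memory.

```python
import base64
import io
import zipfile

# Build a small zip archive in memory (hypothetical name and content).
archive = io.BytesIO()
with zipfile.ZipFile(archive, "w") as zf:
    zf.writestr("data.xml", "<root><item>1</item></root>")
raw_zip = archive.getvalue()

# Roughly what get_payload() hands back without decode=True:
# the base64 text of the attachment, not the zip bytes themselves.
encoded = base64.encodebytes(raw_zip)

encoded_is_zip = zipfile.is_zipfile(io.BytesIO(encoded))
decoded_is_zip = zipfile.is_zipfile(io.BytesIO(base64.b64decode(encoded)))

print(encoded_is_zip)
print(decoded_is_zip)
```

The check only confirms that the base64 text has to be decoded before zipfile can recognise the archive; it does not read any member.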

Basilisk96 wrote:

> > Could the file like object still be encoded in MIME or something?
>
> Yes it is. You don't need to seek(0).
> Try this:
>
> decoded = email.base64mime.decode(part.get_payload())
> fileObj.write(decoded)

Or better:

decoded = part.get_payload(decode=True)
fileObj.write(decoded)
fileObj.seek(0)
zip = zipfile.ZipFile(fileObj)
zip.printdir()

Apr 7 '07 #8
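
Editor's sketch putting the thread's answer together for question (b): decode the attachment, open it as a zip, and read each member into a string. It is a Python 3 adaptation, not the posters' code. The function name zip_attachments_as_text, the sample file names, and the assumption that the XML is UTF-8 are all hypothetical. The sample message is built locally in place of one fetched over POP3.

```python
import email
import io
import zipfile
from email.message import EmailMessage


def zip_attachments_as_text(raw_message):
    """Map 'attachment name/member name' to the member's text content."""
    contents = {}
    msg = email.message_from_string(raw_message)
    for part in msg.walk():
        filename = part.get_filename()
        if filename is None or not filename.lower().endswith(".zip"):
            continue
        data = part.get_payload(decode=True)  # undoes base64, returns bytes
        # BytesIO(data) starts at offset 0, so no explicit seek is needed.
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for member in archive.namelist():
                text = archive.read(member).decode("utf-8")  # assumed encoding
                contents[filename + "/" + member] = text
    return contents


# Hypothetical test data: a zip holding one XML file, attached to a mail.
zipped = io.BytesIO()
with zipfile.ZipFile(zipped, "w") as zf:
    zf.writestr("data.xml", "<root><item>1</item></root>")

sample = EmailMessage()
sample["Subject"] = "daily report"
sample.set_content("Report attached.")
sample.add_attachment(zipped.getvalue(), maintype="application",
                      subtype="zip", filename="report.zip")

result = zip_attachments_as_text(sample.as_string())
print(result)
```

In Python 3, poplib returns the message lines as bytes, so with a real mailbox the parsing step would more likely be email.message_from_bytes(b"\r\n".join(lines)) than message_from_string.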
