Extract zip file from email attachment

Hi all,

I'm trying to extract a zip file (containing an xml file) from an email
so I can process it. But I'm running up against some brick walls.
I've been googling and reading all afternoon, and can't seem to figure
it out.

Here is what I have so far.

p = POP3("mail.server.com")
print p.getwelcome()
# authentication, etc.
print p.user("USER")
print p.pass_("PASS")
print "This mailbox has %d messages, totaling %d bytes." % p.stat()
msg_list = p.list()
print msg_list
if not msg_list[0].startswith('+OK'):
    # Handle error
    exit(1)

for msg in msg_list[1]:
    msg_num, _ = msg.split()
    resp = p.retr(msg_num)
    if resp[0].startswith('+OK'):
        #print resp, '=======================\n'
        #extract message body and attachment.
        parsed_msg = email.message_from_string('\n'.join(resp[1]))
        payload = parsed_msg.get_payload(decode=True)
        print payload  #doesn't seem to work
    else:
        pass  # Deal with error retrieving message.

How do I:
a) retrieve the body of the email into a string so I can do some
processing? (I can get at the header attributes without any trouble)
b) retrieve the zip file attachment, and unzip into a string for xml
processing?

Thanks so much for your help!
Erik

Apr 5 '07 #1
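
On question (a): for a multipart message, `get_payload(decode=True)` returns `None`, which would explain why the `print payload` line shows nothing useful once an attachment is present. Below is a minimal Python 3 sketch of pulling the text body out by walking the parts. The helper name `get_text_body` and the sample message are hypothetical, and the sketch assumes the body is the first `text/plain` part that has no filename.

```python
import email
from email.message import EmailMessage


def get_text_body(msg):
    """Return the first non-attachment text/plain part as a str, or None."""
    for part in msg.walk():
        if part.get_content_type() != "text/plain":
            continue
        if part.get_filename() is not None:
            continue  # a text file sent as an attachment, not the body
        raw = part.get_payload(decode=True)  # bytes, transfer encoding undone
        charset = part.get_content_charset() or "utf-8"
        return raw.decode(charset, errors="replace")
    return None


# Build a small multipart message so the sketch runs without a mail server.
sample = EmailMessage()
sample["Subject"] = "report"
sample.set_content("Please find the report attached.\n")
sample.add_attachment(b"placeholder attachment bytes",
                      maintype="application", subtype="zip",
                      filename="report.zip")

parsed = email.message_from_bytes(sample.as_bytes())
body = get_text_body(parsed)
multipart_payload = parsed.get_payload(decode=True)  # None for multipart
print(repr(body))
```

Header attributes are available on the top-level message, but the body text sits in one of the sub-parts, so it has to be looked up per part as above.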
Hi,

some weeks ago I wrote some code to extract attachments from emails.
It's not that long, so maybe it could be of help for you:

-------------------------------------------

#!/usr/bin/env python

import poplib
import email
import os
import sys
import string

#
# attsave.py
# Check emails at PROVIDER for attachments and save them to SAVEDIR.
#

PROVIDER = "pop.YourMailProvider.de"
USER = "YourUserName"
PASSWORD = "YourPassword"

SAVEDIR = "/home/YourUserDirectory"


def saveAttachment(mstring):

    filenames = []
    attachedcontents = []

    msg = email.message_from_string(mstring)

    for part in msg.walk():

        fn = part.get_filename()

        if fn is not None:
            filenames.append(fn)
            attachedcontents.append(part.get_payload())

    for i in range(len(filenames)):
        fp = file(SAVEDIR + "/" + filenames[i], "wb")
        fp.write(attachedcontents[i])
        print 'Found and saved attachment "' + filenames[i] + '".'
        fp.close()

try:
    client = poplib.POP3(PROVIDER)
except:
    print "Error: Provider not found."
    sys.exit(1)

client.user(USER)
client.pass_(PASSWORD)

anzahl_mails = len(client.list()[1])

for i in range(anzahl_mails):
    lines = client.retr(i + 1)[1]
    mailstring = string.join(lines, "\n")
    saveAttachment(mailstring)

client.quit()

-------------------------------------------

See you

H.
Apr 6 '07 #2
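
The script in this post is Python 2 (`print` statements, `file()`, `string.join`). A rough Python 3 sketch of the same retrieve-and-parse loop follows. In Python 3, `retr()` returns the message lines as `bytes`, so they are joined with `b"\n"` and parsed with `email.message_from_bytes`. The function names, the placeholder address, and the choice of `poplib.POP3_SSL` are assumptions, not part of the original script.

```python
import email
import poplib


def message_from_retr_lines(lines):
    """Turn the list of byte lines returned by POP3.retr() into a Message."""
    return email.message_from_bytes(b"\n".join(lines))


def fetch_messages(host, user, password):
    """Yield every message in the mailbox (needs a reachable POP3 server)."""
    client = poplib.POP3_SSL(host)  # assumption: the server offers POP3 over SSL
    try:
        client.user(user)
        client.pass_(password)
        count, _size = client.stat()
        for number in range(1, count + 1):
            _resp, lines, _octets = client.retr(number)
            yield message_from_retr_lines(lines)
    finally:
        client.quit()


# Offline check with lines shaped like the ones retr() hands back.
demo = message_from_retr_lines([
    b"From: erik@example.com",
    b"Subject: test",
    b"",
    b"hello",
])
print(demo["Subject"], demo.get_payload())
```

Server-side failures surface as `poplib.error_proto` exceptions, so the loop does not need to inspect the `+OK` response text itself.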
On Apr 5, 8:00 pm, hlubenow <hluben...@gmx.net> wrote:
> some weeks ago I wrote some code to extract attachments from emails.
> It's not that long, so maybe it could be of help for you:

Thanks H!

I'm now able to get the name of the zip file, and the contents (is it
still encoded?).

I now need to be able to unzip the zip file into a string and get the
body of the email into a string.

Here is my updated code:
p = POP3("mail.**********.com")
print p.getwelcome()
# authentication, etc.
print p.user("USER")
print p.pass_("PASS")
print "This mailbox has %d messages, totaling %d bytes." % p.stat()
msg_list = p.list()
print msg_list
if not msg_list[0].startswith('+OK'):
    # Handle error in listings
    exit(1)

for msg in msg_list[1]:
    msg_num, _ = msg.split()
    resp = p.retr(msg_num)
    if resp[0].startswith('+OK'):
        #print resp, '=======================\n'
        parsed_msg = email.message_from_string('\n'.join(resp[1]))
        for part in parsed_msg.walk():
            fn = part.get_filename()
            if fn is not None:
                fileObj = StringIO.StringIO()
                fileObj.write( part.get_payload() )
                #attachment = zlib.decompress(part.get_payload())
                #print zipfile.is_zipfile(fileObj)
                attachment = zipfile.ZipFile(fileObj)
                print fn, '\n', attachment
        payload = parsed_msg.get_payload(decode=True)
        print payload

    else:
        pass  # Deal with error retrieving message.

I get this error:
Traceback (most recent call last):
  File "wa.py", line 208, in <module>
    attachment = zipfile.ZipFile(fileObj)
  File "/usr/lib/python2.5/zipfile.py", line 346, in __init__
    self._GetContents()
  File "/usr/lib/python2.5/zipfile.py", line 366, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.5/zipfile.py", line 378, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file

Is the zip file still encoded? Or am I passing in the wrong arguments
to the zipfile module?

Thanks for your help!
Erik

Apr 6 '07 #3

erikcw wrote:

> resp = p.retr(msg_num)
> if resp[0].startswith('+OK'):

You don't have to check this; errors are transformed into exceptions.

> fileObj = StringIO.StringIO()

cStringIO is faster.

> fileObj.write( part.get_payload() )

You have to reset the file pointer to the beginning: fileObj.seek(0),
else ZipFile will not be able to read the contents.

--
Gabriel Genellina

Apr 6 '07 #4
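
To illustrate the file-pointer point: after a `write()`, an in-memory file is positioned at its end, so a plain `read()` returns nothing until `seek(0)`. This Python 3 sketch uses `io.BytesIO`, the Python 3 counterpart of `StringIO`/`cStringIO` for binary data; the archive contents are made up for the demonstration. In current CPython, `zipfile.ZipFile` finds the archive directory by seeking from the end of the file on its own, so the explicit `seek(0)` is harmless but is probably not what decides whether the archive opens.

```python
import io
import zipfile

# Build a tiny zip archive in memory (stand-in for a decoded attachment).
source = io.BytesIO()
with zipfile.ZipFile(source, "w") as zf:
    zf.writestr("data.xml", "<root/>")
zip_bytes = source.getvalue()

file_obj = io.BytesIO()
file_obj.write(zip_bytes)

at_end = file_obj.read()        # pointer is at the end: nothing left to read
file_obj.seek(0)                # rewind to the beginning
from_start = file_obj.read()    # now the whole archive comes back

# ZipFile seeks within the file object itself when opening it for reading.
names = zipfile.ZipFile(file_obj).namelist()
print(len(at_end), from_start == zip_bytes, names)
```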
On Apr 6, 12:51 am, "Gabriel Genellina" <gagsl-...@yahoo.com.ar> wrote:
> You have to reset the file pointer to the beginning: fileObj.seek(0),
> else ZipFile will not be able to read the contents.

Hi Gabriel,

I added fileObj.seek(0) on the line directly after
fileObj.write( part.get_payload() ) and I'm still getting the
following error.

Traceback (most recent call last):
  File "wa.py", line 209, in <module>
    attachment = zipfile.ZipFile(fileObj)
  File "/usr/lib/python2.5/zipfile.py", line 346, in __init__
    self._GetContents()
  File "/usr/lib/python2.5/zipfile.py", line 366, in _GetContents
    self._RealGetContents()
  File "/usr/lib/python2.5/zipfile.py", line 378, in _RealGetContents
    raise BadZipfile, "File is not a zip file"
zipfile.BadZipfile: File is not a zip file

Could the file like object still be encoded in MIME or something?

Thanks!
Erik

Apr 6 '07 #5
> Could the file like object still be encoded in MIME or something?

Yes it is. You don't need to seek(0).
Try this:

decoded = email.base64mime.decode(part.get_payload())
fileObj.write(decoded)

-Basilisk96

Apr 7 '07 #6
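
The difference between the encoded and decoded payload is easy to see on a constructed message. In this Python 3 sketch (the attachment name and contents are invented), `get_payload()` hands back the base64 text exactly as it appears in the mail, while `get_payload(decode=True)` undoes the `Content-Transfer-Encoding` and returns the raw bytes, which for a zip file start with `PK`.

```python
import email
import io
import zipfile
from email.message import EmailMessage

# Make a real zip archive to attach.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("data.xml", "<root/>")
zip_bytes = buf.getvalue()

msg = EmailMessage()
msg.set_content("see attachment")
msg.add_attachment(zip_bytes, maintype="application", subtype="zip",
                   filename="data.zip")
parsed = email.message_from_bytes(msg.as_bytes())

part = [p for p in parsed.walk() if p.get_filename() == "data.zip"][0]
encoded = part.get_payload()             # str: base64 text, not a zip
decoded = part.get_payload(decode=True)  # bytes: the zip file itself

encoded_is_zip = zipfile.is_zipfile(io.BytesIO(encoded.encode("ascii")))
decoded_is_zip = zipfile.is_zipfile(io.BytesIO(decoded))
print(part["Content-Transfer-Encoding"], encoded_is_zip, decoded_is_zip)
```

Feeding the still-encoded text to `zipfile.ZipFile` is what produces the "File is not a zip file" error quoted earlier in the thread.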


Basilisk96 wrote:

>> Could the file like object still be encoded in MIME or something?
>
> Yes it is. You don't need to seek(0).
> Try this:
>
> decoded = email.base64mime.decode(part.get_payload())
> fileObj.write(decoded)

Or better:

decoded = part.get_payload(decode=True)
fileObj.write(decoded)
fileObj.seek(0)
zip = zipfile.ZipFile(fileObj)
zip.printdir()

Apr 7 '07 #8
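
Putting the thread's answer together, here is a Python 3 sketch that goes from raw message bytes to the XML text inside each zip attachment. The function name `xml_from_zip_attachments`, the `.zip`/`.xml` filename checks, and the UTF-8 decoding are assumptions for illustration; real mail may need a different charset or stricter checks.

```python
import email
import io
import zipfile
from email.message import EmailMessage


def xml_from_zip_attachments(raw_message):
    """Return {member_name: xml_text} for every .xml inside zip attachments."""
    msg = email.message_from_bytes(raw_message)
    found = {}
    for part in msg.walk():
        filename = part.get_filename()
        if filename is None or not filename.lower().endswith(".zip"):
            continue
        data = part.get_payload(decode=True)  # undo base64, get raw zip bytes
        with zipfile.ZipFile(io.BytesIO(data)) as archive:
            for name in archive.namelist():
                if name.lower().endswith(".xml"):
                    # assumption: the XML is UTF-8 encoded
                    found[name] = archive.read(name).decode("utf-8")
    return found


# Self-contained demo: build a message with a zipped XML file attached.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("order.xml", "<order id='1'/>")
demo = EmailMessage()
demo.set_content("order attached")
demo.add_attachment(buf.getvalue(), maintype="application", subtype="zip",
                    filename="order.zip")

result = xml_from_zip_attachments(demo.as_bytes())
print(result)
```

Each returned string can then be handed to an XML parser, for example `xml.etree.ElementTree.fromstring`.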
