Bytes | Software Development & Data Engineering Community

How do I Extract Attachment from Newsgroup Message

I'm parsing NNTP messages that have XML file attachments. How can I
extract the encoded text back into a file? I looked for a solution
with mimetools (the way I'd approach it for email) but found nothing.

Here's a long snippet of the message:
>>> n.article('116431')
('220 116431 <D8*******@news.ap.org> article', '116431',
'<D8*******@news.ap.org>', ['MIME-Version: 1.0', 'Message-ID:
<D8*******@news.ap.org>', 'Content-Type: Multipart/Mixed;', '
boundary="------------Boundary-00=_A5NJCP3FX6Y5BI3BH890"', 'Date: Thu,
24 May 2007 07:41:34 -0400 (EDT)', 'From: Newsclip <ne******@ap.org>',
'Path: newsclip.ap.org!flounder.ap.org!flounder', 'Newsgroups:
ap.spanish.online,ap.spanish.online.business', 'Keywords: MUN ECO
PETROLEO PRECIOS', 'Subject: MUN ECO PETROLEO PRECIOS', 'Summary: ',
'Lines: 108', 'Xref: newsclip.ap.org ap.spanish.online:938298
ap.spanish.online.business:116431', '', '',
'--------------Boundary-00=_A5NJCP3FX6Y5BI3BH890',
'Content-Type: Text/Plain',
'Content-Transfer-Encoding: 8bit', 'Content-Description: text,
unencoded', '', '(AP) Precios del crudo se mueven sin rumbo claro',
'Por GEORGE JAHN', 'VIENA', 'Los precios

... (truncated for length) ...

'', '___', '', 'Editores: Derrick Ho, periodista de la AP en Singapur,
contribuy\xf3 con esta informaci\xf3n. ', '', '',
'--------------Boundary-00=_A5NJCP3FX6Y5BI3BH890',
'Content-Type: Text/Xml', 'Content-Transfer-Encoding: base64',
'Content-Description: text, base64 encoded', '',
'PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4KPCFET0NUWVBFIG5pdGYgU1lT',
'VEVNICJuaXRmLmR0ZCI+CjxuaXRmPgogPGhlYWQ+CiAgPG1ldGEgbmFtZT0iYXAtdHJhbnNyZWYi',
'IGNvbnRlbnQ9IlNQMTQ3MiIvPgogIDxtZXRhIG5hbWU9ImFwLW9yaWdpbiIgY29udGVudD0ic3Bh',
'bm9sIi8+CiAgPG1ldGEgbmFtZT0iYXAtc2VsZWN0b3IiIGNvbn

May 31 '07 #1
On May 31, 8:54 am, "snewma...@gmail.com" <snewma...@gmail.com> wrote:
> I'm parsing NNTP messages that have XML file attachments. How can I
> extract the encoded text back into a file? I looked for a solution
> with mimetools (the way I'd approach it for email) but found nothing.
>
> Here's a long snippet of the message:
> >>> n.article('116431')
> [...]

This looks like what you might be looking for:
http://mail.python.org/pipermail/pyt...ne/265018.html

Not sure if you'll need this or not, but here's some info on encoding/
decoding files:
http://www.jorendorff.com/articles/unicode/python.html

There are lots of ways to parse XML. I use the minidom module myself.
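
For illustration, here is a minimal minidom sketch (Python 3). The XML below is a hypothetical, shortened stand-in for the decoded attachment, and `meta_values` is just an illustrative helper name, not something from the thread:

```python
from xml.dom import minidom

# Hypothetical, shortened stand-in for the decoded XML attachment.
xml_bytes = (
    b'<?xml version="1.0" encoding="UTF-8"?>\n'
    b'<nitf><head>'
    b'<meta name="ap-transref" content="SP1472"/>'
    b'<meta name="ap-origin" content="spanol"/>'
    b'</head></nitf>'
)

def meta_values(data):
    """Map each <meta> element's name attribute to its content attribute."""
    doc = minidom.parseString(data)  # accepts bytes or str
    return {
        meta.getAttribute('name'): meta.getAttribute('content')
        for meta in doc.getElementsByTagName('meta')
    }

metas = meta_values(xml_bytes)
```

Passing the raw bytes lets the parser honor the encoding declared in the XML prolog, so no manual decoding step should be needed first.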

Mike

May 31 '07 #2
> I looked for a solution
> with mimetools (the way I'd approach it for email) but found nothing.
> ...
> <D8*******@news.ap.org>', 'Content-Type: Multipart/Mixed;', '
> boundary="------------Boundary-00=_A5NJCP3FX6Y5BI3BH890"', 'Date: Thu,
> ...

Playing with

data = n.article('116431')[3]

and email.message_from_string, there seems to be a problem with the
Content-Type header being split across two lines. I was able to get a
multipart message by using

msg = email.message_from_string('\n'.join(data).replace(';\n', ';'))

(and adding an ending boundary to your sample data).
This is a bit hackish and could cause problems if any line in the
message body ends with a semicolon (no warranties expressed or
implied, etc.)
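
To go from that multipart message to the decoded attachment, something along these lines might work. It is only a sketch, written for Python 3, and it assumes the article lines are already str; `extract_base64_parts` and `sample` are hypothetical names, and `sample` is a shortened stand-in for the real article with the ending boundary added:

```python
import email

def extract_base64_parts(lines):
    """Return (content_type, decoded_bytes) for each base64-encoded part.

    `lines` is assumed to be the header and body lines of one article
    as a list of str, like n.article('116431')[3] above.
    """
    # Same workaround as above: pull the boundary parameter back onto
    # the Content-Type line before parsing.
    text = '\n'.join(lines).replace(';\n', ';')
    msg = email.message_from_string(text)
    parts = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container only; the leaf parts carry the payloads
        if part.get('Content-Transfer-Encoding', '').lower() == 'base64':
            # decode=True undoes the transfer encoding and returns bytes
            parts.append((part.get_content_type(),
                          part.get_payload(decode=True)))
    return parts

# Hypothetical, shortened stand-in for the article lines.
sample = [
    'MIME-Version: 1.0',
    'Content-Type: Multipart/Mixed;',
    ' boundary="------------Boundary-00=_A5NJCP3FX6Y5BI3BH890"',
    'Subject: MUN ECO PETROLEO PRECIOS',
    '',
    '--------------Boundary-00=_A5NJCP3FX6Y5BI3BH890',
    'Content-Type: Text/Plain',
    'Content-Transfer-Encoding: 8bit',
    '',
    '(AP) Precios del crudo se mueven sin rumbo claro',
    '',
    '--------------Boundary-00=_A5NJCP3FX6Y5BI3BH890',
    'Content-Type: Text/Xml',
    'Content-Transfer-Encoding: base64',
    '',
    'PD94bWwgdmVyc2lvbj0iMS4wIiBlbmNvZGluZz0iVVRGLTgiPz4K',
    '',
    '--------------Boundary-00=_A5NJCP3FX6Y5BI3BH890--',
]

parts = extract_base64_parts(sample)
```

Each entry in `parts` holds the part's content type and its raw bytes, so saving the attachment is a matter of opening a file in 'wb' mode and writing the bytes out.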

Hope this helps,
-Dave
May 31 '07 #3
