Regexp and parsing .eml file in details problem

Hello,

Does anyone have an example of a regular expression for parsing .eml (email
message) files? I need to extract the headers, the HTML body, the plain-text
body, and any attachments.

I put together an example, but I'm not sure it's a good start:

^Message-ID: (?<messageid>.*)\nFrom: (?<from>.*)\nTo: (?<to>.*)\nSubject: (?<subject>.*)\nDate: (?<date>.*)\nMIME-Version: (?<mime>.*)\nContent-Type: (?<contenttype>.*)\n
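
A note on the pattern above: it only matches when these headers appear in exactly this order with nothing in between, `^` needs multi-line mode if the file starts with some other header (the test message below starts with Received:), and a folded value such as a Content-Type with its boundary on the next line would be captured only up to the first line break. A more tolerant approach is to match one header at a time, including its continuation lines. Here is a minimal sketch of that idea in Python (chosen only for illustration; Python spells named groups `(?P<name>...)`). The names `HEADER_RE` and `parse_headers` and the shortened sample are hypothetical, and this is not a complete RFC 5322 parser.

```python
import re

# One header field: a name, a colon, then the value plus any folded
# continuation lines (lines that start with a space or tab).
# HEADER_RE and parse_headers are illustrative names, not a library API.
HEADER_RE = re.compile(
    r"^(?P<name>[!-9;-~]+):[ \t]*(?P<value>.*(?:\r?\n[ \t]+.*)*)",
    re.MULTILINE,
)

def parse_headers(raw):
    """Return the headers of a raw message as {lowercased name: value}."""
    # The header block ends at the first empty line.
    head = re.split(r"\r?\n\r?\n", raw, maxsplit=1)[0]
    headers = {}
    for m in HEADER_RE.finditer(head):
        # Unfold continuation lines into one space-separated value.
        value = re.sub(r"\r?\n[ \t]+", " ", m.group("value")).strip()
        # Keep only the first occurrence of a repeated header.
        headers.setdefault(m.group("name").lower(), value)
    return headers

# Shortened, made-up sample in the shape of the test message.
sample = (
    "Received: from ([127.0.0.1]) with arachnoMail.NET Server;\n"
    "\tSat, 06 Mar 2004 05:19:49 -0800\n"
    "From: Test <test@example.invalid>\n"
    "Subject: bg_stripe.gif\n"
    "MIME-Version: 1.0\n"
    "Content-Type: multipart/mixed;\n"
    "\tboundary=\"----=_NextPart_000\"\n"
    "\n"
    "This is a multi-part message in MIME format.\n"
)
headers = parse_headers(sample)
print(headers["content-type"])
```

Because each header is matched independently, the order of the headers and the presence of extra ones (Received:, X-Priority: and so on) no longer matter.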

And the test message file was:
Received: from ([127.0.0.1]) with arachnoMail.NET Server; Sat, 06 Mar 2004
    05:19:49 -0800
Message-ID: <00****************************@test.vpn>
From: Test <te**@test.vpn>
To: Test <te**@test.vpn>
Subject: bg_stripe.gif
Date: Sat, 6 Mar 2004 12:24:06 +0200
MIME-Version: 1.0
Content-Type: multipart/mixed;
    boundary="----=_NextPart_000_0005_01C40375.EB4919F0"
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 6.00.2800.1158
X-MimeOLE: Produced By Microsoft MimeOLE V6.00.2800.1165

This is a multi-part message in MIME format.

------=_NextPart_000_0005_01C40375.EB4919F0
Content-Type: multipart/alternative;
    boundary="----=_NextPart_001_0006_01C40375.EB4919F0"

------=_NextPart_001_0006_01C40375.EB4919F0
Content-Type: text/plain;
    charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

bg_stripe.gif
------=_NextPart_001_0006_01C40375.EB4919F0
Content-Type: text/html;
    charset="iso-8859-1"
Content-Transfer-Encoding: quoted-printable

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML><HEAD>
<META http-equiv=3DContent-Type content=3D"text/html; =
charset=3Diso-8859-1">
<META content=3D"MSHTML 6.00.2800.1400" name=3DGENERATOR>
<STYLE></STYLE>
</HEAD>
<BODY bgColor=3D#ffffff>
<DIV>&nbsp;</DIV><BR>&nbsp;bg_stripe.gif</BODY></HTML>

------=_NextPart_001_0006_01C40375.EB4919F0--

------=_NextPart_000_0005_01C40375.EB4919F0
Content-Type: image/gif;
    name="bg_stripe.gif"
Content-Transfer-Encoding: base64
Content-Disposition: attachment;
    filename="bg_stripe.gif"

R0lGODlhBgALAIAAAABQn////yH5BAQUAP8ALAAAAAAGAAsAAAIKjI8Gy+0P40s0FAA7

------=_NextPart_000_0005_01C40375.EB4919F0--
Nov 20 '05 #1
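
For the bodies and attachments, a single regular expression is hard to get right: MIME parts nest (here a multipart/alternative inside a multipart/mixed), and each part has its own transfer encoding (quoted-printable, base64). A MIME parser is usually the more dependable tool. The sketch below uses Python's standard-library `email` package purely as an illustration of that approach; the asker's environment may well be different, and other platforms have their own MIME libraries. The function name `extract_parts` and the compact sample message are made up for the example.

```python
from email import policy
from email.parser import BytesParser

def extract_parts(raw):
    """Split raw .eml bytes into headers, text body, HTML body, attachments.

    extract_parts is an illustrative helper, not part of the email package.
    """
    msg = BytesParser(policy=policy.default).parsebytes(raw)
    text_part = msg.get_body(preferencelist=("plain",))
    html_part = msg.get_body(preferencelist=("html",))
    return {
        # dict() keeps one value per header name; repeated headers collapse.
        "headers": dict(msg.items()),
        # get_content() undoes quoted-printable/base64 and the charset.
        "text": text_part.get_content() if text_part else None,
        "html": html_part.get_content() if html_part else None,
        "attachments": [
            (part.get_filename(), part.get_content())
            for part in msg.iter_attachments()
        ],
    }

# Compact, made-up message with the same structure as the test file.
raw = (
    b"From: Test <test@example.invalid>\r\n"
    b"Subject: bg_stripe.gif\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="outer"\r\n'
    b"\r\n"
    b"--outer\r\n"
    b'Content-Type: multipart/alternative; boundary="inner"\r\n'
    b"\r\n"
    b"--inner\r\n"
    b'Content-Type: text/plain; charset="iso-8859-1"\r\n'
    b"\r\n"
    b"bg_stripe.gif\r\n"
    b"--inner\r\n"
    b'Content-Type: text/html; charset="iso-8859-1"\r\n'
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"<BODY bgColor=3D#ffffff>bg_stripe.gif</BODY>\r\n"
    b"--inner--\r\n"
    b"--outer\r\n"
    b'Content-Type: image/gif; name="bg_stripe.gif"\r\n'
    b"Content-Transfer-Encoding: base64\r\n"
    b'Content-Disposition: attachment; filename="bg_stripe.gif"\r\n'
    b"\r\n"
    b"R0lGODlh\r\n"
    b"--outer--\r\n"
)
parts = extract_parts(raw)
print(parts["attachments"])
```

To run this against a real file, read it in binary mode and pass the bytes to `extract_parts`. A regular expression can still be handy for a quick look at the top-level headers, but decoding and nesting are better left to the parser.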