Bytes | Software Development & Data Engineering Community
Outlook MSG file reading

Hello everyone,

I am attempting to extract some header information from typical Microsoft
Outlook MSG files in VB.NET. I am not after a complete message or
attachments that may be enclosed. I am particularly interested in the
Message ID field. I have examined MSG files in Notepad and hex editors. I
can see that the Internet Headers are present. I can search for Message-ID
and locate it without any problems in Notepad. The only display issue I
have seen so far is that each letter is followed by a hex 00 byte. Thus the
Message-ID string actually appears as M e s s a g e - I D.
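
A letter followed by a 00 byte is what UTF-16LE text looks like in a hex editor (the encoding .NET calls System.Text.Encoding.Unicode). The sketch below is in Python for brevity and only illustrates the idea: encode the search term as UTF-16LE, look for it in the raw bytes, and decode from there to the end of the line. The function names and the sample bytes are made up for illustration. A raw scan like this assumes the text is stored contiguously in the file, which may not hold for every MSG file.

```python
def find_utf16_text(data: bytes, needle: str) -> int:
    """Return the byte offset of `needle` stored as UTF-16LE, or -1 if absent."""
    return data.find(needle.encode("utf-16-le"))


def read_utf16_line(data: bytes, offset: int) -> str:
    """Decode UTF-16LE text from `offset` up to the next CR, LF, or NUL."""
    chars = []
    for i in range(offset, len(data) - 1, 2):
        # Decode one 2-byte code unit at a time (enough for ASCII header text).
        unit = data[i:i + 2].decode("utf-16-le", errors="replace")
        if unit in ("\r", "\n", "\x00"):
            break
        chars.append(unit)
    return "".join(chars)


# Hypothetical sample: binary padding around one UTF-16LE header line.
sample = (
    b"\x01\x02"
    + "Message-ID: <abc@example.com>\r\n".encode("utf-16-le")
    + b"\xff\xff"
)
pos = find_utf16_text(sample, "Message-ID")
header_line = read_utf16_line(sample, pos) if pos >= 0 else None
```

In VB.NET the equivalent building blocks would be File.ReadAllBytes and Encoding.Unicode.GetBytes / GetString.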

I don't want to use Outlook automation. I have found it to be cumbersome
and slow. I also don't want to be reliant on an installation of Office.

Since the file is binary, I have attempted to use the System.IO.FileStream
object to read the file. However, I have not been able to successfully walk
through the file and obtain any readable text. I have played around with
various encodings, such as ASCII and Unicode. I think that MSG files are
Base64/MIME encoded, though. Perhaps that could be part of my trouble.

I have downloaded several example applications that mimic Notepad. However,
none of them have been able to read the encoding of MSG files. I have
gained a new level of appreciation for Notepad :). I wonder what it is that
Notepad uses to detect the file encoding and display it in such a readable
way.

Does anyone have any experience with reading Outlook data? Again, I am not
after pretty formatting; I just want to extract certain text fragments from
these binary files. Can someone point me in the right direction? I would
think that I just need to be able to read a byte stream from the file with
the correct encoding and convert it to ASCII text. I have been totally
unsuccessful so far.

Thanks,
Dmitry
Apr 14 '06 #1

"Dmitry Akselrod" <dm****@nospam.com> wrote in message
news:hK********************@comcast.com...
Does anyone have any experience with reading Outlook data? Again, I am
not after pretty formatting, I just want to extract certain text fragments
from these binary files. Can someone point me in the right direction? I
would think that I just need to be able to read a byte stream from the file
with the correct encoding and convert it to ASCII text. I have been
totally unsuccessful so far.


Outlook can be automated, just like Word, Excel etc. It's a bit cranky, but
I have done it. Have you tried adding a reference to it?

Apr 14 '06 #2
Hi,

The whole point is that I don't want to automate Outlook. It's very
clunky. I need to be able to process millions of MSG files, and Office
products (e.g. Access) suck with that many files.

Thank you though.

dmitry

Apr 15 '06 #3

"Dmitry Akselrod" <dm****@nospam.com> wrote in message
news:27******************************@comcast.com...
The whole point is that I don't want to automate Outlook. It's very
clunky. I need to be able to process millions of MSG files, and Office
products (e.g. Access) suck with that many files.


In that case I'd start searching for third party tools. I assume that MSFT
aren't offering to divulge the details of the format.
Apr 15 '06 #4

Check out the Redemption COM object:

http://www.dimastr.com/redemption/


--
Texeme Textcasting powers
http://www.you-read-it-here-first.com
Apr 15 '06 #5
No, MS is definitely not documenting their MSG format. I did find this
article:

http://www.msusenet.com/archive/topic.php/t-288764.html

A gentleman named Eduardo A. Morcillo has developed some .NET classes that
wrap the Office OLE storage. They are pretty good so far. The classes are
here:

http://www.mvps.org/emorcillo/en/code/grl/storage.shtml

I have been able to take a couple of MSG files and obtain a list of streams
(properties) and their values. However, I am still missing the Internet
Headers. They must lie somewhere else in the file. All of this is quite
annoying, thanks to Microsoft.

The only working API I have seen so far (used by many forensic
applications) is from Fookes Software. These guys are great and their tools
are phenomenal, but the API is a little outside my price range.

Being able to obtain the Sender, Recipient, Subject, etc. is definitely a
plus, but I need the Message ID. I guess it's back to more research.

Basically, the MSG file format is a series of binary streams.

Dmitry
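
Since the streams live inside an OLE structured storage (compound file) container, a cheap sanity check before handing a file to any storage wrapper is to look at its first eight bytes. Compound files start with the signature D0 CF 11 E0 A1 B1 1A E1. The Python sketch below shows only that check. The function names are made up for illustration, and a matching signature only says the file is a compound file, not that it is a well-formed MSG.

```python
# Signature found at offset 0 of every OLE compound file.
CFB_SIGNATURE = bytes.fromhex("D0CF11E0A1B11AE1")


def looks_like_compound_file(header: bytes) -> bool:
    """Return True if `header` starts with the compound file signature."""
    return header[:8] == CFB_SIGNATURE


def is_compound_file(path: str) -> bool:
    """Read only the first 8 bytes of the file at `path` and check them."""
    with open(path, "rb") as f:
        return looks_like_compound_file(f.read(8))
```

Because only eight bytes are read per file, this kind of pre-filter stays cheap even over a very large number of files.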
Apr 15 '06 #6
Actually, never mind about the Internet Headers; they are there. They
happen to be in the stream __substg1.0_007D001F. I just had some issues
with data formatting and conversion. I think that my problem is solved,
thanks to Mr. Morcillo.
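
For anyone hitting the same formatting and conversion issue: the eight hex digits at the end of the stream name are a MAPI property tag. 007D is the transport message headers property and 001F is the Unicode string type, so the stream bytes are UTF-16LE. The Python sketch below assumes the raw stream bytes have already been read with a storage wrapper (not shown); it decodes them and pulls out Message-ID with the standard email parser. The helper names and the sample headers are made up for illustration, and the cp1252 fallback for 001E streams is a guess, since the real code page depends on the message.

```python
from email.parser import HeaderParser
from typing import Optional, Tuple

PT_STRING8 = 0x001E  # 8-bit string property type
PT_UNICODE = 0x001F  # UTF-16LE string property type


def parse_substg_name(name: str) -> Tuple[int, int]:
    """Split '__substg1.0_XXXXYYYY' into (property id, property type)."""
    prefix = "__substg1.0_"
    if not name.startswith(prefix) or len(name) != len(prefix) + 8:
        raise ValueError("not a property stream name: %r" % name)
    tag = int(name[len(prefix):], 16)
    return tag >> 16, tag & 0xFFFF


def decode_string_stream(raw: bytes, prop_type: int) -> str:
    """Decode the raw stream bytes according to the property type."""
    if prop_type == PT_UNICODE:
        return raw.decode("utf-16-le")
    if prop_type == PT_STRING8:
        # Assumption: Western code page; the actual one varies per message.
        return raw.decode("cp1252", errors="replace")
    raise ValueError("not a string property type: 0x%04X" % prop_type)


def message_id_from_headers(headers: str) -> Optional[str]:
    """Return the Message-ID header value, or None if it is missing."""
    value = HeaderParser().parsestr(headers)["Message-ID"]
    return None if value is None else str(value).strip()


# Hypothetical stream content standing in for bytes read from the MSG file.
raw = (
    "Received: from mail.example.com\r\n"
    "Message-ID: <abc@example.com>\r\n"
    "Subject: Hi\r\n\r\n"
).encode("utf-16-le")
prop_id, prop_type = parse_substg_name("__substg1.0_007D001F")
message_id = message_id_from_headers(decode_string_stream(raw, prop_type))
```

In VB.NET the decoding step corresponds to System.Text.Encoding.Unicode.GetString on the stream's byte array.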

dmitry

Apr 15 '06 #7
