
Outlook MSG file reading

Hello everyone,

I am attempting to extract some header information from typical Microsoft
Outlook MSG files in VB.NET. I am not after a complete message or
attachments that may be enclosed. I am particularly interested in the
Message ID field. I have examined MSG files in Notepad and hex editors. I
can see that the Internet Headers are present. I can search
for Message-ID and locate it without any problems in Notepad. The only
oddity I have seen so far is that each letter is followed by the hex
byte 00. Thus the Message-ID string actually appears as M e s s a g e -
I D.
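
A zero byte after every letter is what UTF-16LE text looks like when viewed one byte at a time: each ASCII character takes two bytes, and the second is 00. A minimal Python sketch of that observation follows; the sample bytes are made up for illustration and are not taken from a real MSG file.

```python
# Hypothetical sample: the header name as it would appear in a hex editor.
raw = b"M\x00e\x00s\x00s\x00a\x00g\x00e\x00-\x00I\x00D\x00"

# Decoding as UTF-16 little-endian turns each 2-byte unit into one character.
text = raw.decode("utf-16-le")
print(text)

# Encoding the plain string the same way reproduces the interleaved zero bytes.
assert "Message-ID".encode("utf-16-le") == raw
```

If this assumption holds, the "spaces" between the letters are not part of the text; they go away once the bytes are decoded with the right encoding.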

I don't want to use Outlook automation. I have found it to be cumbersome
and slow. I also don't want to be reliant on an installation of Office.

Since the file is binary, I have attempted to use the System.IO.FileStream
object to read the file. However, I have
not been able to successfully walk through the file and obtain any readable
text. I have played around with various encodings, such as ASCII and
Unicode. I think that MSG files are BASE64/Mime encoded though. Perhaps
that could be part of my trouble.

I have downloaded several example applications that mimic Notepad. However,
none of them have been able to read the encoding of MSG files. I have
gained a new level of appreciation for Notepad :). I wonder what it is that
Notepad uses to detect the file encoding and display it in such a readable
way.

Does anyone have any experience with reading Outlook data? Again, I am not
after pretty formatting, I just want to extract certain text fragments from
these binary files. Can someone point me in the right direction? I would
think that I just need to be able to read the byte stream from the file with the
correct encoding and convert it to ASCII text. I have been totally
unsuccessful so far.
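
One rough way to try the "read the bytes and decode them" idea is to scan the raw file for the header name encoded as UTF-16LE and decode what follows up to the end of the line. The sketch below is in Python and rests on assumptions that may not hold for every file: that the header text is stored as UTF-16LE, that the line sits contiguously in the file (a container format may split a stream across non-adjacent blocks), and that the header name is spelled exactly `Message-ID:`. The function name `find_message_id_raw` and the sample data are hypothetical.

```python
# Byte pattern for the header name as UTF-16LE (ASCII letters interleaved with 0x00).
NEEDLE = "Message-ID:".encode("utf-16-le")

def find_message_id_raw(data: bytes):
    """Return the text after 'Message-ID:' up to the end of the line, or None.

    Assumes the header line is stored contiguously as UTF-16LE and that the
    header name uses exactly this capitalization.
    """
    start = data.find(NEEDLE)
    if start < 0:
        return None
    start += len(NEEDLE)
    # Walk 2-byte code units until a CR or LF unit, or the end of the data.
    end = start
    while end + 1 < len(data):
        if data[end:end + 2] in (b"\r\x00", b"\n\x00"):
            break
        end += 2
    return data[start:end].decode("utf-16-le", errors="replace").strip()

# Synthetic stand-in for file contents: two binary bytes, then UTF-16LE header text.
sample = b"\x01\x02" + "Message-ID: <abc@example.com>\r\nSubject: hi\r\n".encode("utf-16-le")
print(find_message_id_raw(sample))
```

For a real file the bytes would come from `open(path, "rb").read()`. A raw scan like this is only a quick check; reading the proper stream out of the container is the more dependable route.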

Thanks,
Dmitry
Apr 14 '06 #1
7 Replies


"Dmitry Akselrod" <dm****@nospam.com> wrote in message
news:hK********************@comcast.com...
Does anyone have any experience with reading Outlook data? Again, I am
not after pretty formatting, I just want to extract certain text fragments
from these binary files. Can someone point me in the right direction? I
would think that I just need to be able to read the byte stream from the file
with the correct encoding and convert it to ASCII text. I have been
totally unsuccessful so far.


Outlook can be automated, just like Word, Excel etc. It's a bit cranky, but
I have done it. Have you tried adding a reference to it?

Apr 14 '06 #2

Hi,

My whole point is that I don't want to automate Outlook. It's very
clunky. I need to be able to process millions of MSG files and Office
products (e.g. Access) suck with that many files.

Thank you though.

dmitry

Apr 15 '06 #3

"Dmitry Akselrod" <dm****@nospam.com> wrote in message
news:27******************************@comcast.com...
My whole point is that I don't want to automate Outlook. It's very
clunky. I need to be able to process millions of MSG files and Office
products (e.g. Access) suck with that many files.


In that case I'd start searching for third party tools. I assume that MSFT
aren't offering to divulge the details of the format.
Apr 15 '06 #4

Check out the Redemption COM object:

http://www.dimastr.com/redemption/

--
Texeme Textcasting powers
http://www.you-read-it-here-first.com
Apr 15 '06 #5

No, MS is definitely not documenting their MSG format. I did find this
article:

http://www.msusenet.com/archive/topic.php/t-288764.html

A gentleman named Eduardo A. Morcillo has developed some .NET classes that
wrap the Office OLE storage. They are pretty good so far. The classes are
here:

http://www.mvps.org/emorcillo/en/code/grl/storage.shtml

I have been able to take a couple of MSG files and obtain a list of streams
(properties) and their values. However, I am still missing the Internet
Headers. They must lie somewhere else in the file. All of this is quite
annoying, thanks to Microsoft.

The only known working API I have seen so far (used by many forensic
applications) is from Fookes Software. These guys are great and their tools
are phenomenal, but the API is a little outside my price range.

Being able to obtain the Sender, Recipient, Subject, etc. is definitely a
plus, but I need the Message ID. I guess it's back to more research.

Dmitry
Basically, the MSG file format is a series of binary streams.

Apr 15 '06 #6

Actually, never mind about the Internet Headers; they are there. They happen
to be in the stream __substg1.0_007D001F. I just had some issues with data
formatting and conversion. I think that my problem is solved, thanks to Mr.
Morcillo.
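
The stream name itself hints at how to convert the data. In the usual MAPI naming convention (an assumption here, not something spelled out in this thread), the eight hex digits after `__substg1.0_` are a 4-digit property ID followed by a 4-digit property type; 007D is the transport message headers property and 001F marks a UTF-16LE string, while 001E marks an 8-bit string. The Python sketch below assumes the stream bytes have already been read with some storage reader (not shown); the function names and sample payload are hypothetical.

```python
from email.parser import HeaderParser

# Property types as they appear in the last four hex digits of the stream name.
PT_STRING8 = 0x001E  # 8-bit string
PT_UNICODE = 0x001F  # UTF-16LE string

def split_substg_name(name: str):
    """Split '__substg1.0_PPPPTTTT' into (property id, property type)."""
    prefix = "__substg1.0_"
    if not name.startswith(prefix) or len(name) != len(prefix) + 8:
        raise ValueError("unexpected stream name: " + name)
    hex_part = name[len(prefix):]
    return int(hex_part[:4], 16), int(hex_part[4:], 16)

def message_id_from_stream(name: str, payload: bytes):
    """Decode a headers stream and return its Message-ID header, or None."""
    _prop_id, prop_type = split_substg_name(name)
    if prop_type == PT_UNICODE:
        text = payload.decode("utf-16-le")
    elif prop_type == PT_STRING8:
        # Assumption: the real code page is unknown; latin-1 never fails to decode.
        text = payload.decode("latin-1")
    else:
        raise ValueError("not a string property")
    # HeaderParser looks header names up case-insensitively.
    return HeaderParser().parsestr(text)["Message-ID"]

# Made-up payload standing in for the bytes of the headers stream.
payload = "Received: from x\r\nMessage-ID: <abc@example.com>\r\nSubject: hi\r\n\r\n".encode("utf-16-le")
print(message_id_from_stream("__substg1.0_007D001F", payload))
```

Parsing the decoded text as mail headers, rather than searching for a substring, also copes with header names written as `Message-Id` or `message-id`.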

dmitry

Apr 15 '06 #7

MSG Files:

Priasoft has several MSG related products. http://www.priasoft.com

In particular, they have an MSG file parsing library that gives
developers access to all the properties of a .msg file, without
Outlook.

They also have a viewer product that looks very similar to Outlook 2003
with regard to the user interface. The viewer can view, search, print,
and export MSG files.

Regards,
the MSG Guru, Eriq VanBibber

Apr 21 '06 #8
