Best way to export emails from Outlook PST files

I've been saving every email for more than 15 years. It is a great historical resource to be able to go back to. I've been using Outlook the entire time because I think it is an amazingly productive tool. Unfortunately, Outlook stores emails in PST files. These files are fragile (they get corrupted easily) and proprietary, so I want to find a better way to store and search my emails going into the distant future.

As the first step in this process, I wanted to export all the emails in all my PST files into a format that would be easy to use and search and that would still be readable for decades to come. I decided on RFC822 ("EML" files) and HTML.

I needed a way to export from PST to these formats. I wanted:

  1. The process to be easy and mostly automated, since I have more than 20 PST files to process.
  2. As much of the original data as possible kept intact (hopefully all of it). I specifically did not want to lose any data in the email headers or timestamps.
  3. The resulting files to be in a manageable directory structure. Some of my Outlook folders have 100,000+ messages in them, and Windows does not like file folders with that many files. Being able to break out subdirectories based on date would be nice.
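
To make the third point concrete, here is a minimal sketch of how already-exported EML files could be split into year/month subdirectories using each message's Date header. It is not taken from any of the tools reviewed below. The function names (`dated_subdir`, `sort_into_dated_dirs`), the `undated` fallback folder, and the `.eml` extension are all assumptions made for illustration. Files are copied rather than moved, so the original export stays untouched.

```python
import email.utils
import shutil
from email.parser import BytesHeaderParser
from pathlib import Path


def dated_subdir(eml_path: Path) -> str:
    """Return 'YYYY/MM' from the message's Date header, or 'undated'."""
    with eml_path.open("rb") as fh:
        headers = BytesHeaderParser().parse(fh)  # reads headers only
    raw_date = headers.get("Date")
    if raw_date is None:
        return "undated"
    try:
        when = email.utils.parsedate_to_datetime(str(raw_date))
    except (TypeError, ValueError):
        # Old or malformed Date headers end up in the fallback folder.
        return "undated"
    return f"{when.year:04d}/{when.month:02d}"


def sort_into_dated_dirs(src: Path, dest: Path) -> int:
    """Copy every .eml under src into dest/YYYY/MM/ and return the count."""
    copied = 0
    for eml in sorted(src.rglob("*.eml")):
        target_dir = dest / dated_subdir(eml)
        target_dir.mkdir(parents=True, exist_ok=True)
        target = target_dir / eml.name
        counter = 1
        while target.exists():
            # Avoid overwriting a different message with the same file name.
            target = target_dir / f"{eml.stem}_{counter}{eml.suffix}"
            counter += 1
        shutil.copy2(eml, target)  # copy2 also preserves file timestamps
        copied += 1
    return copied
```

Calling `sort_into_dated_dirs(Path("export"), Path("sorted"))` would leave a folder with 100,000 messages spread over one directory per month, which Windows Explorer handles far more comfortably. Both paths here are placeholders.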

What I did

To test the various solutions, I used one very old (1996) and one very new (2011) PST file and exported the first 25 emails from each. Then I checked the resulting files to see how well each solution worked.

Solutions

Outlook itself

 

Import Everything into an email archiving program

MailStore Home 5.0

MailStore Home

For almost everyone, this free program will be good enough. You can import millions of emails from your PST files and easily do simple searches.

It can also do a fairly complete export into standard RFC822 files, so you are not trapped in case you ever want to stop using it.

Unfortunately, it is missing two key features that I need:

  1. You cannot search for text inside the email headers. Most people don't care about the headers, but I do, mostly because I like to see where my spam comes from.
  2. You cannot search in groups of folders, only in a single specific folder or the entire database. Again, this will not bother most people, but I keep my spam in a folder called "spam" in every PST, and when I search I usually (but not always) do not want to see these results.

Sadly, both of these flaws would be trivial to fix, but there is not much I can do to get around them.
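
Outside of MailStore, both kinds of search are easy to script against a tree of exported RFC822 files. The sketch below is only an illustration of the idea, not a replacement for a real index. It assumes the export keeps one directory per Outlook folder and uses an `.eml` extension. The name `search_headers` and its parameters are made up for this example.

```python
from email.parser import BytesHeaderParser
from pathlib import Path


def search_headers(root, needle, skip_folders=("spam",)):
    """Return (path, header_name) for each message whose headers contain needle.

    Any message stored under a directory named in skip_folders is ignored,
    so results from "spam" folders can be left out of a search.
    """
    root = Path(root)
    needle = needle.lower()
    skip = {name.lower() for name in skip_folders}
    parser = BytesHeaderParser()  # parses headers only, not the body
    hits = []
    for eml in sorted(root.rglob("*.eml")):
        folder_parts = eml.relative_to(root).parts[:-1]
        if any(part.lower() in skip for part in folder_parts):
            continue
        with eml.open("rb") as fh:
            headers = parser.parse(fh)
        for name, value in headers.items():
            if needle in str(value).lower():
                hits.append((eml, name))
                break  # one hit per message is enough
    return hits
```

Passing `skip_folders=()` searches everything, including spam, which covers the "usually, but not always" case. This scans every file on each call, so it is only practical for small archives or one-off questions.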

SCAN 1.3

SCAN

Open source indexing engine. I just could not get this to work at all. I'd import a bunch of EML files, but nothing would show up in the index.

FAQ:

Q: But wait, what about all the messaging and sequencing and indexing?
A: This code is only the matching engine, which sits on top of plenty of other support software. All the networking and message sequencing code is written in hard code C and ASM. Any place you see INT97, that is a call down to this code. But that is just plumbing. This also depends on FoxPro's ISAM indexing engine, which uses a B*tree and is still impressive in its performance and reliability even 20 years later. Any place you see a SEEK, that is a call down to the FoxPro indexer.

UPDATES

1/2/2012: First published
