Encoding Problem When Importing from Outlook

User Help for Mozilla Thunderbird
cweinhofer
Posts: 8
Joined: August 27th, 2008, 7:53 am

Encoding Problem When Importing from Outlook

Post by cweinhofer »

I recently decided to switch from Outlook 2007 to TB. After getting everything set up, I used TB's import feature to bring over all my Outlook data. As I started looking through the messages, I realized that all of the messages that had Asian characters (in this case Japanese) were broken somehow. The characters mostly displayed as question marks. I tried a few things to fix the problem, but am still having issues. Below is what I tried and the partial results I got:

1. I switched to the Japanese font set under "Fonts & Encoding":
This didn't seem to fix anything.

2. I unchecked "Allow messages to use other fonts." and "Use fixed width font for plain text messages." I also set the default encoding for incoming and outgoing mail to "Unicode (UTF-8)":
This got some of the sender names and subject lines to display correctly.

3. After changing the above, I re-imported all my messages:
About the same as 2.

4. I manually changed the encoding on some of the messages until I found the right one:
This allowed me to display the content of some messages.
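
For anyone who wants to automate the trial and error in step 4, here is a minimal Python sketch. It assumes you can get at the raw bytes of a message body (for example, from a saved file); the function name and the candidate list are my own illustration, not anything TB does internally. Note that if the import has already replaced characters with literal question marks, no choice of encoding will bring them back.

```python
# Common Japanese mail encodings plus UTF-8 (an assumed, non-exhaustive list).
CANDIDATES = ("utf-8", "iso2022_jp", "shift_jis", "euc_jp")


def try_decodings(raw, candidates=CANDIDATES):
    """Return {encoding name: decoded text} for every candidate that decodes cleanly."""
    results = {}
    for name in candidates:
        try:
            results[name] = raw.decode(name)
        except UnicodeDecodeError:
            # These bytes are not valid in this encoding; skip it.
            continue
    return results


if __name__ == "__main__":
    sample = "日本語のテスト".encode("shift_jis")
    for name, text in try_decodings(sample).items():
        print(name, "->", text)
```

More than one candidate may decode without error, so the output still has to be checked by eye.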

I still cannot see the sender names on some messages. This is particularly odd because the majority of these messages are ones I sent from Outlook where my name was rendered in Japanese script. I know that sometimes an ISP or mail provider's server can turn non-ASCII characters into question marks on the trip through cyberspace, but these came straight from my system.
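
As background on the sender-name problem: non-ASCII names in mail headers are normally stored as RFC 2047 "encoded words" that name their own charset. The sketch below uses Python's standard email.header module to show what such a header looks like and how it decodes. The helper name and the sample name are hypothetical, and this only illustrates the header format; it says nothing about what Outlook or TB's importer actually wrote.

```python
from email.header import Header, decode_header, make_header


def decode_display_header(raw_value):
    """Decode an RFC 2047 encoded header value into readable text."""
    return str(make_header(decode_header(raw_value)))


if __name__ == "__main__":
    # Build an encoded word the way a Japanese mail client might.
    encoded = Header("山田", charset="iso-2022-jp").encode()
    print(encoded)
    print(decode_display_header(encoded))
```

If the raw header in the imported message contains plain question marks instead of an encoded word like this, the original characters were lost before TB ever displayed the message.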

I still cannot see the contents of some messages. I tried manually switching to every encoding I could think of. This is particularly odd because many of these messages are from a co-worker who I know is using the Japanese version of TB.

Everything displayed correctly in Outlook, with only the rarest need for manual encoding changes. I was really hoping to convert to TB as an alternative to buying Outlook 2010 for my new system, but this would be a deal killer.

Thanks for your help.
cweinhofer
Posts: 8
Joined: August 27th, 2008, 7:53 am

Re: Encoding Problem When Importing from Outlook

Post by cweinhofer »

Well, with a little bit of prompting from Matt and the help of a 3rd party utility, I think I have the problem figured out. Here's what I did:

Intending to try and compare messages like Matt suggested, I looked at the about.com link he posted. It had two very insightful comments. It said that the way Outlook stores messages, "the original message structure is lost... Even when you save the message to disk as an .msg file, (the resulting file is) ...a slightly modified version." As Matt alluded, this is an MS issue. It gave instructions on how to modify the registry to stop Outlook from doing this, but cautioned "you can retrieve the (full) source of newly retrieved POP messages (however) ...editing the SaveAllMIMENotJustHeaders value does not restore the complete message source for emails that were already in Outlook."

Because it was the 13,000 emails that I have that "were already in Outlook" that I cared about, comparing the message sources from Outlook & TB didn't seem to be much good.

However, the whole process got me thinking that maybe there was some way to use a 3rd party utility that would export Outlook messages into a more universal format where they could then be successfully imported into TB.

I can't remember the exact progression, but eventually I homed in on the MBOX format and then discovered a utility called MessageSave made by a company called TechHit. Their website contained instructions for using this tool to convert from .PST / Exchange to MBOX format.

Here it is for reference if the link ever goes dead (http://www.techhit.com/outlook/convert_outlook_mbox.html):
1. Select the messages you would like to export, or the folder, if you would like to export the entire folder.
2. Click the MessageSave Outlook toolbar button.
3. Select "include subfolders" if you would like to export subfolders of the current folder as well.
4. Select "MBOX" in the "Format" field.
5. Click "Save Now".
6. That's it. You should see mbox file(s) created in the destination directory.

Note: Some email clients, such as Thunderbird, have issues importing mbox files with very long names, or with certain characters, such as #, in the file name. If you see errors, try renaming your mbox file to a short name without special characters in it.
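
If you have a lot of exported files to rename, the note above can be scripted. This is only a sketch: the function name, the 32-character limit, and the set of "safe" characters are my own assumptions, since the note does not say exactly which names TB rejects.

```python
import re


def safe_mbox_name(name, max_len=32):
    """Return a short file name with only ASCII letters, digits, '_' and '-'."""
    # Collapse every run of other characters (spaces, '#', etc.) into one underscore.
    cleaned = re.sub(r"[^A-Za-z0-9_-]+", "_", name).strip("_")
    # Fall back to a generic name if nothing usable is left.
    return cleaned[:max_len] or "mbox"


if __name__ == "__main__":
    print(safe_mbox_name("Projects #2010"))
```

Be aware that a folder name written entirely in Japanese would collapse to the fallback name, so those files are better renamed by hand.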

Note: MBOX format is supported only when using MessageSave with Outlook 2002 or newer.


From there I followed the MozillaZine instructions for manually importing MBOX files.

Here it is for reference if the link ever goes dead
(http://kb.mozillazine.org/Importing_and_exporting_your_mail#Mbox_files):
1. Identify the e-mail account you want to use for the imported mail. You can use the Local Folders account, or some other account, or you can create a new account specially.
2. In Account Settings, go to the account's main page to find its directory path. Make a note of the path.
3. Exit Thunderbird or Mozilla Suite.
4. Back up your profile or Mail directory, especially if you plan to overwrite existing mbox files.
5. Copy the mbox files you are importing to the local directory that you identified in step 2. For example, copy "Inbox" (not Inbox.msf) to the account's local directory.
6. Open Thunderbird or Mozilla Suite. As it starts up, the application automatically discovers new mbox files and makes them into folders. A folder for each file you copied should be displayed.
7. Open each folder and verify that it contains the correct number of e-mails, that they are readable, and that you can open attachments. Sometimes, differences between mbox formats cause multiple emails to be combined into one larger e-mail or can make some e-mails unreadable.

Note: If you want to keep the existing mbox file of the same name in the new location, rename the file you wish to import before copying it over. For example, rename "Inbox" to "InboxOld" and "Sent" to "SentOld".
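
For the verification in step 7, a quick count outside of TB can help when there are thousands of messages. Here is a rough sketch using Python's standard mailbox module; the function name is hypothetical, and you would point it at a copy of the mbox file while TB is closed. It only counts messages and lists decoded subjects; it does not check bodies or attachments.

```python
import mailbox
from email.header import decode_header, make_header


def summarize_mbox(path):
    """Return (message count, list of decoded subject lines) for an mbox file."""
    box = mailbox.mbox(path, create=False)
    try:
        subjects = []
        for message in box:
            raw = message.get("Subject", "")
            try:
                subjects.append(str(make_header(decode_header(raw))))
            except (LookupError, UnicodeError):
                # Unknown or broken charset: keep the raw value for inspection.
                subjects.append(str(raw))
        return len(subjects), subjects
    finally:
        box.close()
```

Comparing the count with the number of items Outlook reports for the same folder is a cheap way to spot messages that were merged or dropped.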


I used "Local Folders" and then moved the messages into the appropriate folders in TB. TechHit offers an eval version of the utility which will do 50 messages. This was good to put some handpicked messages in a folder in Outlook and confirm that everything came across fine. The full version of the program is $50, which is steep if you can take or leave TB; but definitely worth it if you're like me and a solution will alleviate the need for an Outlook purchase. Everything looks to have converted fine, but I will post again if I hit any snags.

I should also note that I considered using a transfer to GMail and back to accomplish the import. This seemed too cumbersome and I was worried about altering the messages further. However, it may be a good low cost solution for some. This page is a good place to start: http://www.labnol.org/internet/email/export-outlook-email-to-gmail-pst-backup/1938/

Also, the MozillaZine KB article on import / export has a page with listings of utilities. Some of these might do the trick as well and could be cheaper.

:idea: :idea: A COMMENT FOR ANY THUNDERBIRD DEVELOPERS THAT HAPPEN ACROSS THIS PAGE :idea: :idea:
I know that issues when moving messages from Outlook are not necessarily the fault of TB, but this almost prevented me from being able to convert to TB and the $50 price tag may prevent others.

People who have messages that contain mixed western & Asian encodings are not a huge group, but certainly significant. This and other issues with encoding seem to be one of TB's weak points – one of the places where Outlook actually does better.

This particular issue seems to be an ongoing problem which has yet to be resolved. The fact that there is a solution says that the import / export function of TB could be corrected / updated to prevent this issue. Please consider adding it and other more robust encoding features to the list of items that are developed in future versions.

If there is a place to submit feature suggestions, please PM me and I will be happy to pass what I have discovered along to the TB team. A workaround like this could also be included in the MozillaZine KB article on importing / exporting.
tanstaafl
Moderator
Posts: 49647
Joined: July 30th, 2003, 5:06 pm

Re: Encoding Problem When Importing from Outlook

Post by tanstaafl »

What is the about.com link he gave you?

We're not run by or formally associated with Mozilla/Mozilla Messaging despite the similarity in names. We're an independent user community. None of the developers read this forum.

I updated http://kb.mozillazine.org/Import_.pst_files to point users to this thread "If you have problems importing messages using mixed Western and Asian fonts".

The best place to submit feature suggestions is to file a bug report at https://bugzilla.mozilla.org/ , setting the priority to enhancement request. If you do that please post the URL. You have to provide an email address to file a bug report and it is listed there, but I've never heard of people getting spam due to that. I suspect that's because the bug reports are not listed automatically, so it's not worth the effort for a spammer to try to collect those email addresses when there are so many other easier places.
cweinhofer
Posts: 8
Joined: August 27th, 2008, 7:53 am

Re: Encoding Problem When Importing from Outlook

Post by cweinhofer »

Sorry for the confusion. I had gotten some suggestions from a guy over at getsatisfaction.com/mozilla_messaging and decided to post my stuff here as well so people could have another place to find it.

The link he mentioned was http://email.about.com/od/outlooktips/q ... utlook.htm

It's too bad there isn't more of a connection between this site and the development team. The KB here and some of the instructions and workarounds have been a great help with FF (which I love) and have been one of the things keeping me sane as I try to wrestle with TB.

For those reading this who are switching from Outlook to TB, some additional advice:

Make sure to set your default encoding for receiving and especially sending to ISO-2022-JP. I found out this is the de facto encoding used by a lot of email clients & providers in Japan and some will not accept / display email that is not encoded like this.

Also, when sending mail to Japan cell phone email addresses, the mail almost always needs to be plain text or it will be unreadable on the other end (though you probably won't get a bounce-back).

Outlook seemed to do both of these automatically, but they need to be specifically configured in TB.
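
To show what "plain text, ISO-2022-JP" means at the message level, here is a small Python sketch using the standard email package. It is not how TB builds messages; the function name and the addresses are made up for illustration, and only the body charset is handled (the headers are left to the library's defaults).

```python
from email.message import EmailMessage


def build_plain_text_message(sender, recipient, subject, body):
    """Build a single-part text/plain message whose body is encoded as ISO-2022-JP."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Raises UnicodeEncodeError if the body has characters ISO-2022-JP cannot represent.
    msg.set_content(body, subtype="plain", charset="iso-2022-jp")
    return msg


if __name__ == "__main__":
    msg = build_plain_text_message(
        "me@example.com", "friend@example.jp", "Test", "こんにちは"
    )
    print(msg.get_content_type(), msg.get_content_charset())
```

The resulting message has no HTML part at all, which is the point of the advice about cell phone addresses.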