Quotes and comma in From header

Discussion of bugs in Mozilla Thunderbird
aklefdal
Posts: 4
Joined: September 15th, 2005, 4:05 pm
Location: Oslo
Contact:

Post by aklefdal »

The problem is the same as this: <a href="http://support.microsoft.com/default.aspx?scid=kb;en-us;834511&sd=rss&spid=2520">
http://support.microsoft.com/default.as ... &spid=2520</a>

When the "From" address contains both a comma and international characters, it seems that Exchange strips away the quotes (which are needed because of the comma). When replying, the last name (which comes before the comma) is placed on its own "To:" line (without an email address), and only the first name(s) are used as the display name for the email address.

To reproduce the error, do the following. In Exchange, set your display name like this: Lefdal, Alf Kåre

The From header produced will look like the following:
<pre>From: =?iso-8859-1?Q?Lefdal=2C_Alf_K=E5re?= <alf.kare.lefdal@myexchangeemailxxxxxx.no></pre>
When replying it will look like this:

<img src="http://www.lefdal.cc/div/test/thunderbird-bug1.png">

In Thunderbird, set your display name the same way. The From header produced will look like the following:
<pre>From: =?ISO-8859-1?Q?=22Lefdal=2C_Alf_K=E5re=22?= <alf.kare.lefdal@mythunderbirdemailxxxxxxxx.no></pre>
When replying it will look perfectly fine, like this:

<img src="http://www.lefdal.cc/div/test/thunderbird-bug2.png">
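
To see what the two display names above actually decode to, here is a minimal sketch using Python's standard `email.header.decode_header`. The helper name `decode_display_name` is made up for illustration; only the two encoded-words are taken from the headers above.

```python
from email.header import decode_header


def decode_display_name(encoded_word):
    """Decode an RFC 2047 encoded-word (hypothetical helper, for illustration)."""
    parts = []
    for raw, charset in decode_header(encoded_word):
        if isinstance(raw, bytes):
            # Encoded parts come back as bytes plus the declared charset.
            parts.append(raw.decode(charset or "ascii"))
        else:
            # Text without any encoded-word is returned as a plain string.
            parts.append(raw)
    return "".join(parts)


# Display name as sent by Exchange: no quotes inside the encoded-word.
exchange_name = decode_display_name("=?iso-8859-1?Q?Lefdal=2C_Alf_K=E5re?=")

# Display name as sent by Thunderbird: =22 is an encoded double quote.
thunderbird_name = decode_display_name("=?ISO-8859-1?Q?=22Lefdal=2C_Alf_K=E5re=22?=")

print(repr(exchange_name))
print(repr(thunderbird_name))
```

The only difference between the two decoded names is the pair of double quotes that Thunderbird places inside the encoded-word.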

I believe this is a bug in MS Exchange 2003, but I'm not sure. Still, I hope that Thunderbird can handle this. Is there an extension that handles this?
hansen
Posts: 5268
Joined: June 23rd, 2003, 6:28 am
Location: denmark

Post by hansen »

Well...both - mostly Outlook though.

Outlook should add quotes around such names - that's the standard.

Thunderbird could be better at detecting such things, but it shouldn't be Thunderbird's problem.
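
As a point of comparison only, this is how Python's standard `email.utils.formataddr` formats such names. It is a sketch of one library's behaviour, not a statement about what Outlook or Thunderbird do internally; the address `alf@example.no` is a made-up placeholder.

```python
from email.utils import formataddr

# Plain ASCII name containing a comma: the library wraps it in double quotes.
ascii_header = formataddr(("Lefdal, Alf", "alf@example.no"))

# Name with a non-ASCII character: the library emits an RFC 2047 encoded-word
# instead, and the comma ends up encoded inside it rather than quoted.
encoded_header = formataddr(("Lefdal, Alf Kåre", "alf@example.no"), charset="iso-8859-1")

print(ascii_header)
print(encoded_header)
```

In this library, quoting applies to the plain ASCII case, while the non-ASCII name is sent as an unquoted encoded-word.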

Post by aklefdal »

Even though it is Microsoft's fault, I am the one who, as a Thunderbird user, gets the problems when Outlook/Exchange users send me email. This could be solved the same way Firefox handles junk HTML: quirks mode.

(Or Someone™ could make an extension :-) )

Post by aklefdal »

I managed (through my job) to report this to Microsoft. The response I got was, in short, that Thunderbird (and SquirrelMail and possibly others) is wrong and Outlook 2003 is doing it right. <a href="http://www.faqs.org/rfcs/rfc2047.html">RFC 2047</a> says in section 6.2 that "Decoding and display of encoded-words occurs *after* a structured field body is parsed into tokens."

So, this means I will have to submit a bug :-(

Alf

Microsoft's internal bug report says the following:
--------------------------------------------------------

<blockquote>
Cause
=============================
Outlook Express and Outlook are both using INETCOMM.DLL to parse mail headers received using POP3. When the Display name contains only a comma with no extended character, the display name for the "from" field is surrounded by quotes.
When the Display name contains both comma and extended character, the "from" field looks like this:

From: =?iso-8859-1?Q?Garaikoetxea_Mu=F1oa=2C_Juan_Mari_=28Pruebas_POP3=29?= <prujuan@myorg.es>
---> Not using quotes is perfectly valid according to RFC 2047: " An
---> 'encoded-word' MUST NOT appear within a 'quoted-string'." <---
However, when Outlook Express parses the "From" header, it considers the comma as a separator if the string is not quoted.

Resolution
=============================
This occurs after you have applied Exchange 2003 SP1 version 7226, or in Exchange 2000 the Post SP3 rollup from August version 6603.1, on an Exchange Server dedicated to mailboxes.

--

Apparently some MIME parsers don't properly handle address headers (From:/To:/Cc:) encoded using the RFC 2047 message header extensions for non-ASCII text. (RFC 2047 is the one that specifies how things like "=?iso-8859-1?Q?blah?=" are to be encoded in RFC 2822 headers.) Some mail clients (Yahoo and Notes according to X5:207641, and Outlook Express according to my own tests) decode RFC 2047-encoded words first and then do their token parsing after. This means, e.g., if you encode a comma (=2C) in an encoded-word in a display name, the comma won't appear as part of the display name; instead it'll be extracted and consumed by the RFC 2822 parser that is handling the header, so you end up with the wrong tokens. You can see similar behavior suggesting that this is the case if you, say, encode an angle-bracketed address in a display name, too, like this:

To: =?iso-8859-1?Q?user1_=3Cuser1=40domain.com=3E=2C_user2?= <user2@domain.com>

(Try putting that header into an .EML file and opening it with a mail client.) According to section 6.2 of RFC 2047:

"NOTE: Decoding and display of encoded-words occurs *after* a structured field body is parsed into tokens."

According to the RFC, that example To: line above should really represent 1 recipient with a display name of "user1 <user1@domain.com>, user2" and an address of "user2@domain.com", since tokenization should occur first. (In fact, a non-RFC 2047 aware client should handle this just fine, since it'd recognize all that gobbledy-gook as a single atom.) Clients like Outlook Express see this as two recipients though -- "user1" <user1@domain.com> and "user2" <user2@domain.com>.

"By design" is the correct response there, since Exchange is in fact producing the correct encoding.
</blockquote>
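
The two parsing orders described in the report can be compared with a short Python sketch. It is only an illustration built on the standard library (`email.utils.getaddresses` and `email.header.decode_header`); the function names `decode_words`, `tokenize_then_decode` and `decode_then_tokenize` are hypothetical, and nothing here claims to reproduce how Thunderbird, Outlook Express, or any other client is actually implemented.

```python
from email.header import decode_header
from email.utils import getaddresses


def decode_words(text):
    """Decode any RFC 2047 encoded-words in text and return a plain string."""
    pieces = []
    for raw, charset in decode_header(text):
        if isinstance(raw, bytes):
            pieces.append(raw.decode(charset or "ascii"))
        else:
            pieces.append(raw)
    return "".join(pieces)


def tokenize_then_decode(header_value):
    """Order from RFC 2047 section 6.2: split into addresses, then decode names."""
    return [(decode_words(name), addr) for name, addr in getaddresses([header_value])]


def decode_then_tokenize(header_value):
    """The problematic order: decode first, so an encoded comma becomes a separator."""
    return getaddresses([decode_words(header_value)])


# The From header produced by Exchange, as quoted earlier in this thread.
exchange_from = (
    "=?iso-8859-1?Q?Lefdal=2C_Alf_K=E5re?= "
    "<alf.kare.lefdal@myexchangeemailxxxxxx.no>"
)

# The To example from the report.
report_to = (
    "=?iso-8859-1?Q?user1_=3Cuser1=40domain.com=3E=2C_user2?= "
    "<user2@domain.com>"
)

for header in (exchange_from, report_to):
    print("tokenize first:", tokenize_then_decode(header))
    print("decode first:  ", decode_then_tokenize(header))
```

With tokenizing first, each header yields a single recipient whose display name contains the comma. With decoding first, the comma is treated as an address separator and each header splits into two entries, which matches the reply behaviour described at the top of this thread.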

Post by aklefdal »

This has already been submitted as <a href="https://bugzilla.mozilla.org/show_bug.cgi?id=254519">Bug #254519</a> in Bugzilla.