How to extract HTML body of an email?

Discussion of general topics about Mozilla Thunderbird
pokeefe0001
Posts: 103
Joined: January 18th, 2010, 4:25 pm
Location: Pacific Northwest, USA

How to extract HTML body of an email?

Post by pokeefe0001 »

I want to extract and use the HTML source from a few HTML-formatted emails. I can get most of this by either just copying the body or editing the email as new, selecting all, and opening the "Insert HTML" window. In both cases, images are lost. When viewing the email source I see that the message is in multi-part MIME format, and that each <img ...> contains a "src=CID:xxxxx" where the CID is defined in another MIME part. In the extracted HTML the img src has been converted to some sort of email reference that has no meaning outside of Thunderbird, or more precisely outside of my instance of Thunderbird.

Is there any way of getting HTML source that somehow includes these images?
Patrick O'Keefe
Win11 x64 Pro, FF 113.0.2 TB 102.7.2
tanstaafl
Moderator
Posts: 49647
Joined: July 30th, 2003, 5:06 pm

Re: How to extract HTML body of an email?

Post by tanstaafl »

Use View -> Message Source or Ctrl+U to view the raw message source. You typically see "src=CID:xxxxx" when an image is embedded in an HTML message. The image is in a separate MIME part, whose contents are base64 encoded so that they can be embedded in a 7-bit ASCII message. The src attribute of the img tag specifies the URL of the image. In this case the URL refers to a MIME part within the message rather than to a file or a location on the web.

CID is the value of the Content-ID: header in the MIME part that contains the image. You can safely copy and paste the MIME part and reuse it as long as you change that header (and the corresponding CID used to reference it), because a Content-ID is supposed to be globally unique. The MIME part is not HTML, so you might need to use something like the Header Tools Lite add-on to edit the raw source of an existing message.

If you don't want to copy the MIME part that contains the embedded image, you could right click on the image in the displayed message, select "Save Image As", save it as an image file and use an HTML editor to embed it again later on. That is what I'd do.
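
For anyone who would rather script this than re-embed each image by hand, here is a minimal sketch in Python using only the standard library. It is not a Thunderbird feature. It assumes the message has first been saved from Thunderbird as an .eml file, and the function name `inline_cid_images` is made up for illustration. It looks up each image MIME part by its Content-ID and replaces the matching `src=cid:...` reference in the HTML with a base64 `data:` URI, so the resulting HTML carries its images with it.

```python
import base64
import re
from email import message_from_bytes, policy


def inline_cid_images(raw_message: bytes) -> str:
    """Return the HTML body with cid: image references replaced by data: URIs.

    Hypothetical helper for illustration; raw_message is the content of an .eml file.
    """
    msg = message_from_bytes(raw_message, policy=policy.default)

    html_part = msg.get_body(preferencelist=("html",))
    if html_part is None:
        raise ValueError("message has no text/html part")
    html = html_part.get_content()

    # Map each Content-ID (without its angle brackets) to a data: URI.
    data_uris = {}
    for part in msg.walk():
        cid = part.get("Content-ID")
        if cid is None or part.get_content_maintype() != "image":
            continue
        encoded = base64.b64encode(part.get_payload(decode=True)).decode("ascii")
        key = str(cid).strip().strip("<>")
        data_uris[key] = f"data:{part.get_content_type()};base64,{encoded}"

    def replace(match):
        # Leave the reference untouched if no image part has that Content-ID.
        uri = data_uris.get(match.group(2))
        return match.group(1) + uri if uri else match.group(0)

    # Only rewrite cid: URLs that appear as the value of a src attribute.
    pattern = r"(src\s*=\s*[\"']?)cid:([^\"'\s>]+)"
    return re.sub(pattern, replace, html, flags=re.IGNORECASE)
```

To use it, read the saved file as bytes (for example `Path("newsletter.eml").read_bytes()`, where the file name is a placeholder) and write the returned string to an .html file. One caveat: `data:` URIs display fine in browsers, but not every WYSIWYG editor or mail client handles them, so treat this as something to try rather than a guaranteed fix.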
pokeefe0001
Posts: 103
Joined: January 18th, 2010, 4:25 pm
Location: Pacific Northwest, USA

Re: How to extract HTML body of an email?

Post by pokeefe0001 »

tanstaafl wrote: ... you could right click on the image in the displayed message, select "Save Image As", save it as an image file and use an HTML editor to embed it again later on. That is what I'd do.
That's what I've been doing. I was hoping I'd missed a neat feature that did that re-embedding automatically. Oh well.
trolly
Moderator
Posts: 39851
Joined: August 22nd, 2005, 7:25 am

Re: How to extract HTML body of an email?

Post by trolly »

Getting the HTML code is easy: Right click -> Save As -> Select HTML and done.
But it does not include the images. The code references the images in the mailbox.

If you choose "Edit as new message" you can double click the image in the mail and replace the original link with the saved one. Then save it again as HTML.
It is not much simpler but you can do all editing in Thunderbird.
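
The "save the image, then point the link at the saved file" routine can also be scripted outside Thunderbird. The following Python sketch uses only the standard library and assumes the message was saved as an .eml file. The function name `export_html_with_images`, the generated `imageN` file names, and the output name `message.html` are all illustrative choices, not anything Thunderbird produces. It writes every image part that has a Content-ID into a folder and rewrites each `src=cid:...` to the matching file name.

```python
import mimetypes
import re
from email import message_from_bytes, policy
from pathlib import Path


def export_html_with_images(raw_message: bytes, out_dir: Path) -> Path:
    """Save the HTML body and its embedded images to out_dir; return the HTML path.

    Hypothetical helper for illustration; raw_message is the content of an .eml file.
    """
    msg = message_from_bytes(raw_message, policy=policy.default)

    html_part = msg.get_body(preferencelist=("html",))
    if html_part is None:
        raise ValueError("message has no text/html part")
    html = html_part.get_content()

    out_dir.mkdir(parents=True, exist_ok=True)

    # Write each embedded image to disk and remember which Content-ID it had.
    saved = {}
    for index, part in enumerate(msg.walk()):
        cid = part.get("Content-ID")
        if cid is None or part.get_content_maintype() != "image":
            continue
        ext = mimetypes.guess_extension(part.get_content_type()) or ".bin"
        name = f"image{index}{ext}"  # generated name, avoids unsafe file names
        (out_dir / name).write_bytes(part.get_payload(decode=True))
        saved[str(cid).strip().strip("<>")] = name

    def replace(match):
        # Leave the reference untouched if no image part has that Content-ID.
        name = saved.get(match.group(2))
        return match.group(1) + name if name else match.group(0)

    # Point each src=cid:... at the image file saved next to the HTML.
    pattern = r"(src\s*=\s*[\"']?)cid:([^\"'\s>]+)"
    html = re.sub(pattern, replace, html, flags=re.IGNORECASE)

    html_path = out_dir / "message.html"
    html_path.write_text(html, encoding="utf-8")
    return html_path
```

Because the rewritten links are relative file names, the HTML file and the images have to stay in the same folder when they are opened in an HTML editor.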
Think for yourself. Otherwise you have to believe what other people tell you.
A society based on individualism is an oxymoron. || Freedom is at first the freedom to starve.
Constitution says: One man, one vote. Supreme court says: One dollar, one vote.
pokeefe0001
Posts: 103
Joined: January 18th, 2010, 4:25 pm
Location: Pacific Northwest, USA

Re: How to extract HTML body of an email?

Post by pokeefe0001 »

trolly wrote: Getting the HTML code is easy: Right click -> Save As -> Select HTML and done.
But it does not include the images. The code references the images in the mailbox.

If you choose "Edit as new message" you can double click the image in the mail and replace the original link with the saved one. Then save it again as HTML.
It is not much simpler but you can do all editing in Thunderbird.
I'll give that a try. However, keeping it all within Thunderbird is not necessarily a plus. The email is just a model for a weekly newsletter. Use of a WYSIWYG HTML editor is currently being dictated to me by those "owning" the newsletter. And as a retiree with calcified synapses and almost no knowledge of HTML prior to this past weekend, I find a WYSIWYG HTML editor almost required as a teaching aid. (When I've learned a bit more, the HTML editor may not be needed.) I eventually have to get the new emails into GroupMail, so having an extra processing step between TB and GroupMail is not a big deal. Once I get the model HTML source out of the sample email, TB is out of the picture. (I think.)