Searching for a string incl. HTML in an e-mail body

User Help for Mozilla Thunderbird
UbuntuUser
Posts: 2
Joined: September 5th, 2015, 5:23 am

Searching for a string incl. HTML in an e-mail body

Post by UbuntuUser »

Searching for any string, even part of a word, works quite well on the displayed text of an email. However, for the purpose of email filtering I would like to search the complete email body, i.e. including all HTML code as if it were text. For example, given a message containing the code

<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
<title>Example</title>
</head>

<body>
<p><span style="font-size: x-large;">

searching the email body for
font-size
or
large;">
does not find that email, or at least not reliably, in my Thunderbird 38.2.0, even though both strings occur in the message :(
Is there a solution?
tanstaafl
Moderator
Posts: 49647
Joined: July 30th, 2003, 5:06 pm

Re: Searching for a string incl. HTML in an e-mail body

Post by tanstaafl »

UbuntuUser
Posts: 2
Joined: September 5th, 2015, 5:23 am

Re: Searching for a string incl. HTML in an e-mail body

Post by UbuntuUser »

Thank you for the hint, but I had already seen this mozillaZine page about message filters when I searched for a solution before posting my question. My example email does not have separate plain-text and HTML versions of the text, just HTML. I use POP3, so I did not need to create a custom header to filter on the body. My problem seems similar to the one in viewtopic.php?f=39&t=2734223, where mad.engineer wants to filter emails on the occurrence of the string "Content-Type: text/calendar" but does not succeed, and concludes that Thunderbird seems to search the parsed body content rather than the raw body content. Is there an easy workaround?
For example, is it possible to create a custom header for
- the "raw" body or
- the entire email with all of its headers and the body,
and if so, is there a description of how to do that (e.g. what to type into the "New Message Header" field of the "Customize Headers" window)?
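
For the "Content-Type: text/calendar" case mentioned above, a check outside Thunderbird can look at the MIME structure instead of doing a string search. The following is only a sketch using Python's standard email module, not a Thunderbird filter; the function name has_calendar_part and the sample message are made up for illustration.

```python
import email
from email import policy


def has_calendar_part(raw_message: bytes) -> bool:
    """Return True if any MIME part of the raw message is text/calendar."""
    msg = email.message_from_bytes(raw_message, policy=policy.default)
    # walk() visits the message itself and every nested MIME part.
    return any(part.get_content_type() == "text/calendar" for part in msg.walk())


# Hypothetical sample: a two-part message whose second part is a calendar invite.
sample = (
    b"Subject: Invite\r\n"
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/alternative; boundary="b1"\r\n'
    b"\r\n"
    b"--b1\r\n"
    b"Content-Type: text/plain\r\n"
    b"\r\n"
    b"Meeting at 10\r\n"
    b"--b1\r\n"
    b"Content-Type: text/calendar; method=REQUEST\r\n"
    b"\r\n"
    b"BEGIN:VCALENDAR\r\nEND:VCALENDAR\r\n"
    b"--b1--\r\n"
)

print(has_calendar_part(sample))  # True
```

Because the check reads the part headers, a message that merely quotes the string "Content-Type: text/calendar" in its visible text would not match.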
PWSowner
Posts: 84
Joined: February 13th, 2006, 9:31 pm
Location: Canada

Re: Searching for a string incl. HTML in an e-mail body

Post by PWSowner »

There has to be some way to search raw bodies of emails. I've wished for it, and searched for it, many times over several years.
A message contains:
<a href="http://eaxcr.yourmedsworld.ru?" style="text-decoration: none; color: #0099ff;">Click here!</a>
I want to search for all messages containing .ru? but can't. I run an Internet business and deal with 200+ emails a day, half of which are spam. My server does mark them as spam, but as a safeguard against false detections I have spam still delivered to me rather than deleted. It goes to my spam folder, and every once in a while I need to skim through that folder to see whether anything was incorrectly marked. It would be so much easier if I could search or filter on raw email bodies. View message source shows the raw body, so it should be very simple to search that as well. Is there an addon or some other way to do this?
tanstaafl
Moderator
Posts: 49647
Joined: July 30th, 2003, 5:06 pm

Re: Searching for a string incl. HTML in an e-mail body

Post by tanstaafl »

If it's for a single message, use View -> Message Source or Ctrl+U, then Edit -> Find or Ctrl+F.

One problem with searching the bodies of multiple messages using the quick filter bar is that it searches the message body as rendered, i.e. it interprets the HTML tags rather than searching the raw message source. It also doesn't support wildcards (what you wanted to use) or regular expressions (a more powerful superset). The Expression Search / GMailUI add-on enhances the search capabilities of the quick filter bar. For example, it adds support for regular expressions.
PWSowner
Posts: 84
Joined: February 13th, 2006, 9:31 pm
Location: Canada

Re: Searching for a string incl. HTML in an e-mail body

Post by PWSowner »

Thanks, but that doesn't work. I don't need regular expressions. For my example above:
<a href="http://eaxcr.yourmedsworld.ru?" style="text-decoration: none; color: #0099ff;">Click here!</a>
That's exactly what's in the message source.

I also don't want to find occurrences in a single email. Just like the original poster, I want to search folders for all messages containing something, where that something is in the HTML code. I need to be able to search raw email bodies. If I search for "Click Here", it finds that message, but if I search for just "yourmedsworld", it doesn't.
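
The behaviour described here is what one would expect if the search runs over the rendered text only. The snippet below is just an illustration of that difference with Python's standard html.parser, not a description of how Thunderbird is implemented; the names TextOnly and displayed_text are made up for this sketch.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collects only the text a reader would see; tags and attributes are dropped."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for text between tags, never for attribute values such as href.
        self.chunks.append(data)


def displayed_text(html_source: str) -> str:
    parser = TextOnly()
    parser.feed(html_source)
    parser.close()
    return "".join(parser.chunks)


raw = ('<a href="http://eaxcr.yourmedsworld.ru?" '
       'style="text-decoration: none; color: #0099ff;">Click here!</a>')
shown = displayed_text(raw)

print("Click here" in shown)      # True: part of the visible link text
print("yourmedsworld" in shown)   # False: only present inside the href attribute
print("yourmedsworld" in raw)     # True: present in the raw source
```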

I just installed the addon you suggested, and the regex is nice, but it still only searches the generated html page, not the original source code.

I've found many people asking about searching the raw email body, and some bug reports about it, but it doesn't look like this is going to be possible.

My current workaround, which is a bit extravagant, is:
1 - export all messages in the folder I want to search, in eml format
2 - use Total Commander, or similar, to search the exported messages (now individual files) and delete the matching ones from there
3 - if there are any messages left, import them back into Thunderbird
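
Step 2 of the workaround above could also be scripted. The following is a rough sketch in Python, assuming the messages were exported as individual .eml files into one directory; the function names matches and find_matching are hypothetical. It checks the raw file bytes first and then the decoded text parts, since a body sent as base64 or quoted-printable would not contain the search string literally.

```python
import email
from email import policy
from pathlib import Path


def matches(eml_path: Path, needle: str) -> bool:
    """Case-insensitive search of one exported message for needle."""
    raw = eml_path.read_bytes()
    # 1) Literal search of the file as stored (headers and undecoded body).
    if needle.lower().encode("utf-8") in raw.lower():
        return True
    # 2) Search every text/* part after undoing base64 / quoted-printable.
    msg = email.message_from_bytes(raw, policy=policy.default)
    for part in msg.walk():
        if part.get_content_maintype() != "text":
            continue
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "utf-8"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:  # unknown charset name in the message
            text = payload.decode("latin-1")
        if needle.lower() in text.lower():
            return True
    return False


def find_matching(folder: str, needle: str) -> list:
    """Return the sorted paths of all .eml files in folder that contain needle."""
    return sorted(p for p in Path(folder).glob("*.eml") if matches(p, needle))
```

Calling find_matching("exported_spam", ".ru?") would return the files to review or delete, where "exported_spam" stands for whatever directory the messages were exported to. The sketch only reports matches and deletes nothing.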

Crazy, but this morning I had a folder of over 20,000 old emails that were marked as spam. I used that method, did about 20 different searches, and ended up with fewer than 1,000 emails left. I re-imported those to look at individually, and 3 of them were not spam (incorrectly marked). It took about 1 hour to do all that. Do you know how long it would have taken to skim through all those emails manually?

It would be nice to be able to search the raw email bodies in Thunderbird, rather than having to send them elsewhere for searching. Oh well, I don't normally have nearly that many to go through, so for now I'll just wait, with my fingers crossed ;), for the day Thunderbird either allows raw body searching or someone comes up with an addon that does it.

Maybe I should learn how to make addons and write one for that myself. I know HTML, JavaScript, Perl, and PHP, so it shouldn't be too hard to figure out how to write an addon.