Passive fingerprinting of email client, based on email headers

Question

Are there ways to passively fingerprint (infer) the operating system or mail client that an email sender is using, based upon the headers of an email from that sender?

I'm familiar with passive network fingerprinting tools like p0f: given a trace of some packets sent from a particular machine, they try to infer the operating system that machine is using. Are there techniques or tools for doing something analogous, with email messages?

I can think of some techniques that get some partial information in some cases, but I don't want to re-invent the wheel if this already exists or has already been studied. For instance, I know of the X-Mailer: header, which if present will typically reveal the mail client that the sender is using. But are there techniques that work to infer the mail client if X-Mailer: is not present, or to infer the operating system?

I don't know of any tools that do this, but I suspect it has the same ad- and disadvantages as browser fingerprinting; you'll have a lot of problems with custom-configured clients. In addition there are a *lot* of mail servers out there that do *something* with mail headers, and what they do with it is really different, so it's a bit harder to get to. Basically, apart from the standard version string that may or may not be in the header, I'm not sure what you're trying to infer. — Rens van der Heijden, Aug 05 '15 at 20:55
@RensvanderHeijden, thanks for the feedback! That roughly matches what I'm expecting, but I'd accept imperfect solutions that do the best possible, accepting that it won't always be possible to infer anything and the inferences won't always be correct. I know about `X-Mailer:`, but one can ask whether it's possible to infer the mail client when `X-Mailer:` is not present, or one can infer the OS. I edited my question to hopefully make that clearer. — D.W., Aug 05 '15 at 20:58
Thanks for the clarification! I took a quick look at a few e-mails -- looks like you are right that this is possible; Apple Mail/iPhone Mail seems to use the client name in the multipart message separation. I suppose you could also derive the supported character encoding and type of client (GUI/text-only) to some extent, but it'll be pretty hard to get down to individual clients as you'd want to do for browsers. You could probably infer the OSs of the intermediate servers, but I suspect beyond client support getting to the senders' OS is pretty hard. — Rens van der Heijden, Aug 05 '15 at 21:04
You can take a look at the Thunderbird add-on [Display Mail User Agent](http://www.juergen-ernst.de/addons/dispmua.html), which seems to use the `X-Mailer`, `User-Agent`, `Message-Id` and perhaps other headers to try and determine what program or webmailer the sender used to send an email. That can include the operating system (if the sender includes it in the `User-Agent` header), but is usually just a program name and version. — n.st, Aug 05 '15 at 23:55

score 9 · Answer 1 · answered Aug 06 '15 at 00:02

Are there ways to passively fingerprint (infer) the operating system or mail client that an email sender is using, based upon the headers of an email from that sender?

Yes, but they are extremely prone to errors and very easy to spoof, either intentionally or unintentionally (e.g. some high-end firewall systems will "repack" a message after removing/sanitizing attachments, modifying some information).

Apart from the X-Mailer header, you can glean some information from other fields such as the Message-ID and the multipart boundaries, if present. Both should contain a unique sequence, and different systems generate it in different ways. Some MUAs generate a Message-ID of their own; others leave the chore to the server. So when you see 'DE9E2BFC.12345@gmail.com' you know the sender is a GMail customer and nothing more, while o8xp315ahi3207wv6gbl5tfiedsaelntas@4ax.com is an ID produced by Forté Agent, a program that runs only on Microsoft Windows.

The old Eudora, up to version 6.1, added an X-Sender header which also revealed the "persona" used to send an email.

Microsoft Outlook uses, I think, boundaries of the form `_NextPart_XXX_0000_HHHHHHHH.HHHHHHHH`, where H is a hex digit and XXX is the inner sequence number (000 onwards), while Mozilla Thunderbird (which also adds a User-Agent header) uses a boundary of the form `------------0x0x0x0x0x0x0x0x0x0x0x0x`, where each x is a digit from 0 to 9.
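As a rough illustration, the two boundary shapes described above could be matched with regular expressions. This is only a sketch: the function name `guess_client_from_boundary` and the labels are made up for this example, the patterns are transcribed from the description above rather than from any vendor documentation, and the second Outlook field is assumed to be any four hex digits instead of a literal `0000`.

```python
import re
from email import message_from_string

# Hypothetical patterns, transcribed from the description above.
# Real clients vary between versions, so treat a match as a hint only.
BOUNDARY_PATTERNS = [
    # _NextPart_XXX_HHHH_HHHHHHHH.HHHHHHHH (any prefix before it is ignored)
    ("outlook", re.compile(
        r"_NextPart_\d{3}_[0-9A-Fa-f]{4}_[0-9A-Fa-f]{8}\.[0-9A-Fa-f]{8}$")),
    # twelve dashes followed by twelve "0x" pairs, x being a decimal digit
    ("thunderbird", re.compile(r"^-{12}(?:0\d){12}$")),
]


def guess_client_from_boundary(raw_message):
    """Return a client label guessed from the top-level MIME boundary, or None."""
    msg = message_from_string(raw_message)
    boundary = msg.get_boundary()  # None if there is no boundary parameter
    if boundary is None:
        return None
    for label, pattern in BOUNDARY_PATTERNS:
        if pattern.search(boundary):
            return label
    return None
```

Only the outermost boundary is inspected here; nested multipart sections, which would carry the higher sequence numbers, are left out to keep the sketch short.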

In several cases you will see that part of the unique ID is actually a timestamp, which can also serve as a cross-check against the Date header.

So, some information is there -- but you'd need to build a fingerprint database. Also, as I said, it's relatively easy to modify a single header, and maybe even all of them; you'd have to take this into account and see if the strategy still fits your purpose.
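A very small version of such a fingerprint database might look like the sketch below. Everything in it is an assumption made for illustration: the function name `collect_hints`, the table `MESSAGE_ID_DOMAINS` and the wording of the hints are invented, and the only entries are the ones mentioned in this answer. Because any header can be forged or rewritten in transit, the function returns every hint it finds instead of a single verdict.

```python
from email import message_from_string

# Hypothetical lookup table: Message-ID domain -> what it suggests.
# Only the examples from this answer are included; a real database
# would need many more entries, collected from known-good samples.
MESSAGE_ID_DOMAINS = {
    "4ax.com": "Forté Agent (runs only on Microsoft Windows)",
    "gmail.com": "GMail account; ID assigned server-side, client unknown",
}


def collect_hints(raw_message):
    """Return a list of (source_header, hint) pairs found in the message."""
    msg = message_from_string(raw_message)
    hints = []

    # Self-declared client strings: the easiest to read and to spoof.
    for header in ("X-Mailer", "User-Agent"):
        value = msg.get(header)
        if value:
            hints.append((header, value.strip()))

    # Old Eudora versions added X-Sender; other software may too.
    if msg.get("X-Sender"):
        hints.append(("X-Sender", "possibly Eudora 6.1 or older"))

    # The part after '@' in the Message-ID often names the generator.
    message_id = msg.get("Message-ID", "").strip().strip("<>")
    domain = message_id.rpartition("@")[2].lower()
    if domain in MESSAGE_ID_DOMAINS:
        hints.append(("Message-ID", MESSAGE_ID_DOMAINS[domain]))

    return hints
```

Hints that contradict each other (say, a Thunderbird User-Agent alongside a Forté Agent Message-ID) are themselves a useful signal that some header was modified along the way.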
