
Several emails sent from my webserver to a Gmail address, where the From: address is websitevisitor@gmail.com, have been marked as spam by Gmail. The From: field is populated from form data and corresponds to the visitor's actual email address, which is often a Gmail address. The Return-Path: always points to account@mywebserver.com, so the SPF and DKIM checks pass.

When I inspect the raw emails in the Gmail account, I see the following:

Delivered-To: webformrecipient@gmail.com
...
Return-Path: <account@mywebserver.com>
Received: from mywebserver.com (mywebserver.com. [my:ipv6:address])
        by mx.google.com with ESMTPS id xxx
        for <webformrecipient@gmail.com>
        (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128);
        Tue, 02 Feb 2016 00:40:02 -0800 (PST)
Received-SPF: pass (google.com: domain of account@mywebserver.com designates my:ipv6:address as permitted sender) client-ip=xxx;
Authentication-Results: mx.google.com;
       spf=pass (google.com: domain of account@mywebserver.com designates my:ipv6:address as permitted sender) smtp.mailfrom=account@mywebserver.com;
       dkim=pass header.i=@mywebserver.com;
       dmarc=fail (p=NONE dis=NONE) header.from=gmail.com
DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=mywebserver.com; s=mydkim;
    h=Date:Message-Id:Sender:From:Subject:To; bh=w2snQznwxlVRVACmfQELC7VGmD1dcYdiCXbCIRYFKRs=;
    b=a0Vy3Ky43J5FdiWSuQ4qvTTH47G+Js0W/qtRU5gMlxfesNqrlyaIyExaIZlWvHNL4o0LNOF1GI94w4C41mmH+2JIkMEQZazw0MainP7UyUgsm/RZbAWoRuecPv+k108FlsWMP/l1UttXAdlvBVJmV2UGsYYlSSjKErQEF8tv3K0=;
Received: from apache by mywebserver.com with local (Exim 4.80)
    (envelope-from <account@mywebserver.com>)
    id 1aQWVF-00009b-2X
    for webformrecipient@mywebserver.com; Tue, 02 Feb 2016 09:40:01 +0100
To: webformrecipient@mywebserver.com
From: Website User <website-user@gmail.com>
Sender: webformrecipient@mywebserver.com
...

Note that both the SPF and DKIM checks pass, but the DMARC check does not. After some searching, I tracked this down to the fact that DMARC takes its reference domain from the From: address, according to this answer on Stack Overflow.
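To illustrate why the headers above produce dmarc=fail even though SPF and DKIM pass, here is a minimal Python sketch of the alignment comparison. It is not Gmail's implementation. The function names are hypothetical, and `org_domain` is a deliberate simplification that assumes the organizational domain is the last two labels; a real DMARC evaluator uses the Public Suffix List.

```python
from email.utils import parseaddr
from typing import Optional


def org_domain(domain: str) -> str:
    """Naive organizational domain: the last two labels.

    Assumption for illustration only; real DMARC uses the Public Suffix List.
    """
    labels = domain.lower().rstrip(".").split(".")
    return ".".join(labels[-2:])


def domain_of(address_header: str) -> str:
    """Extract the domain part from a From: header value."""
    _, addr = parseaddr(address_header)
    return addr.rpartition("@")[2].lower()


def dmarc_aligned(header_from: str,
                  spf_domain: Optional[str],
                  dkim_domain: Optional[str]) -> bool:
    """Return True if a domain that passed SPF or DKIM aligns with From:.

    spf_domain: the smtp.mailfrom domain, if SPF passed (else None).
    dkim_domain: the d= signing domain, if DKIM passed (else None).
    """
    from_org = org_domain(domain_of(header_from))
    passed = [d for d in (spf_domain, dkim_domain) if d]
    return any(org_domain(d) == from_org for d in passed)


# The message from the question: both checks pass for mywebserver.com,
# but the From: domain is gmail.com, so nothing aligns.
print(dmarc_aligned("Website User <website-user@gmail.com>",
                    "mywebserver.com", "mywebserver.com"))  # False
```

With a From: address on mywebserver.com, the same call returns True, which is the situation DMARC is designed to accept.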

Three questions:

  1. Is dmarc=fail likely to be the reason Gmail assigns the email to spam?
  2. Why does DMARC operate on the From: address, rather than on the Return-Path (envelope sender) that SPF checks or the signing domain that DKIM checks?
  3. If the From: header now also has to correspond to an address @mydomain.com, how should we specify the actual (logical, flesh and blood) sender of the message?
EelkeSpaak

3 Answers


Think of SPF and DKIM as ways to validate the mail path, and think of DMARC as an extension that also validates the message sender. Think of this as delivering a FedEx letter. It's easy to validate where the envelope was shipped from, and that the courier was legitimate, but it doesn't provide a way to prove that the letter inside the envelope is really from the person whose name is printed on it.

SPF and DKIM show that your webserver is a valid SMTP server for mywebserver.com and that your Sender address is legitimate, but that's not enough for other servers to trust that you have permission to send as website-user@gmail.com. How does Gmail know that your server hasn't been hacked or otherwise used for malicious intent? Gmail's servers aren't going to blindly trust you to send mail as one of their users -- unless maybe you are hosted by them, and then you'd probably have trouble sending to Yahoo.

To address the first part of your question: yes, it's very likely that this is why Gmail is categorizing it as spam. The oldest forms of spam center around spoofing the "From" address. This is what most users see when they get a message, and it is the primary field they want to trust. When a message from a legitimate mail server is sent using a From address that doesn't belong to that mail server, it's still a red flag.

As you mentioned, DMARC operates on the From address as part of the specification. Granted, it makes it harder to write web apps that send on someone's behalf, but that's sort of the point. As to why they do it - well, that's up to the designers of the specification, but it's a trade-off. They are taking the high road and making a system that works very well if you stay within that limitation. Perhaps future mechanisms will find a way around this.

The unfortunate solution is to only use addresses that you have control of. To address your third question, sign your messages with your domain name, and mention in the body that it was sent on behalf of website-user@gmail.com. Otherwise you will have to request that your recipients add the address to their whitelist. It's not much fun for a legitimate web app developer, but it will protect the sanctity of the recipient's inbox. You might have luck using the Reply-To header with the web user's email address.
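The approach described above can be sketched with Python's standard `email` package. This is only an illustration under assumptions: the function name `build_contact_mail`, the "via website" display name, and the subject line are hypothetical, and the addresses are the placeholders from the question. The From: address stays on a domain the server controls, while the visitor's address goes into Reply-To: and into the body.

```python
from email.message import EmailMessage
from email.utils import formataddr


def build_contact_mail(visitor_name: str, visitor_email: str, body: str) -> EmailMessage:
    """Build a contact-form message whose From: domain matches the sending server."""
    msg = EmailMessage()
    # From: uses an address on our own domain, so DMARC alignment can succeed.
    msg["From"] = formataddr((f"{visitor_name} via website", "account@mywebserver.com"))
    msg["To"] = "webformrecipient@mywebserver.com"
    # Replies go to the visitor rather than to the webserver account.
    msg["Reply-To"] = formataddr((visitor_name, visitor_email))
    msg["Subject"] = "Contact form"
    # State in the body on whose behalf the message was sent.
    msg.set_content(f"Sent on behalf of {visitor_name} <{visitor_email}>\n\n{body}")
    return msg


msg = build_contact_mail("Website User", "website-user@gmail.com", "Hello!")
print(msg["From"])
print(msg["Reply-To"])
```

Because the visitor's name and address come from form data, they should be validated before being placed in headers; `EmailMessage` with its default policy rejects header values that contain line breaks, but that is not a substitute for checking the input.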

There is a discussion of this limitation on this DMARC thread.

In the meantime, you can try to make sure that your server isn't blacklisted on any RBLs. It could be that you can fail DMARC but still get through some spam filters if you have a good enough reputation... but I wouldn't rely on it.
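As a rough sketch of how such an RBL check works: DNS-based blocklists are conventionally queried by reversing the IP address and appending the list's zone, and an answer (rather than a "no such name" error) means the address is listed. The zone `dnsbl.example.org` below is a hypothetical placeholder, not a real list, and the function names are illustrative; check the usage policy of any real list before querying it.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for `ip` in the blocklist `zone`."""
    # reverse_pointer yields e.g. "1.2.0.192.in-addr.arpa" or a nibble
    # sequence ending in ".ip6.arpa"; strip that suffix and append the zone.
    pointer = ipaddress.ip_address(ip).reverse_pointer
    for suffix in (".in-addr.arpa", ".ip6.arpa"):
        if pointer.endswith(suffix):
            pointer = pointer[: -len(suffix)]
    return f"{pointer}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the blocklist answers for this IP (performs a DNS query)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        # No record found (or the lookup failed): treated here as not listed.
        return False


# Hypothetical zone; 192.0.2.1 is a documentation-only address.
print(dnsbl_query_name("192.0.2.1", "dnsbl.example.org"))
```

Since the server in the question delivers over IPv6, the IPv6 address is the one to check; not every list publishes IPv6 data.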

GuitarPicker

There are two "why" questions:

  1. Why does a receiving mail server perform the check in this manner?
  2. Why was DMARC designed that way?

Section 3.6.2 of RFC 5322 clearly establishes that a Sender: header, when present, takes priority over a From: header for the purposes of identifying the party responsible for sending a message:

The "Sender:" field specifies the mailbox of the agent responsible for the actual transmission of the message. For example, if a secretary were to send a message for another person, the mailbox of the secretary would appear in the "Sender:" field and the mailbox of the actual author would appear in the "From:" field. If the originator of the message can be indicated by a single mailbox and the author and transmitter are identical, the "Sender:" field SHOULD NOT be used. Otherwise, both fields SHOULD appear.

Contrast this with the rationale given in RFC 7489:

DMARC authenticates use of the RFC5322.From domain by requiring that it match (be aligned with) an Authenticated Identifier. The RFC5322.From domain was selected as the central identity of the DMARC mechanism because it is a required message header field and therefore guaranteed to be present in compliant messages, and most Mail User Agents (MUAs) represent the RFC5322.From field as the originator of the message and render some or all of this header field's content to end users.

I contend that this logic is flawed, as RFC 5322 goes on to call out this error explicitly:

Note: The transmitter information is always present. The absence of the "Sender:" field is sometimes mistakenly taken to mean that the agent responsible for transmission of the message has not been specified. This absence merely means that the transmitter is identical to the author and is therefore not redundantly placed into the "Sender:" field.

I believe that DMARC is broken by design, because

  • it conflates authority to send and proof of authorship;
  • it misinterprets prior RFCs, and
  • in doing so it breaks any previously compliant list-serv that identified itself by adding its own Sender: header.

If a Sender: field is present, DMARC should say to authenticate that field and ignore the From: field. But that's not what it says, and therefore I consider it to be broken.

RFC 7489 continues:

Thus, this field is the one used by end users to identify the source of the message and therefore is a prime target for abuse.

This is simply wrong (in the context of justifying ignoring the Sender: header). At the time that DMARC was designed, common email clients would routinely display a combination of the information from Sender: and From: fields, something like From name-for-mailing-list@server on behalf of user@original.domain. So it was always clear to the user who was responsible for sending the message they were looking at.


Suggestions that Reply-To: is an adequate replacement are also flawed because that header is widely misinterpreted as "additional recipient" rather than "replacement recipient", and replacing the original sender's Reply-To: would impair the functionality for those users.

  • Except for the desktop Outlook, common email clients display nothing about the Sender: header field. See other replies [here](https://mailarchive.ietf.org/arch/msg/dmarc/LQhrF8-jmu8JiRjKlnY5UbnEpBc/). – Ale Mar 25 '21 at 10:38
  • Thanks for the mailing list link. It seems I'm not alone in my opinions, but there are reasoned arguments both ways (but I still contend that the opposing camp conflates "proof of authorship" with "authority to send"). Although DMARC's RFC was posted in 2015, it was in development for many years before that, perhaps as far back as 2010, when desktop programs were still a large proportion of email clients. Let's not forget that the prime proponents of DMARC were entities who built large webmail farms, to support their existing practice - like not showing the Sender. – Martin Kealey Mar 28 '21 at 05:25
  • Whilst it may now be true that the majority of clients only show the **From:** header, it seems to me that mostly they only show the name, which only proves my point further: that certifying something that the user never looks at doesn't prevent abuse, and does impede reasonable uses. See https://www.rfc-editor.org/rfc/rfc7960.html for a more complete enumeration of the problems with DMARC as it stands. – Martin Kealey Mar 28 '21 at 09:49
  • Here is a [list of workarounds](https://wiki.asrg.sp.am/wiki/Mitigating_DMARC_damage_to_third_party_mail). – Ale Mar 28 '21 at 10:51
  • Your assertion of their misunderstanding the RFCs is absurd. The section you refer to specifically states: `In all cases, the "From:" field SHOULD NOT contain any mailbox that does not belong to the author(s) of the message. See also section 3.6.3 for more information on forming the destination addresses for a reply.` There is no indication as to the sender address because, for instance, one may utilize a service to send their marketing emails. The sender address would be within the domain of the service, while the from address would be within the domain of the author. – TheHitchenator May 19 '22 at 13:03
  • @TheHitchenator I pointed out _which part_ they had ignored. Part of the rationale for DMARC using From rather than Sender was because "Sender isn't always present". While that's true in a shallow examination of the header names, it's false in the sense that the RFC says that the sender information _is always_ present, but that the Sender header isn't always the place to look for it. And while it's supposed to prove authorship, in practice it's used as (part of) "proof of authority to send", without providing any mechanism for authors to delegate that authority. – Martin Kealey May 26 '22 at 23:38
  • @TheHitchenator I did _not_ make an unqualified assertion that they had misunderstood it. I provided three alternative explanations in one sentence, any one of which would have been sufficient. I agree that "didn't understand" is quite unlikely, even absurd, but that simply emphasises my real point: it was (in all likelihood) _ignored_, for reasons that are seemingly undocumented. Perhaps you can point me at some documentation to that effect? – Martin Kealey May 31 '22 at 05:06

1) Yes, the DMARC failure is likely to cause Gmail to junk your mails.

2) I would also be interested in an answer to this.

3) I would (and we do) use the Reply-To field for the customer address. Our mails look like this:

From: website@mydomain.com
To: user@mydomain.com
Subject: contact form
Reply-To: customer-addy@somewhere.com

Hope this helps

lukester85