2

I have an SPF record that requires too many DNS lookups. The consequence is that some mail servers silently drop our email. RFC 7208 (section 4.6.4) says this about the terms that trigger DNS lookups:

SPF implementations MUST limit the total number of those terms to 10 during SPF evaluation, to avoid unreasonable load on the DNS. If this limit is exceeded, the implementation MUST return "permerror".

Not all mail servers enforce this MUST, so most email is delivered. But some do enforce it, and they drop the message without generating any error.
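As a rough illustration of what counts toward that limit: RFC 7208 section 4.6.4 lists the `include`, `a`, `mx`, `ptr` and `exists` mechanisms plus the `redirect` modifier as the terms that cause DNS queries. The sketch below counts those terms in a single record string. `count_lookup_terms` is a hypothetical helper name, and it does not follow includes, so the real total for a domain is the sum over every record reached recursively.

```shell
#!/bin/bash
# Count the terms in ONE SPF record string that trigger a DNS lookup.
# count_lookup_terms is a hypothetical helper, not part of any SPF tool.
count_lookup_terms() {
    local record=$1 term count=0
    local -a terms
    read -r -a terms <<< "$record"
    for term in "${terms[@]}"; do
        # Drop an optional qualifier (+, -, ~, ?) before matching.
        term=${term#[-+~?]}
        case "$term" in
            include:*|exists:*|redirect=*|a|a:*|a/*|mx|mx:*|mx/*|ptr|ptr:*)
                count=$((count + 1))
                ;;
        esac
    done
    echo "$count"
}

count_lookup_terms "v=spf1 include:_spf.google.com ~all"
```

Plain `ip4:` and `ip6:` terms are not counted, which is why flattening reduces the total.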

One suggestion has been to flatten the SPF record, by resolving all the include lines. For example:

$ dig -t TXT google.com | grep spf1
google.com.     3527    IN  TXT "v=spf1 include:_spf.google.com ~all"

Which leads to:

$ dig -t TXT _spf.google.com | grep spf1
"v=spf1 include:_netblocks.google.com include:_netblocks2.google.com include:_netblocks3.google.com ~all"

Recursing through those three include hosts and concatenating the results gives this as the final record:

"v=spf1 ip4:35.190.247.0/24 ip4:64.233.160.0/19 ip4:66.102.0.0/20 ip4:66.249.80.0/20 ip4:72.14.192.0/18"
"ip4:108.177.8.0/21 ip4:173.194.0.0/16 ip4:209.85.128.0/17 ip4:216.58.192.0/19 ip4:216.239.32.0/19 ip6:2001:4860:4000::/36"
"ip6:2404:6800:4000::/36 ip6:2607:f8b0:4000::/36 ip6:2800:3f0:4000::/36 ip6:2a00:1450:4000::/36 ip6:2c0f:fb50:4000::/36 "
"ip4:172.217.0.0/19 ip4:172.217.32.0/20 ip4:172.217.128.0/19 ip4:172.217.160.0/20 ip4:172.217.192.0/19 ip4:172.253.56.0/21"
"ip4:172.253.112.0/20 ip4:108.177.96.0/19 ip4:35.191.0.0/16 ip4:130.211.0.0/22 ~all"

Clearly this flattening needs to be an automatic process that refreshes at least as often as the TTL for this record (3600s for google.com)

QUESTION: What are the implications of doing this flattening?

Criggie
  • I am aware the correct solution is to reduce the SPF record size and move senders off to a sub-domain. However that's not likely to happen anytime soon. – Criggie Aug 17 '22 at 22:56
  • The worst domain I have found for exceeding this limit is `toyota.com` that has 18 DNS lookups in its SPF record. – Criggie Aug 17 '22 at 22:57
  • There are some mail servers out there that fail 2048-bit DKIM tests because the key record is longer than one string. Flattening doesn't automatically mean the string has to be too long, but I checked my messages from `toyotaowners@e.toyota.com` and the SPF record had `include:cust-spf.exacttarget.com`, which responds with several strings. – Paul Aug 18 '22 at 16:43
  • It just occurred to me to check the `toyota.com` DMARC records and it turns out they have `p=none`. Given how much attention they put on their mail RRs, I suspect someone there knows they have a problem. I wonder if a message to their abuse@ would go to someone that would be willing to work with you on this. – Paul Aug 19 '22 at 13:06
  • @paul My issue is with my own records, not toyota which was only an example of someone else with more-than-10 lookups in their SPF. – Criggie Aug 19 '22 at 23:20

2 Answers

1

In the world of email, very few implementations (of anything!) strictly adhere to the spec. Anti-spoofing technologies like SPF, DKIM, and DMARC tend to buck that trend and implement as designed, but there are of course exceptions. The maximum of ten DNS lookups is frequently relaxed because there are sooo many implementations that fail to meet it.

This usually comes from bad advice like including all of an affiliate's own SPF record rather than the parts that are relevant to the affiliate. I pointed out Bluehost's bad advice a few years ago.

Google is aware of this issue and, assuming they actually need to include that ridiculous volume of IPs, they've managed to narrow that list down to three DNS lookups. Your "flattened" version exceeds the UDP size recommendation and must be sent via TCP. While this does happen, it introduces more latency and isn't broadly compatible given different SPF software implementations.
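To make the size concern concrete: a single TXT character-string holds at most 255 bytes, so a long flattened record has to be published as several strings, and RFC 7208 section 3.4 recommends keeping the whole answer small enough to fit in 512 octets. The following is a minimal sketch, assuming an ASCII record; `split_txt_strings` is a hypothetical helper name and the 450-character threshold in the usage lines is only a rule-of-thumb assumption.

```shell
#!/bin/bash
# split_txt_strings is a hypothetical helper: print a flattened SPF record as
# quoted TXT character-strings of at most 255 characters each, breaking only
# between terms. Each non-final string keeps a trailing space, because
# receivers concatenate the strings with no separator.
split_txt_strings() {
    local record=$1 chunk="" term
    local -a terms
    read -r -a terms <<< "$record"
    for term in "${terms[@]}"; do
        # +1 leaves room for the space that follows the term.
        if [ -n "$chunk" ] && [ $(( ${#chunk} + ${#term} + 1 )) -gt 255 ]; then
            printf '"%s"\n' "$chunk"
            chunk=""
        fi
        chunk+="$term "
    done
    # The last string does not need the trailing space.
    printf '"%s"\n' "${chunk% }"
}

flat="v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.0/24 -all"
split_txt_strings "$flat"
if [ "${#flat}" -gt 450 ]; then
    echo "warning: record is ${#flat} characters, may not fit in a UDP reply" >&2
fi
```

Breaking between terms avoids the failure mode where a split lands in the middle of an `ip4:` term and the rejoined record no longer parses.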

If you're copying another domain's SPF records, you introduce the potential to be out of sync. That in turn allows potential attackers to spoof the domain and pass SPF and DMARC, which is dangerous. I wrote about over-broad IP allocation in SPF earlier this week, specifically calling out the concern over IPs being blessed but not actually under the SPF domain operator's control.

As I noted in that other post, you're better off using aligned DKIM instead of SPF. SPF is coarse and cannot safely be applied to IPs that are not exclusively under the domain's control. DKIM at least requires using an appropriate key, so other tenants on that shared system would need access to that key to forge as the domain in question. A reallocated host could be acquired by an attacker, but they'd only get the IP address, not the DKIM key (unless they actually compromise that host).

With that in mind, SPF should only be used for wholly-owned hosts that are not able to implement DKIM. Such an implementation would look something like this: `v=spf1 mx a:non-dkim.example ?all`, which allows the domain's MX records and the A record at `non-dkim.example` but treats everything else as neutral (neither a pass nor a failure), so other mail can only be DMARC-valid if it uses DKIM.

⚠️ Warning: Always vet your DMARC configuration with aggregate reports before moving to p=reject.

Adam Katz
  • Yep - the cause is that we've been adding INCLUDE: for a bunch of different mail services over time. Salesforce, etc, etc. Getting someone to re-configure their working setup has proved impossible. **This is a dirty hack** but I can't see any significant gotchas other than having to automate it. – Criggie Aug 19 '22 at 01:46
  • I have evidence that some of our accounting invoices, reminders, etc are not getting to the end users, which was the main reason for doing something. – Criggie Aug 19 '22 at 01:47
  • Set up aggregate reports with [DMARC](https://en.wikipedia.org/wiki/DMARC) to see what's not working. – Adam Katz Aug 19 '22 at 01:54
  • Yep done that - I get around 20 a day, with 90% of them being "SPF record says no" because the client is using a mailwashing service like proofpoint and their final mailserver is misconfigured. Can't fix that. It's the other 10% I am trying to resolve. – Criggie Aug 19 '22 at 02:00
1

Here is a simple recursive bash script that generates the flattened output.

Call it as `flatten.sh toyota.com`:

#!/bin/bash

# Get the SPF record and recurse through it, resolving include: hosts
# until only ip4/ip6 terms are left, plus anything unrecognised.
# Note: a, mx, exists and redirect= terms are copied verbatim, so they
# still cost DNS lookups in the flattened record.

# Single global array where we build the output.
declare -a OUTPUT=( )

function diggity {
    # FQDN or domain name, given as single parameter
    local domain=$1
    local txt term
    local -a record

    # Fetch the TXT record that starts with v=spf1. A long record comes back
    # as several quoted strings; join them with no separator, as SPF does,
    # then strip the remaining quotes.
    txt=$(dig +short -t TXT "$domain" | grep '^"v=spf1' | sed -e 's/" "//g' -e 's/"//g')

    # Split on whitespace into a local-scope array
    read -r -a record <<< "$txt"

    # Now we have an array of include, ip4 and ip6 terms.
    # Iterate through it, replacing include terms with whatever they resolve to.
    for term in "${record[@]}"
    do
        case "$term" in
        v=spf1)
            # Version tag, re-added once at output time.
            ;;
        include:*|+include:*)
            diggity "${term#*include:}"
            ;;
        all|[-~+?]all)
            # Do nothing with this, eat it.
            ;;
        *)
            # Copy verbatim into output array
            OUTPUT+=("$term")
            ;;
        esac
    done
}
# -----------------end of function---------------

# Start the recursion
diggity "$1"

# Check we got some output
if [ ${#OUTPUT[@]} -eq 0 ] ; then
    echo "Error - no output returned.  Check $1 is a correct hostname" >&2
    exit 1
fi

# Generate output. The final qualifier is hardcoded to -all here,
# whatever the source record used.
echo -n "v=spf1 ${OUTPUT[*]} -all"

After that, the trick is to get the output into your DNS service automatically. Don't one-shot this: any of the included records could change over time.
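One way to automate that safely is to publish only when the flattened record has actually changed, and to refuse to publish an obviously broken result (for example after a DNS timeout). This is a minimal sketch, not what we run: `publish_if_changed`, the state file, and the `./update-dns.sh` command in the usage comment are all hypothetical names, and the real update step depends entirely on your DNS provider.

```shell
#!/bin/bash
# publish_if_changed is a hypothetical wrapper: it stores the last published
# record in a state file and only calls the update command when the freshly
# flattened record differs.
publish_if_changed() {
    local new_record=$1 state_file=$2
    shift 2    # remaining arguments: the command that pushes to your DNS service

    # Refuse to publish an empty or malformed result.
    case "$new_record" in
        "v=spf1 "*) ;;
        *) echo "refusing to publish: '$new_record'" >&2; return 1 ;;
    esac

    if [ -f "$state_file" ] && [ "$(cat "$state_file")" = "$new_record" ]; then
        return 0    # unchanged, nothing to do
    fi

    # Only remember the record once the update command has succeeded.
    "$@" "$new_record" && printf '%s\n' "$new_record" > "$state_file"
}

# Example (assumed paths and script names), run from cron more often than the
# shortest TTL among the included records:
#   publish_if_changed "$(./flatten.sh example.com)" /var/tmp/spf.last ./update-dns.sh
```

Keeping the state file update behind `&&` means a failed push is retried on the next run.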

So far there has been no downside, but I'm not yet live. (TBC)

We kept a copy of the original (unflattened) SPF record at _original_spf.domain.com, and the flattened version of that copy is what updates the real SPF record. This was done for convenience and error avoidance.

Criggie