11

I have noticed a peculiar behavior of my Google Apps domain. Most mail comes through as you would expect, but over time I have come to the conclusion that mail from certain senders never arrives. After identifying one such sender, I asked him to send me an email and forward the "delivery failure" response to my regular Gmail address.

The delivery failure response contained the following snippet:

----- Transcript of session follows -----
<myusername@GHS.L.GOOGLE.COM>... Deferred: Connection timed out with ghs.l.google.com.

This helped me to identify the problem by doing a quick search which led me to this page on Google Apps Help Forum. Indeed, I checked the DNS record for my domain, and @ was set to ghs.google.com. (CNAME), which it shouldn't be. Changing that to @ 74.125.93.121 (A)* resolved the problem.

I understand that in the cases where the mail wouldn't come through, my domain name was substituted by its canonical name through a CNAME lookup, so the mail was sent to myusername@ghs.l.google.com instead of myusername@mydomain.com. But why did it work for the vast majority of senders? Did the senders whose mail wouldn't come through use some different kind of mail protocol, some weird DNS settings, or what could it be?

From what I could see by researching the problem on Google, this seems to be a widespread issue (lots of people complaining about emails from battle.net not coming through would be one popular example), only that people don't seem to be aware that the problem lies in their own DNS settings rather than on the sender's side.

So how can this be explained?

* I used this IP because of what I read here, but I think any IP would do the trick. Can anyone confirm this? Note that simply removing the @ record did not resolve the problem, it had to be changed.

Milo Wielondek

2 Answers

13

From RFC 2821 "Simple Mail Transfer Protocol", section 5 "Address Resolution and Mail Handling":

The lookup first attempts to locate an MX record associated with the name. If a CNAME record is found instead, the resulting name is processed as if it were the initial name.

In general, this is how CNAMEs work. They are often mis-used, mis-understood, and mis-implemented. :-)

If your domain is example.com, you probably have existing MX records pointing to the usual Google Apps hosts.

example.com. MX 10 ASPMX.L.GOOGLE.COM.
example.com. MX 20 ALT1.ASPMX.L.GOOGLE.COM.
example.com. MX 20 ALT2.ASPMX.L.GOOGLE.COM.
example.com. MX 30 ASPMX2.GOOGLEMAIL.COM.
example.com. MX 30 ASPMX3.GOOGLEMAIL.COM.
example.com. MX 30 ASPMX4.GOOGLEMAIL.COM.
example.com. MX 30 ASPMX5.GOOGLEMAIL.COM.

It sounds like you also had an entry like this:

example.com. CNAME ghs.l.google.com.

RFC 1034 "Domain Concepts and Facilities", section 3.6.2 "Aliases and canonical names", recommends against this configuration:

If a CNAME RR is present at a node, no other data should be present; this ensures that the data for a canonical name and its aliases cannot be different.
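A quick way to spot this kind of conflict is to scan the records for any name that carries a CNAME alongside other data. The following is only a rough sketch: the `records` list is a hypothetical in-memory stand-in for a zone (nothing is read from a real DNS server), and `find_cname_conflicts` is a name made up for this illustration.

```python
from collections import defaultdict

# Hypothetical zone contents as (owner name, record type, data) tuples.
records = [
    ("example.com.", "MX", "10 ASPMX.L.GOOGLE.COM."),
    ("example.com.", "MX", "20 ALT1.ASPMX.L.GOOGLE.COM."),
    ("example.com.", "CNAME", "ghs.l.google.com."),
    ("www.example.com.", "CNAME", "ghs.l.google.com."),
]


def find_cname_conflicts(records):
    """Return owner names that have a CNAME together with any other record type."""
    types_by_name = defaultdict(set)
    for name, rtype, _data in records:
        # DNS names are case-insensitive, so normalise before grouping.
        types_by_name[name.lower()].add(rtype.upper())
    return sorted(
        name
        for name, types in types_by_name.items()
        if "CNAME" in types and len(types) > 1
    )


print(find_cname_conflicts(records))
```

With the sample data, only `example.com.` is reported, because `www.example.com.` has a CNAME and nothing else.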

In the case of the error you pasted, the mail server and/or DNS server on the sending end attempted to look up MX record(s) for your domain, example.com, and found a CNAME pointing to ghs.l.google.com. It then tried to look up the MX record(s) for ghs.l.google.com. That domain does not currently have any MX records, so the mail server would have fallen through to the A record for ghs.l.google.com. That IP address was not listening on the SMTP port, so the result is the error "Connection timed out with ghs.l.google.com."
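The chain of lookups described above can be sketched roughly like this. It is a toy model, not the code of any real MTA: the `DNS` dictionary is hypothetical data standing in for real queries, `eager_mail_targets` is an invented name, and `192.0.2.10` is a documentation-only address rather than Google's actual IP.

```python
# Hypothetical DNS data mirroring the misconfiguration described above.
# Keys are (name, record type); values are lists of record data.
DNS = {
    ("example.com.", "MX"): [
        (10, "aspmx.l.google.com."),
        (20, "alt1.aspmx.l.google.com."),
    ],
    ("example.com.", "CNAME"): ["ghs.l.google.com."],
    ("ghs.l.google.com.", "A"): ["192.0.2.10"],  # placeholder address (assumption)
}


def eager_mail_targets(domain, max_hops=8):
    """Mimic a sender that follows a CNAME before it looks at MX records."""
    name = domain
    for _ in range(max_hops):
        cname = DNS.get((name, "CNAME"))
        if not cname:
            break
        name = cname[0]  # treat the canonical name as if it were the original
    mx = DNS.get((name, "MX"))
    if mx:
        # Lowest preference value first.
        return name, [host for _pref, host in sorted(mx)]
    # No MX at the canonical name: fall back to its address record.
    return name, DNS.get((name, "A"), [])


print(eager_mail_targets("example.com."))
```

In this model the MX records at `example.com.` are never consulted: the sender ends up with `ghs.l.google.com.` and its address, which is exactly the host that was not accepting SMTP connections.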

By removing the CNAME record, you've fixed your mail problems. You might encounter issues if the IP address you've defined in its place is changed on Google's end.

You could instead define the CNAME for www.example.com:

www.example.com. CNAME ghs.l.google.com.

And run a small webserver on whatever IP you point example.com at, which simply does an HTTP redirect to http://www.example.com/
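For illustration, such a redirector can be very small. The sketch below uses Python's standard `http.server` module; the names `RedirectHandler` and `make_server`, the port 8080, and the choice to preserve the request path and use a 301 status are all assumptions made for this example, not part of any Google setup.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer

TARGET = "http://www.example.com"  # the www host that carries the CNAME


class RedirectHandler(BaseHTTPRequestHandler):
    """Answer every GET or HEAD with a permanent redirect to the www host."""

    def _redirect(self):
        self.send_response(301)
        # Keep the requested path so deep links survive the redirect.
        self.send_header("Location", TARGET + self.path)
        self.send_header("Content-Length", "0")
        self.end_headers()

    do_GET = _redirect
    do_HEAD = _redirect

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8080):
    """Build the server; host and port are arbitrary values for local testing."""
    return HTTPServer((host, port), RedirectHandler)
```

Calling `make_server().serve_forever()` starts it. In real use it would have to listen on port 80 of the address that the example.com A record points to.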

It's somewhat surprising that it worked as well as it did. Postel's law gets some credit there, I believe. :-)

Back to RFC 1034, section 3.6.2:

CNAME RRs cause special action in DNS software. When a name server fails to find a desired RR in the resource set associated with the domain name, it checks to see if the resource set consists of a CNAME record with a matching class. If so, the name server includes the CNAME record in the response and restarts the query at the domain name specified in the data field of the CNAME record. The one exception to this rule is that queries which match the CNAME type are not restarted.

So, in this case it could be argued that the DNS server would/should not follow the CNAME on an MX lookup unless there were no MX records found.
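That reading of the RFC text can be modelled in a few lines. This is a simplified sketch, not how any particular name server is implemented: `ZONE` is hypothetical data, `lookup` is an invented helper, and `192.0.2.10` is a documentation-only placeholder address.

```python
# Hypothetical data: the same broken setup, with MX and CNAME at one name.
ZONE = {
    ("example.com.", "MX"): [
        (10, "aspmx.l.google.com."),
        (20, "alt1.aspmx.l.google.com."),
    ],
    ("example.com.", "CNAME"): ["ghs.l.google.com."],
    ("ghs.l.google.com.", "A"): ["192.0.2.10"],  # placeholder address (assumption)
}


def lookup(name, rtype, max_hops=8):
    """Return (owner name, records), following a CNAME only when rtype data is absent."""
    for _ in range(max_hops):
        found = ZONE.get((name, rtype))
        if found:
            return name, found  # desired RR exists, so the CNAME is never consulted
        cname = ZONE.get((name, "CNAME"))
        if not cname or rtype == "CNAME":
            return name, []  # nothing to follow, or CNAME queries are not restarted
        name = cname[0]  # restart the query at the canonical name
    return name, []


print(lookup("example.com.", "MX"))
print(lookup("example.com.", "A"))
```

Under this model the MX query is answered from `example.com.` itself, while the A query (which has no data at that name) is restarted at `ghs.l.google.com.`. A sender that resolves names this way would deliver mail correctly despite the stray CNAME.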

When sending mail, Sendmail and qmail (and likely others) will by default attempt to rewrite any CNAME used in the right hand side of an email address to the canonical name.

Indeed, some sites relied on this behavior. djb goes into some detail on why he thinks people should stop relying on it in his "CNAME records in mail" document.

jeff
  • Thank you for this exhaustive answer! :) So to summarize, you'd say that the reason why it worked for some senders but not for others is that they use different MTAs which follow the CNAME despite MX records being there, which according to RFC 1034 3.6.2 can be considered faulty behavior? – Milo Wielondek May 19 '12 at 21:53
  • I'm not sure I'd call the behavior "faulty". The configuration of a CNAME with other records (MX, NS, etc) is something that was broken / not-recommended, and different hosts interpreted it in different ways. – jeff May 19 '12 at 22:05
  • Is that a 'generally yes' _but_ you are not sure you'd call the behavior faulty, or did I completely miss the point? – Milo Wielondek May 19 '12 at 22:09
  • The specifics are a mess, so 'generally yes' :-) – jeff May 19 '12 at 22:41
  • An MTA should be querying the domain after the `@` in the email address for MX records and nothing else. If it gets any, it should immediately attempt delivery to one of the lowest MX records. If all MX servers fail to connect or no MX records are found, it should try connecting to the domain itself. The MTA in question is obviously going too far in resolving information, or isn't following the rules for determining what mail server to connect to. There should be nothing wrong with having your domain point to a CNAME - but you need the MX records for email to work. – Eli Sand May 24 '12 at 02:00
1

The @ symbol in a BIND record is just a shorthand way of writing the domain. If you're creating a record for example.com, then @ is just an alias for example.com. Saying that the @ record had to be an IP is a statement that is missing critical information - you didn't tell us what type of record it was.

From the delivery report, it seems that you perhaps did something with your DNS to cause the remote mail server to rewrite your domain to ghs.l.google.com - very strange (PS, an A record must be an IP, a CNAME record must not be an IP or another CNAME record).

Why that person's mail server is rewriting your address is strange - it shouldn't unless that person did something to explicitly tell it to rewrite it. It should also not care at all what the IP of your domain is unless it couldn't find any MX records, since MX records are how mail servers figure out where mail goes.

Given the very little information provided, it sounds to me like you did not follow Google's instructions on how to properly configure your DNS for email at all. You probably even have some errors in your zone file - have them checked over by a competent zone administrator.

Eli Sand
  • First, I did mention that the `@` record was of type CNAME. Second, the DNS I use is the one supplied by google upon purchase, hence I don't even have access to the zone file. I used default settings provided by google. And last but not least, the "very little information provided" was apparently enough for someone competent to provide a helpful, satisfactory and (in contrast to your own) cordial answer. – Milo Wielondek May 19 '12 at 22:07
  • You clearly do not understand DNS and the downvote was completely unwarranted. You also edited your question after I posted my answer adding extra information. You also never mention once that you do not have access to your zone file despite clearly mentioning you have changed them. – Eli Sand May 21 '12 at 23:19