53

I've used greylisting on my servers for many years, but I don't know how effective it is nowadays.

Is it still good for fighting spam in 2012?

Or is the typical spammer MTA capable of resending greylisted emails now?

Greg Askew
neu242
  • "Good" for what? Greylisting does have pros _and_ cons. – Michael Hampton Oct 09 '12 at 12:24
  • 2
    @Michael: For fighting spam. Read the question :) – neu242 Oct 09 '12 at 12:48
  • I read the question. It doesn't betray that you are aware of the pros and cons of greylisting. – Michael Hampton Oct 09 '12 at 12:57
  • 1
    We can deal with the pros and cons of greylisting in another question :) – neu242 Oct 09 '12 at 13:03
  • Perhaps so, but then that means you should probably remove the word "good" from this one and replace it with a more appropriate adjective. It's almost certainly not the right word for any spam prevention measure that causes the CEO to complain. – Michael Hampton Oct 09 '12 at 13:04
  • 1
    I changed the title slightly. Does it look better now? – neu242 Oct 09 '12 at 13:09
  • 3
    @MichaelHampton Uh, point of interest... what spam prevention measures do you use that *don't* cause the CEO to complain? Or maybe, if more appropriate, what kind of CEO do you have that isn't a spoiled, whiny $#^&*@ who'll find something to complain about in absolutely anything? – HopelessN00b Oct 09 '12 at 15:22
  • 1
    @HopelessN00b Our CEO is not like that. I wouldn't be judging someone's personality or behaviour based solely on their profession. –  Oct 10 '12 at 02:23

5 Answers

55

I last looked at this quantitatively in July of this year (2012). In July, my mailserver received about 46,000 attempts to deliver mail; of those, about 1,750 returned and were permitted through by the greylisting (and passed valid sender domain, SPF and some other non-content-based tests). Of those, about another 1,500 were filtered by my content-based filtering.

Assuming that the remaining 44,250 attempts were spam (since they never came back to pass greylisting, I think that's a fair assumption), if it were not for the greylisting my content-based filtering would have had to deal with 46,000 mails instead of 1,750.

A twenty-five-fold increase in load on my content-based filtering would require me to have much beefier CPUs and more memory. That would in turn increase my monthly hosting costs, because of the extra power consumption (and, probably, the size of the server).

So in short, the last time I counted, yes, greylisting still made very, very good sense as part of a complete spam-filtering system. I have activated it for clients in the past few weeks, and all are extremely happy with the decrease in load on their content-based filtering systems also.

Edit: I note that I haven't answered the question about whether it's becoming less effective over time. When I turned it on, in late 2006, my estimate at that time was that it was filtering out about 95% of the spam. 1,750 as a proportion of 46,000 is about 4%, so my data suggest that it's not become less effective over that time period.
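
The arithmetic behind these figures can be reproduced with a short sketch. The function name `greylisting_summary` and its result keys are made up for illustration; only the two counts come from the numbers above.

```python
def greylisting_summary(attempts: int, passed: int) -> dict:
    """Summarise greylisting from two monthly counts.

    attempts: delivery attempts seen by the mail server
    passed:   attempts that came back and got through greylisting
    """
    if attempts <= 0 or not 0 < passed <= attempts:
        raise ValueError("need 0 < passed <= attempts")
    stopped = attempts - passed
    return {
        "stopped": stopped,
        "passed_pct": 100.0 * passed / attempts,
        "stopped_pct": 100.0 * stopped / attempts,
        # How many times more mail the content filter would see
        # if greylisting were switched off.
        "load_factor": attempts / passed,
    }


# Rounded figures for July 2012 from the text above.
july_2012 = greylisting_summary(attempts=46_000, passed=1_750)
print(july_2012)
```

With the rounded July figures this gives roughly 3.8% passed and a load factor of about 26, close to the rough estimates given above.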

MadHatter
  • 2
    Exactly the kind of answer I was looking for. Thanks! – neu242 Oct 09 '12 at 11:55
  • Well, it's only a small mailserver, so the results might not be representative, but I'm glad a quantitative answer was what you wanted! – MadHatter Oct 09 '12 at 11:56
  • 3
    I think it makes a lot of sense to look at this quantitatively in your particular situation. I just checked, and my mail server sees very different figures: total for August and September, 460214 5xx rejects, 12331 4xx rejects and 22665 accepts. Thus, 4.6% accepted and only 2.6% of spam (at best) blocked by greylisting. The 5xx rejects are dominated by 8.4% unknown user and >90% RBL. (And I don't even run extremely aggressive RBLs. A completely overwhelming majority of the RBL blocks are [XBL](http://www.spamhaus.org/xbl/).) Then again, traffic caught by RBLs never makes it to greylisting. – user Oct 09 '12 at 14:36
  • 7
    Interesting, but I can't make a direct comparison because I won't, as a matter of principle, use any RBL as a brightline test for receipt; I only use them as contributors to a spamassassin score. I've been on RBLs myself too often, for completely bogus reasons, to entrust the operation of my own mail to someone else's rationale. If, however, we were to assume that all those XBL rejections are from fire-and-forget botnets, then if you greylisted first as I do, you'd see comparable percentages to me. – MadHatter Oct 09 '12 at 14:47
  • 1
    Yes, I have strongly considered changing it to being only a spam score contributor and relying on greylisting, for precisely the reason that you mention. However, that does not negate the point I was making, that different servers may see very different traffic patterns and the only way to really know if greylisting is effective is to look at it from the point of view of your particular setup. – user Oct 10 '12 at 12:20
  • 1
    I was going to disagree again, but really I completely agree with you. For all users, the best way to find out if it's an effective technique in your mail flow is to **try it in your mail flow and measure** - Michael speaks wisely! – MadHatter Oct 10 '12 at 12:25
12

Update 2019: Conditional Greylisting is the best trade-off

After having used greylisting on all mails for a long time on a busy mailserver, I stopped doing so. It delays ham mails unnecessarily. This is especially annoying for mails with account activation links. It creates severe problems with cloud mailers that you need to whitelist (details in the answer below). Another disadvantage of greylisting all mails before passing them to your spam filter is that a learning spam filter misses the opportunity to get to know all the easy spam that the greylisting filtered away.

Instead of greylisting all mail, I now use a spam filter (rspamd, I highly recommend it) that only greylists mails that have a spam score between clear ham and clear spam. This way, ham mails hardly get delayed. On the other hand, spam mails that hit greylisting often get detected as spam when the sending server tries the second time, because by then the spam is often known to blacklists (RBLs, URIBLs) and fuzzy filters and thus gets a higher score.

So I recommend greylisting only those mails for which it is unclear whether they are spam or ham. Time really helps the spam filter to figure out the unclear cases.
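
The decision logic described here can be sketched in a few lines of Python. This is not rspamd's configuration or API; the thresholds `GREYLIST_AT` and `REJECT_AT` and the function `decide` are hypothetical names and values chosen only to illustrate the idea.

```python
from enum import Enum


class Action(Enum):
    ACCEPT = "accept"
    GREYLIST = "greylist"  # temporary failure, the sender should retry later
    REJECT = "reject"


# Hypothetical thresholds, for illustration only.
GREYLIST_AT = 4.0
REJECT_AT = 15.0


def decide(score: float, seen_before: bool) -> Action:
    """Pick an action from a spam score.

    seen_before is True when this mail has already been deferred once
    and the sending server has now retried.
    """
    if score >= REJECT_AT:
        return Action.REJECT
    if score >= GREYLIST_AT and not seen_before:
        return Action.GREYLIST
    return Action.ACCEPT


# First attempt with an unclear score: the mail is deferred.
print(decide(6.0, seen_before=False))
# Retry after blacklists have caught up and raised the score: rejected.
print(decide(17.5, seen_before=True))
# Clear ham is accepted on the first attempt, with no delay.
print(decide(0.5, seen_before=False))
```

The point of the design is that only the middle band pays the delay, and the retry is scored again, so anything that has been listed in the meantime can still be rejected.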

Original answer from 2018:

I was always a great fan of greylisting. For these reasons:

  • It does not only mark spam, it blocks it.
  • It is legal to use as a service provider in Germany (unlike deleting spam mails after reception)
  • It is simple and effective.
  • It adds load to the spammer and not to your receiving mailserver. So even though the spammers may make it through your greylisting, you forced their machine to work harder and thus they can send less spam in total.
  • It blocks almost no legit mail, unlike IP-based RBLs etc.
  • It introduces delays, but you can whitelist clients (sending servers) of frequent contacts and whitelist recipients that really need email with minimum delay. Remember that using a spam filter like SpamAssassin directly on all your mail (without greylisting) can introduce delays to legit mail as well: if a spammer sends so many mails to your server that the spam filter gets overloaded, it will send a temporary failure (e.g. 451) to the sending servers of further incoming mails. This has the same effect as greylisting, i.e. mails get delayed, except that whitelisting is not that easy. Of course, you can use a cloud spam filter that scales to whatever power the spammer has, but that may be more expensive.
  • Limited or no maintenance required. No blacklists that change over time and need to be updated. No pattern-based rules that need to be updated.

But unfortunately, my stats show that this year greylisting is becoming less and less effective. The number of delayed messages is approaching the number of greylisted messages rather fast, which means the amount of blocked spam is shrinking.

In the last year (365 days), 55% of the greylisted messages made it eventually through greylisting, i.e. 45% got blocked.

mailgraph stats year

Note that this chart includes a timeframe in which, due to a configuration error in mailgraph, only delayed messages were counted and greylisted ones were not. This means the calculation overestimates the share of delayed messages a little; in fact, slightly more mails got blocked.

In the last month, 64% got delayed and only 36% got blocked.

mailgraph stats month

In the last week, 75% got delayed and only 25% blocked.

mailgraph stats week

Moreover, looking at the total amount of blocked messages: this month, greylisting blocked 4 411 messages, but Amavisd (SpamAssassin) blocked 22 763 messages. This means only 16% of the spam gets blocked by greylisting, all the rest by Amavisd.
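
As a quick check of that share, here is a minimal sketch. The helper `share_pct` is a made-up name; the two counts are the monthly figures quoted above.

```python
def share_pct(part: int, *others: int) -> float:
    """Return part as a percentage of part plus all other counts."""
    total = part + sum(others)
    if total == 0:
        raise ValueError("no messages counted")
    return 100.0 * part / total


greylisting_blocked = 4_411   # blocked by greylisting this month
amavisd_blocked = 22_763      # blocked by Amavisd this month

greylisting_share = share_pct(greylisting_blocked, amavisd_blocked)
print(f"greylisting: {greylisting_share:.1f}% of all blocked spam")
```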

Moreover, more and more cloud sending providers send from a pool of several hundred IP addresses and make each delivery attempt from a different IP. Thus, greylisting may hold up these mails for days. Therefore, you need to whitelist all the "good" mail providers, which introduces new maintenance effort.

I have always been a great fan of greylisting, but sadly, I see that it is becoming less and less effective, and I think that I will disable it soon, as it now mostly just delays 14% of my mails unnecessarily without blocking much spam.

The misleading stats

The amount of blocked mails in my (and your) stats may also be largely misleading. Let's take one email that is coming from a big cloud mail provider (like Microsoft's *.outbound.protection.outlook.com) that is not yet whitelisted. The first attempt fails. The second and third transmission attempts come from two other servers (IPs), so they fail again, as the triplet does not match. Now the fourth attempt comes from the first server again and succeeds. This will be counted as one delayed transmission and four greylisted messages. My calculations above would indicate that 1/4 = 25% of the greylisted messages were delayed and 3/4 = 75% were blocked. But in fact, not a single message was blocked. Now we whitelist the servers of these mail providers, so they will not be greylisted anymore. What will happen is that the amount of greylisted messages will go down more than the amount of delayed messages. This means that the amount of blocked messages we calculate will go down. But it is not true that fewer messages were blocked.
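
The four-attempt example can be replayed with a toy triplet greylist. This is a simplified sketch, not postgrey's or mailgraph's actual logic: the function `replay`, the addresses, and the documentation-range IPs are invented, the minimum retry delay is ignored, and counting every attempt as "greylisted" follows the paragraph above rather than any tool's documented behaviour.

```python
def replay(attempt_ips, sender="alice@example.org", recipient="bob@example.net"):
    """Replay the delivery attempts for ONE message against a triplet greylist.

    An attempt is deferred the first time its (ip, sender, recipient)
    triplet is seen and accepted once the same triplet shows up again.
    """
    seen = set()
    attempts = deferred = delivered = 0
    for ip in attempt_ips:
        attempts += 1
        triplet = (ip, sender, recipient)
        if triplet in seen:
            delivered = 1  # known triplet: the message gets through
            break
        seen.add(triplet)
        deferred += 1      # unknown triplet: temporary failure
    return {"attempts": attempts, "deferred": deferred, "delivered": delivered}


# One message retried from a rotating pool: servers 1, 2, 3, then 1 again.
result = replay(["192.0.2.1", "192.0.2.2", "192.0.2.3", "192.0.2.1"])

# Share that looks "blocked" when every attempt is counted as greylisted.
apparent_blocked_pct = (
    100.0 * (result["attempts"] - result["delivered"]) / result["attempts"]
)
# Messages that were really blocked: the single message was delivered.
truly_blocked = 1 - result["delivered"]

print(result, apparent_blocked_pct, truly_blocked)
```

The stats suggest three quarters of the traffic was stopped, while the number of messages actually kept out is zero.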

In fact, since February 2017 I have been adding more and more cloud mail providers to the whitelist to fight the problem of long delays due to greylisting. This may explain (partly?) why the amount of blocked mails that I calculate is going down rapidly. So maybe I thought all along that greylisting was blocking lots of spam, when the amount of spam actually blocked was a lot lower the whole time and was simply calculated incorrectly. So be careful when interpreting your stats.

Christopher K.
  • 1
    That's very interesting - thanks for posting the research! – Jenny D Aug 23 '18 at 09:53
  • +1 from me - time to re-examine my data. I agree that these bloody annoying people who bounce mail around their internal server estate, so that each attempt comes from a different server, will skew the data. I'm not sure I buy your last section, which seems to be arguing that all or most apparent benefits of greylisting are caused by over-counting inbound emails thereby. – MadHatter Aug 29 '18 at 15:44
  • Stats for my private mail server for the last year (as of July 2019) show that only 15% of the messages were delayed and 85% were blocked. I took a look at the blocked senders and they do look like spammers. However, I'm not greylisting everything, but only senders blacklisted by RBLs (zen.spamhaus.org, spam.dnsbl.sorbs.net and psbl.surriel.com). Well-configured greylisting still seems to be efficient. – michau Jul 22 '19 at 11:35
  • 1
    @michau Sure, greylisting suspicious mails is more efficient than greylisting everything. Rspamd is an interesting, rather new spam filter that does it rather well. It greylists mails that reach a low spam score. So like you, it would also greylist mails from RBL-listed senders (as long as the mail does not score high enough for a reject). But it would also greylist mails that match a few spam rules but don't score high enough for a reject. – Christopher K. Jul 22 '19 at 13:27
8

Spambots usually still don't do message queueing, but some of them just send the spam twice to every recipient with a few minutes' delay to defeat greylisting. Also, nowadays spam from spambots isn't the real problem anymore; spam from compromised Yahoo accounts etc. is much harder to catch.

From that point of view, greylisting is not as effective as it used to be. In combination with other anti-spam techniques it can still help. For example, if your domain is often in the "first batch" of spam campaigns, greylisting can delay the message long enough for domain/IP blacklists to catch up, so spam that would have slipped through your filters on the first connection attempt may get detected on the second attempt.

Gryphius
  • The vast majority of the spam delivery attempts I receive do come from spambots. I use other techniques to discourage spambots, and most give up before they can be greylisted. On my server, greylisting still blocks about half the senders it processes. I do exempt senders which can be determined to be extremely likely to pass greylisting. – BillThor Oct 10 '12 at 00:35
5

As a tangential issue, I don't like being in the position of having deployed a technique like greylisting without being able to measure its effectiveness. On Debian, with postfix as the MTA and postgrey as the greylisting policy engine, you can just apt-get install mailgraph to get a simple graph of accepted vs. rejected mail. Mailgraph is a bit old school and completely standalone, but it works, and its data or techniques could easily be integrated into a more complex modern monitoring system.

Paul Gear
3

Get a reputation-based mail filter. Greylisting is a bit old-school and isn't a comprehensive solution. There are workarounds (from the spammer's perspective), and unpredictable mail delivery times for your users...

Either outsource the filtering to a cloud service or buy an appliance that has access to such a list and has other methods of validating spam. My recommendation is usually Barracuda for their appliance or for their cloud filtering solution. Both options have economies of scale and mature heuristics that provide a cleaner overall solution.

Looking at one of my clients' Barracuda Spam Filter reports for September 2012: out of 98,457 messages, 1,623 were cut off before even hitting the mail server because of bad recipients, and 34,488 were blocked as spam. Only 96 questionable messages made it through. Those rated as spam were caught by a combination of reputation, score, intent, three RBLs, Bayesian filtering and custom rulesets. All in one unit, and all processed before hitting the relatively small mail server.

Also see: Fighting Spam - What can I do as an: Email Administrator, Domain Owner, or User?

ewwhite
  • 2
    Interesting, but you're not answering my questions regarding greylisting. And your stats without greylisting numbers aren't very relevant here :) – neu242 Oct 09 '12 at 12:51
  • @neu242 The point is that 1) greylisting has known disadvantages, 2) it cannot be considered a *whole* solution and 3) there are better ways of detecting spam, as the processes have evolved over the past few years. – ewwhite Oct 09 '12 at 12:57
  • 4
    Greylisting is of course just a part of my spam prevention toolkit. My setup is much like @MadHatter's. But since I asked specifically about greylisting, I sort of expected greylist specific answers. – neu242 Oct 09 '12 at 13:05
  • @neu242 That was not conveyed in the original question's language. – ewwhite Oct 09 '12 at 13:28
  • 1
    @ewwhite: actually, I don't see how it wasn't rather clear. For your reference: I did check the question history. Didn't see a change that affected this in any way. – Jürgen A. Erhard Dec 02 '12 at 01:31
  • 2
    @JürgenA.Erhard You're a little late to this. And your post is rude. Any professional spam filtering solution implemented *today* should not rely solely on greylisting. If you have any other concerns, see the canonical [Server Fault spam question here](http://serverfault.com/q/419407/13325). – ewwhite Dec 02 '12 at 01:35