15

Earlier today I received a notification of a security incident at Mandrill. At first I was concerned, but after digging into the details I became confused as to why they considered it noteworthy at all.

To summarize, it appears that Mandrill made some changes to their EC2 security groups, which resulted in some ports on their logging servers being accessible to the Internet for a few weeks. From their notification:

On March 10, we discovered evidence that automated attempts were made against Mandrill's internal logging servers in an effort to use them in a botnet. Analysis of the servers that were impacted, including network traffic logs and files present on the servers, indicates that these attempts were unsuccessful. There are no signs that the servers were targeted to access the data stored on them.

We investigated the issue and found that the opportunity for this attack stemmed from a firewall change we made on February 20 in order to more granularly control access to some of Mandrill's servers. Parts of Mandrill's infrastructure are hosted with Amazon Web Services (AWS), and we use EC2 Security Groups to control access. One change was made to a security group that contained more servers than we intended to affect. As a result, a cluster of servers hosting Mandrill's internal application logs was made publicly accessible instead of allowing internal-only access.
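For anyone unfamiliar with EC2 security groups, the kind of mistake they describe would look roughly like the sketch below. The group ID, port, and CIDR ranges are invented for illustration; nothing here comes from Mandrill's disclosure. The intended rule is scoped to the internal network, but the rule that actually gets applied is open to the world, on a group that happens to include the logging cluster.

    # Hypothetical sketch of an over-broad EC2 security group change (boto3).
    # Group ID, port, and CIDR ranges are invented for illustration only.
    import boto3

    ec2 = boto3.client("ec2", region_name="us-east-1")

    # Intended rule: allow the logging port only from the internal VPC range.
    internal_only = [{
        "IpProtocol": "tcp",
        "FromPort": 9200, "ToPort": 9200,
        "IpRanges": [{"CidrIp": "10.0.0.0/16"}],
    }]

    # Actual mistake: the same port opened to the entire Internet, applied to
    # a group that contains more servers than intended.
    world_open = [{
        "IpProtocol": "tcp",
        "FromPort": 9200, "ToPort": 9200,
        "IpRanges": [{"CidrIp": "0.0.0.0/0"}],
    }]

    ec2.authorize_security_group_ingress(
        GroupId="sg-0123456789abcdef0",  # happens to include the logging cluster
        IpPermissions=world_open,        # should have been internal_only
    )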

Now, we all deal with botnets all the time, and how to keep machines reasonably safe from them is pretty well known. Nearly everything I run is public facing, and I see millions of botnet attempts, none of which ever succeed at doing anything but generating log entries.

But because of this they are recommending that everyone invalidate and change all of their API keys. I personally have close to a dozen active API keys, and it will take me most of an afternoon to rotate them everywhere they are used.

My personal thought is that they have massively overreacted to this. But I don't know everything. Is there some reason that simply having open ports would be considered a potential compromise here?

Michael Hampton
  • It says that unsuccessful attempts were made... is this "recommendation" to change API keys actually just a recommendation or is it mandatory? – KnightHawk Mar 19 '15 at 04:37
  • @KnightHawk They have not mandated changing API keys, only recommended it. – Michael Hampton Mar 19 '15 at 12:05
  • I think that recommending it is a wise step. Letting people know about a possible threat is important and mandating changes should be reserved for times when the threat is either known to be real or at least is presumed to be very likely real. Recommending a cautious course of action in this case is the right thing; if nothing else it serves to make some people less complacent in the short term future. – KnightHawk Mar 19 '15 at 15:31
  • I get brute force attacks on my web servers constantly. Should I tell people to change their passwords every day, then? – Michael Hampton Mar 19 '15 at 15:36
  • Change the keys, then forget about it. You're spending more energy on this than it would take to change the keys. But you're forgetting one thing: how important is it to your company that these keys are secure? Can you take the risk? – SPRBRN Mar 19 '15 at 15:45
  • No. Just getting a brute force attack is not in itself a vulnerability, and frequent password changing is not the best method of security. – KnightHawk Mar 19 '15 at 15:50
  • The communication also says: "Although there's no evidence that your API key was exposed or accessed, we strongly recommend this as a precaution." Their recommendation is to change the keys, but they have no positive knowledge that you need to. You can choose to take on the risk of the unknown. – schroeder Mar 19 '15 at 17:51

1 Answer

16

From reading the disclosure, it seems what happened is that the open port wasn't just some random port: it was a port on which an internal service, never designed to be exposed directly to the outside world, was listening. If there is a vulnerability in that internal service, it might have allowed an attacker to obtain the API keys of users who used the service during the window in question.
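To make "publicly accessible" concrete: once a port like that is open, anyone on the Internet can confirm it with a trivial connection test. The host name and port below are purely hypothetical.

    # Minimal sketch: from any external machine, check whether a supposedly
    # internal port answers. Host and port are hypothetical examples.
    import socket

    def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
        try:
            with socket.create_connection((host, port), timeout=timeout):
                return True
        except OSError:
            return False

    if __name__ == "__main__":
        # If this prints True from the public Internet, the "internal" service
        # is reachable by anyone, including automated botnet scanners.
        print(port_is_open("logs.example.com", 9200))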

While they do note that the attack looks automated rather than targeted, keep in mind that sophisticated attackers often use a highly visible automated attack to divert the server's (and the sysadmin's) attention away from a covert, targeted attack happening at the same time, i.e. the big DDoS that draws all the attention is just a diversion.

A sophisticated attacker who managed to gain privileged access to the system could have erased their tracks to make it look as though no important data had been compromised. This is why they recommend changing your API keys, even though they found no evidence of the data actually being queried. Especially since the servers in question were logging servers, and sysadmins usually rely on logs to detect compromises, a compromised logging server can hide what is really happening.

A sophisticated attacker could also have left a backdoor or rootkit on the system so they could re-enter it at a later date, once everyone thought things had settled down, and then carry out their actual attack at a more leisurely pace from inside the security boundary, which could be highly dangerous. This is why they are decommissioning the machine involved.

Or it could be that they are doing this partly as a show of transparency for security-conscious potential customers who are on the fence about buying because of doubts about their security. It might nudge those customers toward taking up the service, since they can now be reassured that the company won't keep quiet when there are vulnerabilities that might affect them.

Is there some reason that simply having open ports would be considered a potential compromise here?

Yes, the open port exposes an internal service that wasn't designed for external use. Services intended for internal use often lack the security hardening that public-facing services get, because the assumption is that the service is only reachable from trusted machines. For example, an internal service might not use SSL to communicate within the trusted network, it might not validate its inputs as thoroughly because it assumes the trusted senders have already done any necessary validation, or it might have administrative commands that can be used without authentication because it assumes any necessary authentication has already been done by the public-facing services.
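As a purely hypothetical sketch of what such an internal-only service often looks like in practice: plain HTTP, no authentication, and an administrative endpoint that trusts anyone who can reach the port. That is tolerable behind a firewall and a serious exposure the moment a firewall rule changes.

    # Hypothetical internal-only logging service: no TLS, no authentication.
    # Fine when only trusted machines can reach it; dangerous on the Internet.
    from http.server import BaseHTTPRequestHandler, HTTPServer

    class InternalLogHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            # No authentication check: the service assumes only trusted
            # internal machines can ever reach this port.
            if self.path == "/logs/recent":
                body = b"...application log lines, possibly containing API keys..."
            elif self.path == "/admin/flush":
                body = b"log buffer flushed"  # administrative command, no auth
            else:
                self.send_response(404)
                self.end_headers()
                return
            self.send_response(200)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(body)

    if __name__ == "__main__":
        # Binding to 0.0.0.0 means a firewall rule is the only thing keeping
        # this "internal" service internal.
        HTTPServer(("0.0.0.0", 8080), InternalLogHandler).serve_forever()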

Lie Ryan