12

I manage more than 90 servers with 3 devops via Ansible. Everything is working great, but there is a giant security problem right now. Each devop uses their own local SSH key to access the servers directly. Each devop uses a laptop, and any of those laptops could be compromised, opening the entire network of prod servers up to an attack.

I am looking for a solution to centrally manage access, so that access can be blocked for any given key, similar to how keys are managed in Bitbucket or GitHub.

Off the top of my head, I would assume the solution would be a tunnel from one machine, the gateway, to the desired prod server. While passing through the gateway, the request would pick up a new key and use it to gain access to the prod server. The result would be that we can kill access for any devop within seconds by just denying access to the gateway.

Is this good logic? Has anyone seen a solution out there already to thwart this problem?

JonathanDavidArndt
John

6 Answers

24

That's too complicated (checking if a key has access to a specific prod server). Use the gateway server as a jump host that accepts every valid key (removing a key there removes its access to all servers in turn), and then add only the allowed keys to each respective server. After that, make sure the SSH port of every server is reachable only via the jump host.

This is the standard approach.
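
As a rough illustration of the "remove one key at the jump host" step, here is a minimal Python sketch that filters an `authorized_keys` file's content. The function name `revoke_key` and the sample entries are made up for the example, and it assumes the key blob shows up as its own whitespace-separated field on the line, which is how plain entries are laid out.

```python
def revoke_key(authorized_keys_text: str, key_blob: str) -> str:
    """Return authorized_keys content without any entry that carries key_blob."""
    kept = []
    for line in authorized_keys_text.splitlines():
        stripped = line.strip()
        # Comments and blank lines are preserved. An entry is dropped when one
        # of its whitespace-separated fields equals the base64 key blob.
        if stripped and not stripped.startswith("#") and key_blob in stripped.split():
            continue
        kept.append(line)
    return "\n".join(kept) + ("\n" if kept else "")


# Hypothetical jump-host file with two engineers (key blobs shortened).
gateway_keys = (
    "ssh-ed25519 AAAAC3alice alice@laptop\n"
    "ssh-ed25519 AAAAC3bob bob@laptop\n"
)
print(revoke_key(gateway_keys, "AAAAC3bob"))
```

Because every prod server only accepts SSH from the jump host, writing the filtered content back on that one machine is enough to cut the key off from all of them.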

Sven
  • 2
    Even better: do what @Sven says but also add 2FA at the jump host. Because you're only connecting directly from the laptop when you need to do something manually, right? Anything automated is running from a server behind the jump host? – Adam Mar 18 '18 at 20:26
  • 1
    If you have a local certificate authority (subordinate or isolated), you can use those certificates with SSH, allowing you to centrally invalidate a certificate believed to be compromised. – Randall Mar 19 '18 at 09:27
11

Engineers should not be running Ansible directly from their laptops, unless this is a dev/test environment.

Instead, have a central server that pulls the runbooks from git. This allows for additional controls (four eyes, code review).

Combine this with a bastion or jump-host to restrict access further.
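
A minimal sketch of what such a central runner might do, assuming a checkout under `/srv/ansible` and a playbook called `site.yml` (both hypothetical). It only builds the two commands; `git -C <dir> pull --ff-only` and `ansible-playbook -i <inventory> <playbook>` are the standard invocations.

```python
from typing import List


def build_run_commands(repo_dir: str, playbook: str, inventory: str) -> List[List[str]]:
    """Commands a central runner would execute, in order."""
    return [
        # Only fast-forward to what has been merged (and reviewed) upstream.
        ["git", "-C", repo_dir, "pull", "--ff-only"],
        # Run the playbook from the reviewed checkout, never from a laptop.
        ["ansible-playbook", "-i", inventory, playbook],
    ]


# Hypothetical paths. On the central server each command would be passed to
# subprocess.run(cmd, cwd=repo_dir, check=True) so a failed pull aborts the run.
for cmd in build_run_commands("/srv/ansible", "site.yml", "inventories/prod"):
    print(" ".join(cmd))
```

The point of the design is that the only way to change what runs against prod is a reviewed merge into the repository.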

Henk Langeveld
2

Check out the open source CLD software, which solves that problem: https://github.com/classicdevops/cld

Your engineers will be able to access any server according to an access matrix, and it also provides 2FA by IP address as an option.

2

Netflix implemented your setup and released some free software to help with that situation.

See this video https://www.oreilly.com/learning/how-netflix-gives-all-its-engineers-ssh-access or this presentation at https://speakerdeck.com/rlewis/how-netflix-gives-all-its-engineers-ssh-access-to-instances-running-in-production with the core point:

We’ll review our SSH bastion architecture, which at its core uses SSO to authenticate engineers, and then issues per user credentials with short lived certificates for SSH authentication of the bastion to an instance. These short lived credentials reduce the risk associated with them being lost. We’ll cover how this approach allows us to audit and automatically alert after the fact, instead of slowing down engineers before granting access.

Their software is available here: https://github.com/Netflix/bless

Some interesting takeaways even if you do not implement their whole solution:

  • they use SSH certificates instead of just keys; you can put far more metadata in the certificate, which enables many constraints per requirement and also allows simpler audits
  • they use very short certificate validity (around 5 minutes); SSH sessions stay open even after the certificate expires
  • they use 2FA, which also makes scripting difficult and forces developers to find other solutions
  • a specific submodule, outside of their infrastructure and properly secured through the security mechanisms offered by the cloud where it runs, generates certificates dynamically so that each developer can access any host
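
To make the short-lived certificate idea concrete, here is a hedged sketch using plain `ssh-keygen` signing rather than BLESS itself (BLESS has its own interface, which is not shown here). The helper name `build_sign_command`, the file paths and the principal are assumptions for the example; the `-s`, `-I`, `-n` and `-V` flags are standard `ssh-keygen` options.

```python
from typing import List, Sequence


def build_sign_command(ca_key: str, user_pubkey: str, key_id: str,
                       principals: Sequence[str], validity: str = "+5m") -> List[str]:
    """Build an ssh-keygen call that signs user_pubkey with the CA key."""
    return [
        "ssh-keygen",
        "-s", ca_key,                 # CA private key used for signing
        "-I", key_id,                 # identity recorded in server logs for audits
        "-n", ",".join(principals),   # login names the certificate is valid for
        "-V", validity,               # "+5m" means valid from now for five minutes
        user_pubkey,                  # the certificate is written next to this file
    ]


# Hypothetical paths and names, for illustration only.
cmd = build_sign_command("/etc/ssh/ca/user_ca", "id_ed25519.pub", "alice@example", ["deploy"])
print(" ".join(cmd))
```

Servers then trust the CA through `TrustedUserCAKeys` in `sshd_config` instead of per-user `authorized_keys` entries, so a leaked certificate stops working for new logins once its few minutes are up.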
Patrick Mevzek
1

OneIdentity (ex-Balabit) SPS is the exact thing you need in this scenario. With this appliance you can manage user identities on basically any machine, track user behavior, monitor and alert, and index whatever the users are doing for later review.

random
0

My suggestion is to disallow SSH access from user machines.

Instead you should

  1. Host playbooks in Git.
  2. Turn the "Access server" into a Jenkins server.
  3. Grant only needed Jenkins access to devops users.
  4. Execute Ansible plays on the Jenkins server through build jobs triggered via HTTP.
  5. As an additional security measure, disable the Jenkins CLI if needed.

Sample execution models:

  1. Jenkins Ansible plugin: https://wiki.jenkins.io/display/JENKINS/Ansible+Plugin

OR

  1. Classic "Execute shell" type of job. Add your build steps manually, including the git checkout.
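
For step 4, triggering the job over HTTP could look roughly like the sketch below. It assumes a parameterized job named `deploy-ansible` with `PLAYBOOK` and `LIMIT` parameters and a per-user API token (all hypothetical); `/job/<name>/buildWithParameters` with HTTP Basic auth is Jenkins' standard remote trigger endpoint. The request is only built here, not sent.

```python
import base64
import urllib.parse
import urllib.request


def build_trigger_request(base_url: str, job: str, user: str,
                          api_token: str, params: dict) -> urllib.request.Request:
    """Build (but do not send) a POST that starts a parameterized Jenkins job."""
    query = urllib.parse.urlencode(params)
    url = f"{base_url.rstrip('/')}/job/{urllib.parse.quote(job)}/buildWithParameters?{query}"
    # Jenkins accepts the user name plus a per-user API token as Basic auth.
    credentials = base64.b64encode(f"{user}:{api_token}".encode()).decode()
    request = urllib.request.Request(url, method="POST")
    request.add_header("Authorization", f"Basic {credentials}")
    return request


# Hypothetical server, job, user and parameters.
req = build_trigger_request(
    "https://jenkins.example.internal", "deploy-ansible",
    "alice", "s3cret", {"PLAYBOOK": "site.yml", "LIMIT": "web"},
)
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```

Revoking a devop then comes down to disabling their Jenkins account or API token, since they never hold an SSH key to the servers.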

If you are limited in server resources, the same Jenkins server can host Git (scm-manager) as well, although there is an additional security risk if one of the developer machines is infected. You may be able to mitigate this by disconnecting the Jenkins server from the internet and resolving Ansible dependencies locally.