26

The devops guidelines at https://12factor.net/config suggest putting website secrets (database passwords, API keys, etc.) into environment variables. What advantages does that have over using text files (JSON, XML, YAML, INI, or similar) that are ignored by version control?

I find it much easier to copy a configuration file with secrets than to handle environment variables in .bash_profile and web server configuration. Am I missing something?

Aidas Bendoraitis
  • 1
    In theory it is easier to read a file than memory so you could consider the attack surface bigger and complexity smaller. – Florin Asăvoaie Jan 16 '18 at 16:40
  • My dev ops guy's rule of thumb is that storing settings in environment variables is best only to be done in docker-like environments. Outside of container VMs, he approves/prefers all other points of 12factor.net and the use of config files. None of us liked the insecure nature of environment variables on regular server deployments. – Corey Ogburn Jan 17 '18 at 02:18

6 Answers

22

The author lists their reasoning, although it's a bit disjointed. Their primary argument is that it's easy to accidentally check in a config file, and that config files have varying formats and may be scattered around the system (all three of which are at best mediocre arguments for security-related config like auth tokens and credentials).

Given my own experience, you've essentially got the following three options, with associated advantages and disadvantages:

Store the data in config files.

When taking this approach, you should ideally isolate them from the repository itself, and make sure they're outside of the area that the app stores its content in.

Advantages:

  • Very easy to isolate and control access to, especially if you're using things like SELinux or AppArmor to improve overall system security.
  • Generally easy to change for non-technical users (this is an advantage for published software, but not necessarily for software specific to your organization).
  • Easy to manage across large groups of servers. There are all kinds of tools for configuration deployment out there.
  • Reasonably easy to verify exactly which configuration is in use.
  • For a well written app, you can usually change the configuration without interrupting service by updating the config file and then sending a particular signal to the app (usually SIGHUP).

Disadvantages:

  • Proper planning is needed to keep the data secure.
  • You might have to learn differing formats (though these days there's only a handful to worry about, and they generally have similar syntax).
  • Exact storage locations may be hard-coded in the app, making deployment potentially problematic.
  • Parsing of the config files can be problematic.
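
As a rough illustration of the file-based approach above, here is a minimal Python sketch, assuming a JSON secrets file kept outside the repository. It refuses to load a file that group or other users can access, and it can re-read the file on SIGHUP without a restart. The names `load_secrets`, `ReloadableConfig`, and `install_reload_handler` are hypothetical, not part of any library, and the permission policy is only one possible choice.

```python
import json
import os
import signal
import stat


def load_secrets(path):
    """Load a JSON secrets file, refusing files that group or others can access."""
    mode = os.stat(path).st_mode
    if mode & (stat.S_IRWXG | stat.S_IRWXO):
        raise PermissionError(f"{path} must only be accessible by its owner")
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


class ReloadableConfig:
    """Holds the current settings and can re-read them from disk."""

    def __init__(self, path):
        self.path = path
        self.values = load_secrets(path)

    def reload(self, signum=None, frame=None):
        # The (signum, frame) parameters match what signal.signal() passes
        # to a handler, so this method can be registered directly.
        self.values = load_secrets(self.path)


def install_reload_handler(config):
    """Re-read the config file whenever the process receives SIGHUP (POSIX only)."""
    signal.signal(signal.SIGHUP, config.reload)
```

A long-running service would call `install_reload_handler(config)` once at startup and then read `config.values` whenever it needs a setting. Note that an exception raised during a reload (for example, from a malformed file) would surface in the main thread, so a real service would probably want to catch it and keep the old values.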

Store the data in environment variables.

Usually this is done by sourcing a list of environment variables and values from the startup script, but in some cases they might just be stated on the command line before the program name.

Advantages:

  • Compared to parsing a config file, pulling a value out of an environment variable is trivial in pretty much any programming language.
  • You don't have to worry as much about accidentally publishing the configuration.
  • You gain some degree of security by obscurity because this practice is uncommon, and most people who hack your app aren't going to think to look at environment variables right away.
  • Access can be controlled by the application itself (when it spawns child processes, it can easily scrub the environment to remove sensitive info).

Disadvantages:

  • On most UNIX systems, it's reasonably easy to get access to a process's environment variables. Some systems provide ways to mitigate this (the hidepid mount option for /proc on Linux, for example), but they aren't enabled by default, and don't protect against attacks from the user who owns the process.
  • It is non-trivial to see the exact settings something is using if you handle the above mentioned security issue correctly.
  • You have to trust the app to scrub the environment when it spawns child processes, otherwise it will leak information.
  • You can't easily change the configuration without a complete restart of the app.
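
To make the environment-variable trade-offs above concrete, here is a minimal Python sketch of reading a secret from the environment and scrubbing it before spawning a child process. The variable names in `SENSITIVE_NAMES` and the helper names are hypothetical; which variables count as sensitive is an assumption each application has to make for itself.

```python
import os
import subprocess

# Hypothetical names of the variables this application treats as secrets.
SENSITIVE_NAMES = frozenset({"DB_PASSWORD", "API_KEY"})


def get_secret(name):
    """Return a required secret from the environment, failing loudly if unset."""
    try:
        return os.environ[name]
    except KeyError:
        raise RuntimeError(f"required environment variable {name} is not set") from None


def scrubbed_environment(sensitive=SENSITIVE_NAMES):
    """Copy the current environment, leaving out the sensitive variables."""
    return {key: value for key, value in os.environ.items() if key not in sensitive}


def run_child(argv):
    """Run a child process that does not inherit the sensitive variables."""
    return subprocess.run(
        argv,
        env=scrubbed_environment(),
        capture_output=True,
        text=True,
        check=True,
    )
```

Without the explicit `env=` argument, `subprocess.run` hands the child a full copy of the parent's environment, which is the leak described in the disadvantages above.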

Use command-line arguments to pass in the data.

Seriously, avoid this at all costs: it's not secure, and it's a pain in the arse to maintain.

Advantages:

  • Even simpler to parse than environment variables in most languages.
  • Child processes don't automatically inherit the data.
  • Provides an easy way to quickly test out particular configurations when developing the application.

Disadvantages:

  • Just like environment variables, it's easy to read another process's command-line on most systems.
  • Extremely tedious to update the configuration.
  • Puts a hard limit on how long the configuration can be (sometimes as low as 1024 characters).
Austin Hemmelgarn
  • 1
    One not unimportant point is unattended (re)boot of a server, without manually having to give any passwords, in the end they are somewhere on the disk for that – PlasmaHH Jan 16 '18 at 20:52
  • 7
    *On most UNIX systems, you can read pretty much any processes environment variables without needing any significant privileges.* -- Can you expand on that? The /proc/####/environ file is only readable by the owner, so you'd need to be root or have sudo. – rrauenza Jan 16 '18 at 21:40
  • I think some of this env config trend also came about from things like docker where you use a standard container and configure it by passing env variables to the container. – rrauenza Jan 16 '18 at 21:41
  • @rrauenza Ownership of a process isn't a significant privilege unless you do a very good job of segregating things by account, and you actually only need the CAP_SYS_ADMIN capability (which root implicitly has) if you're not the owner. Also, regarding the environment variable thing, you're probably right, but it's a marginal design even with Docker. – Austin Hemmelgarn Jan 16 '18 at 21:43
  • I wasn't using docker to imply it was a good practice -- just where I think it gained momentum. – rrauenza Jan 16 '18 at 21:55
  • On Linux the owner of a process can ptrace it, which is effectively the same as full control over the process. – Kevin Jan 18 '18 at 04:01
  • 3
    I agree with the point @rrauenza makes. The answer is pretty great all around, but I'd like clarification on how exactly *you can read pretty much any processes environment variables without needing any significant privileges*. Regarding "*and you actually only need the CAP_SYS_ADMIN capability (which root implicitly has)*..." well, if a malicious agent has root privileges, further discussion is redundant, and CAP_SYS_ADMIN might as well be root privilege (see http://man7.org/linux/man-pages/man7/capabilities.7.html, *CAP_SYS_ADMIN* and *Notes to kernel developers*) – Nubarke Jan 18 '18 at 09:53
13

Environment variables will be inherited by every child process of the web server. That's every session that connects to the server, and every program spawned by them. The secrets will be automatically revealed to all of those processes.

If you keep secrets in text files, they have to be readable by the server process, and so potentially by every child process too. But at least the programs have to go and find them; they're not automatically provided. You might also be able to make some child processes run under different accounts, and make the secrets readable only by those accounts. For example, suEXEC does this in Apache.

Andrew Schulman
  • 1
    "That's every session that connects to the server" is a misleading statement. You can't open an http session to the server and get access to its environment variables, nor can you log into a shell on that server and get them unless you have root access or own the web server process. – Segfault Jan 17 '18 at 06:40
  • Every process spawned by the web server inherits its environment, unless you take active steps otherwise. An HTML page doesn't have the capability to use that information, but a script does. – Andrew Schulman Jan 17 '18 at 13:52
  • While correct, this answer could do with some corrections/concessions, especially with regard to the term *sessions*. On first read, it seems to paint the use of environment variables in a bad light, almost suggesting possibilities of information disclosure to an external client. Also, a concession comparable to suexec can be made for limited setting of env-vars, e.g. setting per-process env-vars (a la `MYVAR=foo /path/to/some/executable`) limits propagation to a process and its children only, and where needed master daemons can scrub/reset/modify the environment of child processes. – shalomb Jan 17 '18 at 20:41
2

Even if there are some security related trade offs to be made when it comes to environment variables or files, I don't think security was the main driving force for this recommendation. Remember the authors of 12factor.net are also (or were also?) developers of the Heroku PaaS. Getting everyone to use environment variables probably simplified their development quite a bit. There's so much variety in different config files formats and locations and it would have been difficult for them to support them all. Environment variables are easy in comparison.

It doesn't take much imagination to guess at some of the conversations that were had.

Developer A: "Ah this secret config file UI is too cluttered! Do we really need to have a drop down that switches between json, xml, and csv?"

Developer B: "Oh, life would be so grand if only everyone used environment variables for the app config."

Developer A: "Actually there are some plausible security-related reasons to do that. Environment variables probably won't get accidentally checked into source control."

Developer B: "Don't you set the environment variables with a script that launches the daemon, or a config file?"

Developer A: "Not in Heroku! We'll make them type them into the UI."

Developer B: "Oh look, my domain name alert for 12factor.net just went off." [1]

[1]: source: made up.

Segfault
1

TL;DR

There are a number of reasons for using environment variables instead of configuration files, but two of the most commonly overlooked are the utility value of out-of-band configuration and enhanced separation between servers, applications, or organizational roles. Rather than present an exhaustive list of all possible reasons, I address just these two topics in my answer, and touch lightly on their security implications.

Out-of-Band Configuration: Separating Secrets from Source Code

If you store all your secrets in a configuration file, you have to distribute those secrets to each server. That either means checking the secrets into revision control alongside your code, or having an entirely separate repository or distribution mechanism for the secrets.

Encrypting your secrets doesn't really help solve for this. All that does is push the issue to one remove, because now you have to worry about key management and distribution, too!

In short, environment variables are an approach to moving per-server or per-application data out of source code when you want to separate development from operations. This is especially important if you have published source code!

Enhance Separation: Servers, Applications, and Roles

While you could certainly have a configuration file to hold your secrets, if you store the secrets in source code you have a specificity problem. Do you have a separate branch or repository for each set of secrets? How do you ensure the right set of secrets gets to the right servers? Or do you reduce security by having "secrets" that are the same everywhere (or readable everywhere, if you have them all in one file), and therefore constitute a bigger risk if any one system's security controls fail?

If you want to have unique secrets on each server, or for each application, environment variables do away with the problem of having to manage a multitude of files. If you add a new server, application, or role, you don't have to create new files or update old ones: you just update the environment of the system in question.

Parting Thoughts on Security

While a thorough exploration of kernel/memory/file security is out of scope for this answer, it's worth pointing out that properly-implemented, per-system environment variables are no less secure than "encrypted" secrets. In either case, the target system still has to hold the decrypted secret in memory at some point in order to use it.

It's also worth pointing out that when values are stored in volatile memory on a given node, there's no on-disk file that can be copied and attacked offline. This is generally considered an advantage of in-memory secrets, but it's certainly not conclusive.

The issue of environment variables vs. other secrets-management techniques is really more about security and usability trade-offs than it is about absolutes. Your mileage may vary.

CodeGnome
  • 2
    This isn't convincing, because all of the downsides you mention for configuration files also apply for environment variables. **Environment variables *are* configuration data.** They don't magically set themselves. They have to be distributed to each system, and some sort of *configuration* mechanism must be used to set them. – jpaugh Jan 17 '18 at 18:00
  • @jpaugh You're making a straw man argument and attacking something I never said. The issues I address are out-of-band configuration and data separation. As clearly explained, you can do these things in any way you like. If you prefer, you can post your secrets alongside your code publicly on GitHub, but that certainly seems unwise in the general case. However, only *you* can determine the trade-offs necessary for *your* system to operate properly within a given threat model. – CodeGnome Jan 17 '18 at 18:29
  • 2
    All of your points are correct, except that it applies to environment variables as much as to any other configuration data. If you store environment variables in files, then you can commit them; and if you send them out-of-band, it is easier to do in a file than by typing them out. But if you prefer to type them, why not type out a JSON object instead, and read it on stdin? That's actually more secure than the command line. – jpaugh Jan 17 '18 at 18:34
1

Personally, I wouldn't recommend setting environment variables in .bashrc, as these become visible to all processes started by the shell. Instead, set them at the daemon/supervisor level (init/rc script, systemd config) so that their scope is limited to where they are needed.

Where separate teams manage operations, environment variables provide an easy interface for operations to set the environment for the application without having to know about the configuration files/formats and/or to resort to mangling of their content. This is especially true in multi-language/multi-framework settings where the operations teams can choose the deployment system (OS, supervisor processes) based on operational needs (deployment ease, scalability, security, etc).

Another consideration is CI/CD pipelines: as code goes through different environments (i.e. dev, test/QA, staging, production), the environmental particulars (deployment zones, database connection particulars, credentials, IP addresses, domain names, etc.) are best set by dedicated configuration management tools/frameworks and consumed by the application processes from the environment (in a DRY, write once, run anywhere fashion). Traditionally, where developers manage these operational concerns, they tend to check in configuration files or templates beside the code, and then end up adding workarounds and other complexity when operational requirements change (e.g. new environments/deployments/sites come along, scalability/security concerns weigh in, or multiple feature branches appear, and suddenly there are hand-rolled deployment scripts to manage/mangle the many configuration profiles). This complexity is a distraction and an overhead best managed outside of code by dedicated tools.

  • Env-vars simplify configuration/complexity at scale.
  • Env-vars place operational configuration squarely with the team responsible for the non-code related aspects of the application in a uniform (if not standard) non-binding way.
  • Env-vars support swapping out the master/supervisor processes (e.g. god, monit, supervisord, sysvinit, systemd, etc) that back the application - and certainly even the deployment system (OSes, container images, etc) or so on as operational requirements evolve/change. While every language framework nowadays has some process runtime of sorts, these tend to be operationally inferior, suited more for dev environments and/or increase complexity in multi-language/multi-framework production environments.

For production, I favour setting the application env-vars in an EnvironmentFile such as /etc/default/myapplication.conf that is deployed by configuration management and set readable only by root, such that systemd (or anything else for that matter) can spawn the application under a dedicated deprivileged system user in a private group. Backed with dedicated user groups for ops and sudo, these files are unreadable by world by default. This is 12factor compliant, supports all the goodness of Dev+Ops, and has all the benefits of decent security, while still allowing developers/testers to drop in their own EnvironmentFiles in the dev/qa/test environments.
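
For the dev/qa/test case mentioned above, where no supervisor injects the variables, a developer could load a drop-in file by hand. The following Python sketch assumes a simplified `KEY=VALUE` format with `#` comments and optional matching quotes; it does not implement the full EnvironmentFile syntax that systemd accepts, and the function names are hypothetical.

```python
import os


def parse_env_file(text):
    """Parse simple KEY=VALUE lines, skipping blank lines and '#' comments."""
    values = {}
    for number, raw_line in enumerate(text.splitlines(), start=1):
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        key, separator, value = line.partition("=")
        if not separator or not key.strip():
            # Report only the line number so a secret never ends up in a log.
            raise ValueError(f"line {number}: expected KEY=VALUE")
        value = value.strip()
        # Drop one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        values[key.strip()] = value
    return values


def load_env_file(path, environ=os.environ):
    """Add values from the file to environ without overriding existing entries."""
    with open(path, encoding="utf-8") as handle:
        values = parse_env_file(handle.read())
    for key, value in values.items():
        environ.setdefault(key, value)
    return values
```

Using `setdefault` means that anything already set by the supervisor or the shell takes precedence over the drop-in file, which is a design choice of this sketch rather than a requirement.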

shalomb
0

From a developer's perspective, storing config data in environment variables simplifies deployments among different environments - development, QA, and production - and frees developers from having to worry about deploying the wrong config file.

Azure web apps provide the option to use this pattern and it works very well.

In addition to that, it keeps that potentially sensitive data out of source control. Ignoring those files from source control isn't really feasible (at least in .NET) because a lot of necessary boilerplate configuration is also present in those files.

Derek Gusoff