48

I get the impression that it is a programming best practice to create variables in specific scopes (like a function scope) and avoid global scope to make things more modular and better organized. However I'm not sure if there is also a security issue.

Here is an example of global variables in Bash that has worked fine for me for more than a year:

cat <<-EOF >> "$HOME"/.profile
    set -x
    complete -r
    export war="/var/www/html" # Web Application Root;
    export dmp="phpminiadmin" # Database Management Program;

    export -f war
    war() {
        cd $war/
    }
EOF

source "$HOME"/.profile 2>/dev/null

I have never had a problem with global variables in Bash or JavaScript, most likely because I only wrote small scripts for personal usage on minimalist environments.

Why do many programmers avoid using global variables and are there any examples of security breaches caused by using global variables?

  • 80
    *"Why many programmers abstain from using global variables?"* - because it is much easier to understand and verify small code snippets which have no side effects. When using global variables you always have to be aware which part of the code might change it in what way and what the effect will be - which is really hard with a larger code base and more than trivial global variables without obvious behavior (i.e. some global debug variable might be fine). See also [Global Variables Are Bad](http://wiki.c2.com/?GlobalVariablesAreBad). – Steffen Ullrich Sep 02 '19 at 19:15
  • Also threading issues otherwise concurrency issues ensue – makerofthings7 Sep 02 '19 at 20:28
  • 2
    Global variables are more common than you think. Java static fields are effectively globals and widely used. They are fine for constants, or things just assigned during initialization. The problems come with mutable globals. – paj28 Sep 02 '19 at 21:57
  • 16
    A more general idea of why developers hate global variables is because they can cause "action at a distance" https://en.wikipedia.org/wiki/Action_at_a_distance_(computer_programming) – Steve Sether Sep 03 '19 at 03:17
  • 15
    There is really nothing wrong with global variables if you're only ever writing "small" scripts. It's when you build medium to large scripts or programs that they become a nightmare. – whatsisname Sep 03 '19 at 04:05
  • 2
    Related on Software Engineering: [Why is Global State so Evil?](https://softwareengineering.stackexchange.com/questions/148108/why-is-global-state-so-evil) A number of the issues mentioned there could also lead to security problems. – NotThatGuy Sep 03 '19 at 12:44
  • @whatsisname Even with small scripts it can be a problem with multiple developers. If they both decide to use the same name for a global variable, but it has a different purpose, then you'll run into problems. – Anthony Grist Sep 03 '19 at 13:25
  • I think an 8-line shell script (which after generation has even fewer variables and uses system-maintained variables only) with no concurrency is not the best example for discussing software engineering concepts that exist mostly to make a large modular codebase easier to understand, especially not if it only contains immutable variable assignments (constants), and especially since in this case it can basically be replaced by a single-line alias. – eckes Sep 04 '19 at 05:35
  • 1
    As you mention Javascript, it is a special case because global variables are visible and modifiable through the dev tools console – Kaddath Sep 04 '19 at 07:21
  • Tangentially, your missing quoting in the `war` function definition is a security antipattern, too. See [Security implications of forgetting to quote a variable in bash/POSIX shells](https://unix.stackexchange.com/questions/171346/security-implications-of-forgetting-to-quote-a-variable-in-bash-posix-shells) on [unix.se] – tripleee Sep 18 '19 at 17:23

10 Answers

79

Boycott Globals!

I'm stealing from Steffen Ullrich's comment, but the main issue with global variables is that they make it difficult to keep code well organized and maintainable. His link is a fine one, but you won't have any trouble finding countless articles about the problems with global variables online.

When you use global variables, it becomes easy to lose track of where in your program the variable gets modified, especially if you don't have a simple linear flow. As a result global variables can work perfectly fine in small scripts, but can cause massive headaches as an application begins to scale.

The main reason why I would avoid globals is because they make automated unit/integration testing a nightmare. Small applications can survive without tests, but trying to manage a larger application without good tests is just a nightmare (trust me, I tried in my young-and-foolish days).

This might leave you with the impression that globals are fine in very small applications, but since applications usually only grow over time, and things that start off temporary become permanent, it's really just a bad idea to use them at all. Why start on the wrong foot, when it is so easy to use properly scoped variables?

Security

Using global variables doesn't have any direct implications for security, but they do make it easier to end up with security issues, because they disconnect the source of data from its usage. I have even seen actual vulnerabilities introduced in such a way:

Imagine you have a variable which is used in an SQL query. Initially you set it from a safe value and so inject it directly into the query. However, it is a global (or becomes a global!) and later on use-cases change and it gets set from user input. The developer who sets it from user input doesn't realize that it is injected directly into a query (perhaps because it happens in a completely different file and only in a specific flow) and so doesn't bother with strict input checking, nor do they update the query it is used in for more secure usage. All of a sudden a very hard-to-find vulnerability has been introduced, because global variables hid the connection between the source and usage of a variable!
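
A minimal sketch of that scenario, in Python with an in-memory SQLite database. The table, the `current_name` global and both function names are made up for illustration; they are not from any real codebase:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 0), ("root", 1)])

# Hypothetical global: originally only ever assigned this hard-coded value.
current_name = "alice"

def find_user_unsafe():
    # Reads the global and pastes it straight into the SQL text.
    query = "SELECT name FROM users WHERE name = '" + current_name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(name):
    # The value arrives as an argument and is bound as a query parameter.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

safe_before = find_user_unsafe()        # only alice's row

# Later, "in another file", someone starts setting the global from user input.
user_input = "x' OR '1'='1"
current_name = user_input
leaked = find_user_unsafe()             # now matches every row in the table
not_leaked = find_user_safe(user_input) # matches nothing
```

Nothing in `find_user_unsafe` changed between the two calls, which is exactly why this kind of bug is hard to spot in review.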

Global Variables == death

I don't know of any breaches that happened specifically because of global variables, but it's easy to argue that the use of global variables has literally killed people, so I think it's reasonable to just never use them.

Conor Mancone
  • Comments are not for extended discussion; this conversation has been [moved to chat](https://chat.stackexchange.com/rooms/98286/discussion-on-answer-by-conor-mancone-global-variables-and-information-security). – Rory Alsop Sep 04 '19 at 16:14
  • It may also be worth noting that many languages don't actually provide proper facilities to make global variables safe, leaving the burden on the programmer; in particular, there's usually no good way to limit what can actually access and/or modify them. Might also be worth looking at pseudo-globals like contexts and most singletons, too, although they tend to be at least somewhat safer. – Justin Time - Reinstate Monica Sep 04 '19 at 17:14
  • 2
    Good ol' `Robert'); DROP TABLE students;--` aka Bobby Tables – Monty Harder Sep 04 '19 at 20:53
  • 1
    Conor, `If a programmer modifies a function in file A but doesn't know or forgets that the value for a certain variable in that function comes from user input in file B, it might end up doing insecure things on user input without proper safe guards.` please consider changing the phrasing a bit / expanding so it would be more clear; I'm not sure I never worked on a similar case so I don't know what modification caused what harm. –  Sep 11 '19 at 20:43
  • This is well explained, I get the idea now. In a big projects, variable names may repeatedly used even though these have different meanings. – Jones G Jan 28 '21 at 15:53
10

In some programming environments, you can't trust the global scope since other sources could read / mess with it.

This has already led to some critical vulnerabilities in security sensitive products, like a remote code execution in the LastPass browser extension.

Benoit Esnard
9

They can give code injections easier access to everything

In PHP there is the $GLOBALS superglobal; in Python there is the globals() function, and in JavaScript there is the window (or in Node, global) object. These all make it trivial to enumerate all global variables.

Once you do that you can both suck out the information they contain and change them maliciously. For example, if sensitive data is stored in a global, an attacker could easily extract it. If, say, a database connection URL is stored in a global, an attacker could point the URL to refer to the attacker's server instead, and then the victim would attempt to connect to that server and send credentials.
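
As a rough illustration in Python (the two settings and the `injected_code` function are hypothetical stand-ins, and the URLs are placeholders): any code running inside the module can list and overwrite its globals through the dictionary that `globals()` returns.

```python
# Hypothetical module-level globals, as a small script might define them.
DB_URL = "postgres://app:s3cret@db.internal/app"
API_KEY = "not-a-real-key"

def injected_code():
    # Stand-in for attacker-controlled code that ended up running in this module.
    # globals() returns the module namespace as an ordinary, writable dict.
    stolen = {k: v for k, v in globals().items() if k.isupper()}
    globals()["DB_URL"] = "postgres://app:s3cret@attacker.example/app"
    return stolen

stolen = injected_code()
# stolen now holds both settings, and DB_URL points at the attacker's host.
```

Had the URL and key lived only in a local variable of the function that opens the connection, the injected code would have needed a much more specific foothold to reach them.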

Also, if you have a lot of global objects then you can call their methods with ease.

They violate "principle of least privilege"

This principle of security engineering basically says "don't give someone more access than they need in order to do their job". Every function and method can read and write global variables even if they are totally irrelevant to its job, which means that if that piece of code is hijacked even in some limited way, it will open up a greater attack surface.

(Of course, most scripting languages have weak encapsulation rules which is also a violation of POLP, but if objects are out of scope then it is a lot harder to do anything to them.)

They are a breeding ground for buggy code

For reasons including:

  • the programmer forgetting that one of the variables in a function is a global and thus changes made to it will persist when the function ends,
  • the non-obvious sensitivity of the program to the order in which functions are called
  • the ease with which cascade failures can happen (e.g. when a function sets a global to null and later another one crashes)
  • the highly complex interactions which can easily occur and are hard to reason about
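
A small Python sketch of the order-sensitivity and cascade-failure points in the list above. The `current_user` global and the three functions are invented for the example:

```python
# Hypothetical global shared by every function below.
current_user = None

def login(name):
    global current_user
    current_user = {"name": name}

def logout():
    global current_user
    current_user = None          # silently affects every later caller

def greeting():
    # Works only if login() ran more recently than logout().
    return "Hello, " + current_user["name"]

login("alice")
first = greeting()               # fine: login() came first

logout()
try:
    greeting()                   # same function, same arguments...
    crashed = False
except TypeError:                # ...but None is not subscriptable
    crashed = True
```

The crash surfaces in `greeting`, while the cause sits in `logout`, which may live in a different file.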

They are simply untidy

If you think about a business where there are papers strewn everywhere, things aren't organised and filed away neatly but just lying around, who's going to notice if something goes missing? If an employee commits fraud nobody will spot the discrepancy because discrepancies are everywhere. If a pair of keys goes missing, someone will just assume they're buried under something or someone had to borrow them.

Now you might say this is an exaggeration, but if so I doubt you've ever worked with a big software project. Globals are OK if your website is small enough to build in a couple of days (and you know what you're doing). They can save time to get something going.

But they don't scale.

Artelius
  • Hello, I like this answer in general; I didn't say what you wrote is an exaggeration; the "big software project" part seems to me redundant and better to be deleted because there is no explanation to what you mean in general or specifically. Thanks, –  Sep 03 '19 at 22:22
  • Hello again, you are welcome to read my bounty message - if you want. Thanks, –  Sep 04 '19 at 23:45
  • @JohnDoea Don't forget that this site is also about helping other people who _in future_ may have basically the same question, and they may benefit from different parts of an answer. – Artelius Sep 05 '19 at 00:34
  • Hi - humbly I don't; I only disagree with some of the last passage or its phrasing; besides what I said about defining a "big" project; globals which are usally aren't okay, might be rarely okay not just for building small sites but for "operating-programming languages" such as Bash (also, a site builder can work slow to build a small site - if exploring a brand new CMS). Regards. –  Sep 05 '19 at 10:56
  • This answer seems like the best one in addressing the specific question about "do global variables cause security problems". I think you should accept it. – O'Rooney Sep 10 '19 at 23:50
6

Please compare the following pieces of code:

1)

/file.php:
    $newconn = new PDO('mysql:host=localhost;charset=utf8mb4;','username','password');

2)

../secrets.php:
    $mysqlusername = 'username';
    $mysqlpassword = 'password';

/file.php:
    require_once('../secrets.php');
    $newconn = new PDO('mysql:host=localhost;charset=utf8mb4;',$mysqlusername,$mysqlpassword);

3)

../secrets.php:
    function pdoconn($i) {
        $mysqlusername = ['username1','username2'];
        $mysqlpassword = ['password1','password2'];
        $conn = new PDO('mysql:host=localhost;charset=utf8mb4;',$mysqlusername[$i],$mysqlpassword[$i]);
        return $conn;
    }

/file.php:
    require_once('../secrets.php');
    $newconn = pdoconn(0);

Example 1 is out of the question - incorrect configuration on production servers could end up showing sensitive parameters to unintended parties.

Example 2 is better, but those variables are available throughout the application, and modifiable, which could result in errors.

Example 3 keeps things very organised and transferable.
It can be modified to use globals if ../secrets.php instead was:

../secrets.php:
    $mysqlusername = 'username';
    $mysqlpassword = 'password';
    function pdoconn() {
        $conn = new PDO('mysql:host=localhost;charset=utf8mb4;',$GLOBALS['mysqlusername'],$GLOBALS['mysqlpassword']);
        return $conn;
    }

And I think that demonstrates why a global doesn't make sense most of the time in quite a succinct way.


Summary:

As for security breaches using global variables (and why I wrote these examples in PHP), there was a change in PHP 4.2.0, quite controversial at the time, where the default for register_globals was switched from on to off. I can't find any articles now, as this change was made in 2002, but I do seem to remember it being responsible for a few breaches at the time. Copying directly from the manual, there is a very clear example of vulnerable code:

<?php
// define $authorized = true only if user is authenticated
if (authenticated_user()) {
    $authorized = true;
}

// Because we didn't first initialize $authorized as false, this might be
// defined through register_globals, like from GET auth.php?authorized=1
// So, anyone can be seen as authenticated!
if ($authorized) {
    include "/highly/sensitive/data.php";
}
?>
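
For readers who don't use PHP, here is a rough Python analogue of the same mistake. It only imitates the idea behind register_globals by seeding a dictionary of variables from the request; the function names and the `request_params` shape are assumptions made for this sketch:

```python
def is_authorized_vulnerable(request_params, user_is_authenticated):
    # Imitate register_globals: request parameters become the starting variables.
    variables = dict(request_params)
    if user_is_authenticated:
        variables["authorized"] = True
    # "authorized" was never initialised, so the request may have supplied it.
    return bool(variables.get("authorized"))

def is_authorized_fixed(request_params, user_is_authenticated):
    variables = dict(request_params)
    variables["authorized"] = False      # initialise before use
    if user_is_authenticated:
        variables["authorized"] = True
    return variables["authorized"]

forged = {"authorized": "1"}             # like GET auth.php?authorized=1
vulnerable_result = is_authorized_vulnerable(forged, user_is_authenticated=False)
fixed_result = is_authorized_fixed(forged, user_is_authenticated=False)
```

With the forged parameter, the first function reports the unauthenticated visitor as authorized and the second does not.
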
LTPCGO
  • 2
    I'd say #3 is also bad, because the password is in the source code! You still have all the same issues with accidental exposure. Great example at the end though! – Anders Sep 03 '19 at 07:40
  • 1
    @Anders maybe it's not clear, where I have to keep the file on the same server, I use a function similar to this in place of just having a variable with the password stored in a file, since when that is included it is normally available globally. Essentially you keep it the same as what I think you're suggesting, but instead of referring to a variable when it's required you instead call a function to return the object required. – LTPCGO Sep 03 '19 at 07:52
  • 1
    My point is that you should not keep passwords in the source code at all, even if it is in a function. – Anders Sep 03 '19 at 08:07
  • I'm saying it's not in the source code though, it would sit in a file somewhere on the server that you include_once(). Unless you're talking about not keeping the password on the server at all and instead accessing credentials via an API, which I 100% agree with and aim to do in every situation that it can possibly be done. – LTPCGO Sep 03 '19 at 08:26
  • `Example 1 examples how incorrect configuration on production servers could end up showing values of sensitive parameters to unintended parties.` Do you mean to case that both values would appear in Query Strings? –  Sep 03 '19 at 15:32
  • @JohnDoea in some scripting languages, if the script fails on a certain line that line is sent to stdout to the user. If the line contains a password, the password will be sent to stdout. – LTPCGO Sep 03 '19 at 21:23
4

Because the use of global variables gives you a bunch of problems, and not really any benefits.

Globals make your code hard to test

While not directly related to security, unit tests are vital to development. And in order to properly unit-test something, you need to be able to control the exact context in which code is executed. Look at the two pieces of code:

With Globals

public Money CalculateExpenses(int departmentId)
{
    Department department = (from d in global_db.Departments
                             where d.Id == departmentId
                             select d).Single();

    return (from ex in department.Expenses
            select ex).Sum();
}

Without Globals

public Money CalculateExpenses(int departmentId, Database database)
{
    Department department = (from d in database.Departments
                             where d.Id == departmentId
                             select d).Single();

    return (from ex in department.Expenses
            select ex).Sum();
}

You might say that the code looks almost identical, except for where the data is actually coming from. You might even think that the code with Globals is "cleaner", because you don't have to pass the database every time.

Until you have to write a unit test. Because then you will have to insert some logic to not connect to the production database, but rather some local, perhaps test-specific database. Now the non-Global code looks much better, because you can just pass the database you would like to use to the code.
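
To make that concrete, here is a minimal sketch of such a test, written in Python rather than the C# above. `FakeDatabase` and `expenses_for` are hypothetical names standing in for whatever database interface the real code uses:

```python
class FakeDatabase:
    """Hypothetical in-memory stand-in for a real database connection."""

    def __init__(self, expenses_by_department):
        self._expenses = expenses_by_department

    def expenses_for(self, department_id):
        return self._expenses[department_id]

def calculate_expenses(department_id, database):
    # All data access goes through the argument, so the caller decides
    # whether this talks to production or to a test double.
    return sum(database.expenses_for(department_id))

def test_calculate_expenses():
    fake = FakeDatabase({7: [100, 250, 50]})
    assert calculate_expenses(7, fake) == 400

test_calculate_expenses()
```

With the global version, the same test would first have to swap out the global connection and then remember to restore it afterwards.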

In Summary: Code without Globals is easier to test.

Globals change behavior even if you don't change the code

If your code has around 200 lines of code, then the usage of Globals doesn't seem so bad. In fact, it may seem like a perfectly valid thing to do if you re-use the same piece of code all over the place (e.g. some random data source, logging, etc.)

However, once your codebase grows enough, the introduction of globals can be absolutely deadly. You simply lose control over where your data is from, and who has access to it.

Let's use an example in everyone's favorite language: PHP 5. Example by Eevee.

    @fopen('http://example.com/not-existing-file', 'r');

What does this code do? You don't know, because it depends on global behavior. If PHP was compiled with --disable-url-fopen-wrapper, then it will not work. Same for the global configuration allow_url_fopen. We don't know what it will do.

The @ at the beginning will disable error messages, unless scream.enabled is set in the global PHP config. It will also not be printed if the global error_reporting level is not set. Where exactly it will be printed to, if it prints, will depend on the global display_errors.

So as you saw, the behavior of one innocuous line depends on 5 different global variables. Now imagine this scenario in a code base with one million lines, and 100 different programmers, all across different teams, and you will quickly see why the use of Globals is abhorrent and leads to all sorts of problems.
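
The same effect is easy to reproduce outside PHP. As a small sketch, Python's standard `decimal` module keeps its precision in an ambient context, so an unchanged function returns different results once some other code changes that setting (`one_third` is just an illustrative name):

```python
from decimal import Decimal, getcontext

def one_third():
    # The result depends on the ambient decimal context, not on any argument.
    return Decimal(1) / Decimal(3)

getcontext().prec = 28       # the module's documented default precision
default_result = one_third()

getcontext().prec = 3        # some unrelated code changes the shared setting
changed_result = one_third() # Decimal('0.333')

getcontext().prec = 28       # restore the default
```

Nothing about `one_third` changed between the two calls; only the shared setting did.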

Imagine going to work one day and your code suddenly breaks, even though you haven't touched it. That's the magic of Globals.

There are more examples of why the use of Globals is bad, which can be found in the other answers.

3

I have never had a problem with global variables in Bash or JavaScript, most likely because I only wrote small scripts for personal usage on minimalist environments.

Why do many programmers avoid using global variables and are there any examples of security breaches caused by using global variables?

In tiny projects global variables are just fine. Most of their problems don't show up. And the above Bash environment customization is indeed tiny.

Global variables start causing problems as your program scales. That can happen in a bunch of ways, but the core of the problem is the amount of state that understanding each piece of code requires.

Imagine a 10 million line code base. In it, there are 1 million variables. All of them are global.
In a given 100 line piece of code, you might use 20 variables.
Whenever you call a function, you have no idea which of those 20 variables that function will modify; when your function starts, you have no idea where the variable state came from.
So to understand what your 100 lines of code mean, you need to either understand and hold in your head all 10 million lines of code, or you need some kind of convention about where data is coming from, what data can and cannot be modified when you call a helper function, etc.

In contrast, if your code is almost entirely fueled by function arguments, local variables and return types, you only have to understand your function to understand what it does. Moreover, if that function calls another function, you know what information you are passing to it, and what information it passes back.
You might not know what it does with that information, but you know what it doesn't do.
For example, it cannot modify integer x and you are certain, because you didn't pass x to the function, nor did you assign to it from its return value!

Making code easier to reason about locally is a very important goal. You want to be able to read a function, know what it is doing, and identify bugs.

Writing code is far far easier and far far less important than making code that is clear and easy to reason about.
Almost all code in a large, persistent project gets read 100s of times more often than it is modified, and it is written and designed once.
From this rule -- make stuff easier to reason about locally -- we reach the rule of "avoid global state".

Global variable versus Super-object

Global variables that are read/written are an example of global state. But so is a super-object that everything in your project receives a pointer to as its first argument.

A super-object still has advantages over global variables; global variables often have confusing mechanics on how they are initialized and cleaned up (I'm looking at you, C/C++), and if you have a super-object with all of your "global" state you can instantiate two versions of your super-object and run both at the same time in the same process (this will tend not to work, but that is usually because of some implicit global state the OS foists on you).
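
A minimal sketch of the "two versions at the same time" point, in Python. `AppState` and `handle_request` are hypothetical names; the only claim is that state held in an object can be instantiated more than once, which module-level globals cannot:

```python
from dataclasses import dataclass, field

@dataclass
class AppState:
    """Hypothetical super-object holding what would otherwise be globals."""
    log: list = field(default_factory=list)
    counter: int = 0

def handle_request(state, name):
    # Everything this function touches is reachable only through 'state'.
    state.counter += 1
    state.log.append(f"{state.counter}: {name}")

# Two independent copies of the "global" state in one process.
a = AppState()
b = AppState()
handle_request(a, "first")
handle_request(a, "second")
handle_request(b, "only")
```

Here `a` and `b` never see each other's counter or log, which is what makes it possible to run a test instance next to a live one.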

Every global variable you read, instead of a parameter, means that any bit of code, anywhere could be modifying its behavior. Every global variable you write to, means you are changing the behavior of some code arbitrarily far away.

And it isn't just humans who find global state a pain; if your state is local, writing test harnesses that "fake" a global state or environment becomes far easier. If there is (say) a global "mouse" object, writing a unit test that creates a fake mouse that pretends to do certain movements becomes harder. It may even be harder to write a macro that plays back a recorded mouse movement into some code, especially if you intend to do it while the UI remains responsive to the actual human using a mouse.

Security

Security is a function of understanding what your code does. Security, within a program, means "your code only does what it is intended to be allowed to do"; with global variables, you have a weaker ability to understand what the code does, thus less ability to control what it does.

Yakk
2

Buffer Overflow Attacks

While fairly language-specific (C/C++ come to mind) and difficult to pull off, they are a potential attack.

Computerphile has a great video about it that I wholeheartedly recommend, but here's a short version of the mechanics of these attacks and how they interact with global variables.

A buffer overflow occurs when data written to a buffer also corrupts data values in memory addresses adjacent to the destination buffer due to insufficient bounds checking. This can occur when copying data from one buffer to another without first checking that the data fits within the destination buffer. (Wikipedia)

In the case of global variables, your program's memory layout is fairly static since (in most languages) they will have their memory segment assigned at program start. This gives the attacker consistency: It is much easier to target the memory zone of a different area when you're always shooting from the same spot (in case of globals) than when you're always on the move between shots.

This is not specific to global variables, but holding meaningful information in such structures facilitates (not to be confused with enabling!) these kinds of attacks.

Beware!

While global variables potentially present a slight security risk, if you fear buffer overflow attacks, your go-to defense mechanism should be checking input size, not refactoring code to get rid of global variables.

schroeder
  • 2
    I don't see what this has to do specifically with global variables. Can you please elaborate? – Kami Kaze Sep 03 '19 at 13:34
  • These attacks are not directly caused by global variables per se, but they are easier to pull off with global variables, whose memory slot is assigned at program start, than when variables are more narrowly scoped, which makes their assigned memory slots less predictable. I am a new contributor, so I don't know whether I should embed that in the answer as it varies based on language. I also do not want to give anyone the idea that the go-to way to protect an app from a buffer overflow attack is variable scoping as opposed to input size validation. – Mister Amen Sep 04 '19 at 11:54
  • no need to summarise your edits in your post – schroeder Sep 04 '19 at 12:12
  • You have not made your case for the connection between globals and BO. – schroeder Sep 04 '19 at 12:16
1

From a security perspective, the problem with global variables is that they can create unexpected behavior and break locality. The more general idea behind this is called "action at a distance", where one part of a program can have an unexpected effect on another.

Since locality is broken, instead of having to understand just one part of the program, you and anyone who works with the program now have to understand every part of the program that might involve your global variable. This massively increases the complexity of the design, and in general, complex designs are more likely to be buggy, and thus insecure. As the computer scientist Edsger Dijkstra once said, "The competent programmer is fully aware of the strictly limited size of his own skull".

Programs tend to grow with time in size and scope. Even though your simple script, created in isolation now only does one thing with no modularity, it may be re-used somewhere else in the future where you might have forgotten about the nasty implications of the global variables, or worse, you're gone and nobody even knows it contains global variables. Global variables are like leaving holes in the ground, or rakes facing the wrong way ready to be stepped on, or nails in your driveway.

They are, in many ways, an imminent danger, and while sometimes necessary, should be treated as such.

Steve Sether
0
Why do many programmers avoid using global variables and are there any examples of security breaches caused by using global variables?

Because it's easier to modify code that doesn't use global variables to run in unintended applications, applications which the original developer didn't foresee. Really, if you depend on global state, you can have at most one such state in an application without intensive refactoring (which would refactor away the global state).

Such unintended applications might include making the code run in multi-threaded environments.

However, what many people fail to realize: malloc() is global!

The same people who advocate against using global variables are all the time using a global allocator state.

This means that if a piece of your code allocates lots of memory, making malloc() request more memory from the operating system, the memory blocks get mixed up with those allocated from other pieces of your code. When that same piece of code later frees all of the memory it was using, the free() calls don't return memory to the operating system because of fragmentation. The end result is that your program ends up using more memory than it should.

So, the very people who advocate against using global variables, are in fact using an allocator with a global state.

I wonder why none of the standards by the major standardization organizations (C89/C99/C11/C18 by ISO, POSIX by IEEE) have defined an allocator that allows you to have multiple allocator states, i.e. independent heaps.

juhist
  • You are raising a valid point with `malloc`. I think those languages with `malloc`, e.g. C or C++, are unsafe anyway. Any pointer arithmetic, which might look genuine on the surface, might in fact contain a bug and cause undefined behavior across the whole program. I conjecture sandboxing applications such as browsers do in fact use multiple heaps -- if not simply due to multiple processes. – ComFreek Sep 04 '19 at 17:51
0

If just about every programming language could declare a variable globally but accessible only within the scope of its implementation code, this would be a non-issue. There wouldn't be much reason to use global variables.

Yet, this isn't the case, which is why when I've found myself writing my projects in C & not in C++, where it is naturally supported as a static variable at the outermost clause of a class definition, I've had to make do with global variables.

Dehbop