163

This question appeared on a pre-interview quiz and it's making me crazy. Can anyone answer this and put me at ease? The quiz makes no reference to a particular shell, but the job description is for a Unix SA. Again, the question is simply...

What does 'set -e' do, and why might it be considered dangerous?

voretaq7
egorgry

8 Answers

173

set -e causes the shell to exit if any subcommand or pipeline returns a non-zero status.
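As a quick illustration of that behaviour, here is a minimal sketch. Each case runs in its own subshell so the demo itself keeps going; the variable names are only for illustration.

```bash
#!/usr/bin/env bash
# Each case runs in its own subshell so the demo itself is not terminated.

# Without set -e: the failing command is ignored and the next line runs.
without_e=$( (false; echo "still running"); echo "status=$?" )

# With set -e: the subshell exits at the failing command with its status.
with_e=$( (set -e; false; echo "still running"); echo "status=$?" )

printf 'without set -e:\n%s\n' "$without_e"
printf 'with set -e:\n%s\n' "$with_e"
```

The second case never reaches its echo, and the subshell's exit status is that of the failing command.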

The answer the interviewer was probably looking for is:

It would be dangerous to use "set -e" when creating init.d scripts:

From http://www.debian.org/doc/debian-policy/ch-opersys.html 9.3.2 --

Be careful of using set -e in init.d scripts. Writing correct init.d scripts requires accepting various error exit statuses when daemons are already running or already stopped without aborting the init.d script, and common init.d function libraries are not safe to call with set -e in effect. For init.d scripts, it's often easier to not use set -e and instead check the result of each command separately.
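To show what "check the result of each command separately" can look like, here is a rough sketch. stop_daemon is a hypothetical stand-in for whatever helper an init script would call, and the meaning given to status 1 ("was not running") is an assumption made for this example, not something taken from the policy text.

```bash
#!/usr/bin/env bash
# Hypothetical helper: pretend the daemon was already stopped.
# Assumption for this sketch: status 1 means "nothing to stop".
stop_daemon() { return 1; }

do_stop() {
    stop_daemon
    status=$?
    case "$status" in
        0) echo "stopped" ;;
        1) echo "already stopped" ;;    # acceptable outcome for an init script
        *) echo "failed"; return "$status" ;;
    esac
    return 0
}

result=$(do_stop)
stop_status=$?
echo "$result (exit status $stop_status)"
```

With set -e in effect, the bare stop_daemon call would have ended the script before the case statement could decide that status 1 is fine.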

This is a valid question from an interviewer's standpoint because it gauges a candidate's working knowledge of server-level scripting and automation.

voretaq7
Rich
  • good point. it would halt the boot process over something that might be an error technically, but shouldn't cause the whole script to stop – hookenz Sep 19 '12 at 02:29
  • Though I agree with https://stackoverflow.com/questions/78497/design-patterns-or-best-practices-for-shell-scripts/739034#739034, I think this style finds a good fit with bash for short init checks and logging or other environment monkey patching before execing the target workload. We use this same policy for our docker image init scripts. – joshperry Jun 06 '17 at 03:52
38

From bash(1):

          -e      Exit immediately if a pipeline (which may consist  of  a
                  single  simple command),  a subshell command enclosed in
                  parentheses, or one of the commands executed as part  of
                  a  command  list  enclosed  by braces (see SHELL GRAMMAR
                  above) exits with a non-zero status.  The shell does not
                  exit  if  the  command that fails is part of the command
                  list immediately following a  while  or  until  keyword,
                  part  of  the  test  following  the  if or elif reserved
                  words, part of any command executed in a && or  ||  list
                  except  the  command  following  the final && or ||, any
                  command in a pipeline but the last, or if the  command’s
                  return  value  is being inverted with !.  A trap on ERR,
                  if set, is executed before the shell exits.  This option
                  applies to the shell environment and each subshell envi-
                  ronment separately (see  COMMAND  EXECUTION  ENVIRONMENT
                  above), and may cause subshells to exit before executing
                  all the commands in the subshell.

Unfortunately I'm not creative enough to think of why it would be dangerous, other than "the rest of the script won't get executed" or "it might possibly perhaps mask real problems".

Ignacio Vazquez-Abrams
  • 1
    mask real/other problems, yeah, i guess i could see that... i was struggling to come up with an example of it being dangerous, as well... – cpbills May 19 '10 at 16:48
  • There's the manpage for `set`! I always end up at `builtin(1)`. Thanks. – Jared Beck May 29 '13 at 23:27
  • how would I know that the documentation for `set` is to be found in `man 1 bash`?? and how would anybody be able to find the `-e` option for the `set` keyword in a manual page which is so enormously big? you can't just `/-e` search here – phil294 Jun 05 '17 at 06:33
  • @Blauhirn using `help set` – phil294 Mar 13 '18 at 22:04
23

It should be noted that set -e can be turned on and off for various sections of a script. It doesn't have to be on for the whole script's execution. It could even be conditionally enabled. That said, I don't ever use it since I do my own error handling (or not).

some code
set -e     # same as set -o errexit
more code  # exit on non-zero status
set +e     # same as set +o errexit
more code  # no exit on non-zero status
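A runnable version of that outline might look like the sketch below. It runs inside a subshell so the final failure only ends the demo, not the calling shell.

```bash
#!/usr/bin/env bash
out=$(
    (
        set -e                      # same as: set -o errexit
        set +e                      # switch it off around a step that may fail
        false
        echo "false was tolerated"
        set -e                      # switch it back on
        false                       # this one ends the subshell
        echo "never printed"
    )
    echo "status=$?"
)
echo "$out"
```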

Also noteworthy is this from the Bash man page section on the trap command which also describes how set -e functions under certain circumstances.

The ERR trap is not executed if the failed command is part of the command list immediately following a while or until keyword, part of the test in an if statement, part of a command executed in a && or || list, or if the command's return value is being inverted via !. These are the same conditions obeyed by the errexit option.

So there are some conditions under which a non-zero status will not cause an exit.
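To make those exemptions concrete, here is a small sketch in which every line contains a failing command, yet the script runs to the end with set -e active.

```bash
#!/usr/bin/env bash
out=$(
    set -e
    if false; then echo "unreachable"; fi   # failure in an if test
    echo "after if"
    while false; do :; done                 # failure in a while condition
    echo "after while"
    false && echo "unreachable"             # failure before the final &&
    echo "after &&"
    false || echo "fallback ran"            # failure before the final ||
    ! true                                  # status inverted with !
    echo "after !"
    false | true                            # failure that is not last in a pipeline
    echo "after pipeline"
)
status=$?
echo "$out"
echo "exit status: $status"
```

None of these failures triggers an exit, which is exactly what makes it easy to assume protection that is not there.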

I think the danger is in not understanding when set -e comes into play and when it doesn't and relying on it incorrectly under some invalid assumption.

Please also see BashFAQ/105, "Why doesn't set -e (or set -o errexit, or trap ERR) do what I expected?"

Dennis Williamson
  • While deviating from the question a bit, this answer is exactly what I am looking for: turning it off for just a small part of a script! – Stephan Bijzitter Nov 09 '17 at 10:18
15

Keep in mind this is a quiz for a job interview. The questions may have been written by the current staff, and they may be wrong. This isn't necessarily bad: everyone makes mistakes, and interview questions often sit in a dark corner without review, only coming out during an interview.

It's entirely possible that 'set -e' does nothing that we would consider "dangerous". But the author of that question may mistakenly believe that 'set -e' is dangerous, due to their own ignorance or bias. Maybe they wrote a buggy shell script, it bombed horribly, and they mistakenly thought that 'set -e' was to blame, when in fact they neglected to write proper error checking.

I've participated in probably 40 job interviews over the last 2 years, and the interviewers sometimes ask questions which are wrong, or have answers which are wrong.

Or maybe it's a trick question, which would be lame, but not entirely unexpected.

Or maybe this is a good explanation: http://www.mail-archive.com/debian-bugs-dist@lists.debian.org/msg473314.html

Stefan Lasiewski
  • 1
    +1 i'm in the same boat, and a lot of the 'technical' interviews i've been dealing with have been at least slightly misguided. – cpbills May 19 '10 at 17:43
  • 1
    wow. Good find on the debian list. +1 for the ignorance of interview questions. I remember arguing a netback answer once since I was 100% sure I was right. They said nope. I went home and googled it, I was right. – egorgry May 19 '10 at 18:48
13

set -e terminates the script if a nonzero exit code is encountered, except under certain conditions. To sum up the dangers of its usage in a few words: it doesn't behave how people think it does.

In my opinion, it should be regarded as a dangerous hack which continues to exist for compatibility purposes only. The set -e statement does not turn shell from a language that uses error codes into a language that uses exception-like control flow; it merely makes a poor attempt to emulate that behaviour.

Greg Wooledge has a lot to say on the dangers of set -e:

In the second link, there are various examples of the unintuitive and unpredictable behaviour of set -e.

Some examples of the unintuitive behaviour of set -e (some taken from the wiki link above):

  • set -e
    x=0
    let x++
    echo "x is $x"
    The above will cause the shell script to exit prematurely, because the expression x++ evaluates to 0 (the value of x before the increment), which the let builtin treats as a falsy value and turns into a nonzero exit code. set -e notices this, and silently terminates the script.
  • set -e
    [ -d /opt/foo ] && echo "Warning: foo is already installed. Will overwrite." >&2
    echo "Installing foo..."
    

    The above works as expected, printing a warning if /opt/foo exists already.

    set -e
    check_previous_install() {
        [ -d /opt/foo ] && echo "Warning: foo is already installed. Will overwrite." >&2
    }
    
    check_previous_install
    echo "Installing foo..."
    

    The above, despite the only difference being that a single line has been refactored into a function, will terminate if /opt/foo does not exist. This is because the fact that it worked originally is a special exception to set -e's behaviour. When a fails in a && b, the nonzero status of the list is ignored by set -e. However, now that it's a function, the exit code of the function is equal to the exit code of that list, and the function returning nonzero will silently terminate the script.

  • set -e
    IFS=$'\n' read -d '' -r -a config_vars < config
    

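    The above will read the array config_vars from the file config. As the author might intend, it terminates with an error if config is missing. As the author might not intend, it also terminates silently when config is present: with -d '' read keeps reading until a NUL byte, reaches end-of-file first and returns nonzero, even though config_vars has been filled by then. (With the default delimiter, as in read -r line < config, the same happens whenever config does not end in a newline.) If set -e were not used here, then config_vars would contain all lines of the file whether or not it ended in a newline.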

    Users of Sublime Text (and other text editors which handle newlines incorrectly), beware.

  • set -e
    
    should_audit_user() {
        local group groups="$(groups "$1")"
        for group in $groups; do
            if [ "$group" = audit ]; then return 0; fi
        done
        return 1
    }
    
    if should_audit_user "$user"; then
        logger 'Blah'
    fi
    

    The author here might reasonably expect that if for some reason the user $user does not exist, then the groups command will fail and the script will terminate instead of letting the user perform some task unaudited. However, in this case the set -e termination never takes effect. If $user cannot be found for some reason, instead of terminating the script, the should_audit_user function will just return incorrect data as if set -e was not in effect.

    This applies to any function invoked from the condition part of an if statement, no matter how deeply nested, no matter where it is defined, no matter even if you run set -e inside it again. Using if at any point completely disables the effect of set -e until the condition block is fully executed. If the author is not aware of this pitfall, or does not know their entire call stack in all possible situations in which a function can be called, then they will write buggy code and the false sense of security provided by set -e will be at least partially to blame.

    Even if the author is fully aware of this pitfall, the workaround is to write code in the same way as one would write it without set -e, effectively rendering that switch less than useless; not only does the author have to write manual error handling code as if set -e were not in effect, but the presence of set -e may have fooled them into thinking that they do not have to.
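For what it's worth, here is a sketch of how the examples above can be written with explicit checks, so that they behave the same whether or not set -e is active. lookup_groups is a hypothetical stand-in for the groups command (so the sketch does not depend on real accounts), and the user names, the path and the status code 2 are assumptions made for illustration.

```bash
#!/usr/bin/env bash
# The pitfall, isolated in a subshell: x++ evaluates to 0, so let returns 1.
pitfall=$( (set -e; x=0; let x++; echo "x is $x"); echo "status=$?" )
echo "let x++ under set -e: $pitfall"

set -e

# 1. An assignment with arithmetic expansion does not turn the value into a status.
x=0
x=$((x + 1))
echo "x is $x"

# 2. A full if statement returns 0 when the test is false, unlike `test && cmd`.
check_previous_install() {
    if [ -d "$1" ]; then
        echo "Warning: $1 already exists. Will overwrite." >&2
    fi
}
check_previous_install "/nonexistent/demo/path"   # hypothetical path
echo "Installing foo..."

# 3. errexit is ignored inside an if condition, so check the lookup by hand.
#    Hypothetical stand-in for `groups "$user"`: only "alice" is known.
lookup_groups() {
    case "$1" in
        alice) echo "staff audit" ;;
        *)     return 1 ;;
    esac
}

should_audit_user() {
    local group groups
    # Declared separately: `local var=$(cmd)` would report local's status, not cmd's.
    groups=$(lookup_groups "$1") || return 2
    for group in $groups; do
        if [ "$group" = audit ]; then return 0; fi
    done
    return 1
}

audit_status=0
should_audit_user "mallory" || audit_status=$?
case "$audit_status" in
    0) verdict="audit" ;;
    1) verdict="no audit needed" ;;
    *) verdict="lookup failed, refusing to continue" ;;
esac
echo "$verdict"
```

The caller can now tell "not in the audit group" apart from "could not look the user up", which set -e alone never surfaced.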

Some further drawbacks of set -e:

  • It encourages sloppy code. Error handlers are completely forgotten about, in the hopes that whatever failed will report the error in some sensible way. However, with examples like let x++ above, this is not the case. If the script dies unexpectedly, it is usually silently, which hinders debugging. If the script does not die and you expected it to (see previous bullet point), then you have a more subtle and insidious bug on your hands.
  • It leads people into a false sense of security. See again the if-condition bullet point.
  • The places where the shell terminates are not consistent between shells or shell versions. It is possible to accidentally write a script which behaves differently on an older version of bash due to the behaviour of set -e having been tweaked between those versions.
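On the silent-exit point, the ERR trap mentioned in the bash man page can at least make the script say where it died. A minimal sketch, assuming bash (the ERR trap is not available in every shell) and with a message format of my own choosing:

```bash
#!/usr/bin/env bash
out=$(
    exec 2>&1          # fold stderr into the captured output for this demo
    set -e
    # Report the failing status and line before errexit ends the shell.
    trap 'echo "error: a command exited with status $? near line $LINENO" >&2' ERR
    echo "step 1"
    false
    echo "never printed"
)
status=$?
echo "$out"
echo "exit status: $status"
```

This does not help with the cases where set -e is ignored, since the ERR trap obeys the same exemptions.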

set -e is a contentious issue, and some people aware of the issues surrounding it recommend against it, while some others just recommend taking care while it is active to know the pitfalls. There are many shell scripting newbies who recommend set -e on all scripts as a catch-all for error conditions, but in real life it does not work that way.

set -e is no substitute for education.

Score_Under
9

set -e tells bash, in a script, to exit whenever anything returns a non-zero return value.

I could see how that would be annoying and buggy. I'm not sure about dangerous, unless you had opened up permissions on something and your script died before you could restrict them again.
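One way to guard against that particular scenario is an EXIT trap that puts the permissions back no matter how the script ends. This is only a sketch: it uses a temporary file as a stand-in for whatever the real script would open up, and the modes 600 and 644 are arbitrary choices.

```bash
#!/usr/bin/env bash
workfile=$(mktemp)          # stand-in for the file whose permissions get opened up
chmod 600 "$workfile"

log=$(
    set -e
    # Runs when this subshell exits, including an exit caused by set -e.
    trap 'chmod 600 "$workfile"; echo "permissions restored"' EXIT
    chmod 644 "$workfile"
    echo "permissions opened"
    false                   # simulated failure while permissions are open
    echo "never reached"
)
perms=$(ls -l "$workfile" | cut -c1-10)
rm -f "$workfile"

echo "$log"
echo "final mode: $perms"
```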

cpbills
  • That's interesting. I didn't think about permissions opened up in a script that dies prematurely but I could see that being considered dangerous. The fun part of this quiz is that you can't use any reference materials such as man or google and if you can't answer fully don't answer it at all. – egorgry May 19 '10 at 17:02
  • 6
    that's just silly, i would reconsider this employer... kidding (sort of). it's good to have a strong base of knowledge, but half of IT work is knowing /where/ to find the information... hopefully they were smart enough to take that into consideration when scoring applicants. on a side note, i see /no/ use for `set -e`, in fact, to me it speaks of laziness to ignore error checking in your scripts... so keep that in mind, with this employer, too... if they use it a lot, what do their scripts look like, that you will inevitably have to maintain... – cpbills May 19 '10 at 17:45
  • excellent point @cpbills. I also see no use for set -e in a script and it's probably why it stumped me so much. I'd like to post all the questions since some are really good and some are really wacky and random like this one. – egorgry May 19 '10 at 18:46
  • I've seen it in many system cron jobs... – Andrew May 20 '10 at 00:29
2

I'd say it's dangerous because you don't control the flow of your script anymore. The script can terminate whenever any command it invokes returns non-zero. So all it takes is something that alters the behavior or output of any of those components, and the main script terminates. It might be more of a style problem, but it definitely has consequences. What if that main script of yours is supposed to set some flag, and it didn't because it terminated early? You'd end up faulting the rest of the system if it assumes the flag should be there, or working with an unexpected default or old value.

Marcin
  • 1
    Totally agree with this answer. It's not *inherently* dangerous, but it's an opportunity to have an unexpected early exit. – pboin May 20 '10 at 00:18
  • 2
    What this reminded me of is a C++ prof of mine that was always pulling points from my assignments for not having single entry point/single exit point in a program. I thought it was purely a 'principle' thing, but this set -e business definitely demos how and why things can completely spin out of control. Ultimately programs are about control and determinism, and with premature termination you give up both.
0

As a more concrete example of @Marcin's answer which has personally bitten me, imagine there was a rm foo/bar.txt line somewhere in your script. Usually no big deal if foo/bar.txt doesn't actually exist. However, with set -e present, your script will now terminate early there! Oops.
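A small sketch of that situation, using a throwaway directory so nothing real gets deleted. It also shows rm -f, which does not treat a missing file as an error and is one common way around this.

```bash
#!/usr/bin/env bash
dir=$(mktemp -d)            # empty scratch directory; foo/bar.txt does not exist in it

out=$(
    set -e
    rm -f "$dir/foo/bar.txt"               # -f: a missing file is not an error
    echo "rm -f tolerated the missing file"
    rm "$dir/foo/bar.txt" 2>/dev/null      # plain rm fails, so set -e stops here
    echo "never printed"
)
status=$?
rmdir "$dir"

echo "$out"
echo "exit status: $status"
```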

Suan