I usually work with Ubuntu LTS servers, which, as I understand it, symlink /bin/sh to /bin/dash. Many other distros, though, symlink /bin/sh to /bin/bash.

From that I understand that a script with #!/bin/sh at the top may not run the same way on all servers?

Is there a recommended practice for which shell to use in scripts when one wants maximum portability of those scripts between servers?

cherouvim
  • There are slight differences between the various shells. If portability is the most important thing to you, then use `#!/bin/sh` and do not use anything beyond what the original shell provided. – Thorbjørn Ravn Andersen Aug 01 '17 at 06:29

3 Answers

68

There are roughly four levels of portability for shell scripts (as far as the shebang line is concerned):

  1. Most portable: use a #!/bin/sh shebang and use only the basic shell syntax specified in the POSIX standard. This should work on pretty much any POSIX/unix/linux system. (Well, except Solaris 10 and earlier, which had the real legacy Bourne shell as /bin/sh; it predates POSIX, so it's not compliant.)

  2. Second most portable: use a #!/bin/bash (or #!/usr/bin/env bash) shebang line, and stick to bash v3 features. This'll work on any system that has bash (in the expected location).

  3. Third most portable: use a #!/bin/bash (or #!/usr/bin/env bash) shebang line, and use bash v4 features. This'll fail on any system that's stuck with bash v3 (e.g. macOS, which can't ship anything newer for licensing reasons).

  4. Least portable: use a #!/bin/sh shebang and use bash extensions to the POSIX shell syntax. This will fail on any system that has something other than bash for /bin/sh (such as recent Ubuntu versions). Don't ever do this; it's not just a compatibility issue, it's just plain wrong. Unfortunately, it's an error a lot of people make. (There's a short sketch of this mistake right after this list.)
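
To make the mistake concrete, here's a minimal sketch (the script contents are made up) that works when /bin/sh is bash but breaks when it's dash:

#!/bin/sh
# [[ ]] is a bash extension; dash aborts the test with "[[: not found"
# when this runs as /bin/sh on a recent Ubuntu.
name=world
if [[ $name == w* ]]; then
    echo "hello, $name"
fi

# The POSIX equivalent, fine under any option-1 shell:
case $name in
    w*) echo "hello, $name" ;;
esac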

My recommendation: use the most conservative of the first three that supplies all of the shell features that you need for the script. For max portability, use option #1, but in my experience some bash features (like arrays) are helpful enough that I'll go with #2.
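
For instance, here's a small sketch (file names invented) of the kind of thing arrays make painless; POSIX sh has nothing equivalent short of repurposing the positional parameters:

#!/bin/bash
# An array holds arguments containing whitespace without any
# word-splitting surprises when expanded as "${array[@]}".
flags=(-l -h)
dirs=("My Documents" src)
ls "${flags[@]}" "${dirs[@]}"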

The worst thing you can do is #4, using the wrong shebang. If you're not sure what features are basic POSIX and which are bash extensions, either stick with a bash shebang (i.e. option #2), or test the script thoroughly with a very basic shell (like dash on your Ubuntu LTS servers). The Ubuntu wiki has a good list of bashisms to watch out for.
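
A quick way to do that testing, assuming dash is installed (on Ubuntu it is) and using a placeholder script name:

dash -n myscript.sh        # parse only, without executing; catches syntax-level bashisms
dash myscript.sh           # actually run it under a strict, nearly-pure-POSIX shell
checkbashisms myscript.sh  # Debian/Ubuntu linter from the devscripts package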

There's some very good info about the history and differences between shells in the Unix & Linux question "What does it mean to be sh compatible?" and the Stackoverflow question "Difference between sh and bash".

Also, be aware that the shell isn't the only thing that differs between systems; if you're used to Linux, you're used to the GNU commands, which have a lot of nonstandard extensions you may not find on other unix systems (e.g. BSD, macOS). Unfortunately, there's no simple rule here; you just have to know the range of variation for the commands you're using.
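
A classic example is sed's in-place editing flag, which GNU and BSD/macOS implement incompatibly (the file name is a placeholder):

sed -i 's/foo/bar/' file.txt      # GNU sed: -i takes an optional suffix, so this works
sed -i '' 's/foo/bar/' file.txt   # BSD/macOS sed: -i requires a suffix argument (may be empty)
# Portable fallback: write to a temporary file and rename it over the original.
sed 's/foo/bar/' file.txt > file.txt.tmp && mv file.txt.tmp file.txt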

One of the nastiest commands in terms of portability is one of the most basic: echo. Any time you use it with any options (e.g. echo -n or echo -e), or with any escapes (backslashes) in the string to print, different versions will do different things. Any time you want to print a string without a linefeed after it, or with escapes in the string, use printf instead (and learn how it works -- it's more complicated than echo is). The ps command is also a mess.
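
For instance (the strings here are just examples):

echo -n "Enter name: "    # some implementations print "-n " literally instead
echo -e 'a\tb'            # -e is nonstandard; other echos print it, or interpret \t anyway
printf 'Enter name: '     # portable: no newline unless the format string asks for one
printf 'a\tb\n'           # backslash escapes in the format string are well-defined
printf '%s\n' "$var"      # print data via %s, never as the format itself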

Another general thing-to-watch-for is recent/GNUish extensions to command option syntax: old (standard) command format is that the command is followed by options (with a single dash, and each option is a single letter), followed by command arguments. Recent (and often non-portable) variants include long options (usually introduced with --), allowing options to come after arguments, and using -- to separate options from arguments.
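
Roughly, with placeholder names:

tar -c -f archive.tar mydir             # old-style: single-letter options, then arguments
tar --create --file=archive.tar mydir   # long options: GNU-ish, not universal
grep pattern file.txt -i                # option after an argument: GNU allows it, BSD doesn't
grep -i -- pattern file.txt             # "--" ends option parsing; honored by most modern
                                        # tools, but not by every legacy utility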

Gordon Davisson
  • The fourth option is simply a wrong idea. Please do not use it. – pabouk - Ukraine stay strong Jul 30 '17 at 10:51
  • @pabouk I agree completely, so I edited my answer to make this clearer. – Gordon Davisson Jul 30 '17 at 18:11
  • Your first statement is slightly misleading. The POSIX standard doesn't specify anything about the shebang beyond saying that using it leads to unspecified behavior. Moreover, POSIX doesn't specify where the POSIX shell must be located, only its name (`sh`), so `/bin/sh` is not guaranteed to be the right path. The most portable approach is then not to specify any shebang at all, or to adapt the shebang to the OS used. – jlliagre Jul 30 '17 at 18:39
  • @jlliagre do you mean then to have the downstream admin edit the script as necessary? Zsh, for example, will usually fail to execute a script if said script does not designate an interpreter IIRC. – can-ned_food Jul 30 '17 at 19:52
  • I fell into #4 with a script of mine very recently, and just couldn't figure out why it wasn't working; after all, the same commands worked solidly, and did exactly what I wanted them to do, when I tried them directly in the shell. As soon as I changed `#!/bin/sh` to `#!/bin/bash` though, the script worked perfectly. (In my defense, that script had evolved over time from one that really only needed sh-isms to one that relied on bash-like behavior.) – user Jul 30 '17 at 20:03
  • @jlliagre It's not POSIX, but the Linux Filesystem Hierarchy Standard does specify that `sh` (*The Bourne command shell*) is required to exist in `/bin`. Of course, that doesn't imply much else. http://www.pathname.com/fhs/pub/fhs-2.3.html#REQUIREMENTS2 – user Jul 30 '17 at 20:06
  • @can-ned_food Not necessarily the downstream admin, but for example some post-install process that would fix all scripts belonging to an installed package. Unfortunately, there is no portable way to make sure a POSIX shell script will be executed by the right shell. #1 is a pain for Solaris 10 and older users. #2 is better as far as Solaris is concerned but would fail on systems missing `/bin/bash`. Of course, if the question is limited to Linux distributions, #1 makes sense and is the best approach. – jlliagre Jul 30 '17 at 21:57
  • @MichaelKjörling The (true) Bourne shell is almost never bundled with Linux distributions and is not POSIX compliant anyway. The POSIX standard shell was created from `ksh` behavior, not Bourne. What most Linux distributions following the FHS do is have `/bin/sh` be a symbolic link to whichever shell they select to provide POSIX compatibility, usually `bash` or `dash`. – jlliagre Jul 30 '17 at 22:08
  • @jlliagre Thanks, I added a note about Solaris 10. – Gordon Davisson Jul 31 '17 at 00:17
  • Don't forget `echo` is also a bad idea when you're echoing a non-constant string. – user541686 Jul 31 '17 at 04:16
  • Another popular non-portable command: `realpath` / `readlink -f`. – Vi. Jul 31 '17 at 13:57
  • "Unfortunately, it's an error a lot of people make." And sometimes an error that can't be resolved easily. Petalinux is one package that basically forces you to use bash as /bin/sh because it hardcodes the path in lots of its files and assumes /bin/sh is bash. – JAB Jul 31 '17 at 20:31
  • Question; if you stick to strict `sh` POSIX syntax only (as in #1 - Most Compatible), will this be fully backwards compatible with `ksh` and `zsh`? – Andy J Aug 01 '17 at 00:28
  • @AndyJ: Mostly, but not quite exactly; see the "POSIX MODE" section of the ksh man page for a few obscure differences. zsh will attempt to be fully POSIX-compliant if invoked under the name /bin/sh, so it *should* be ok... – Gordon Davisson Aug 01 '17 at 01:06
  • The `printf` command is also much faster than `echo`. And since writing to the screen is usually by far the slowest thing a script does, always using `printf` instead of `echo` is the single best practice for speeding up script responsiveness. – DocSalvager Aug 06 '17 at 02:59
  • @DocSalvager There normally isn't a significant speed difference between `printf` and `echo`. Are you using a shell that has a builtin `printf` command, but uses the /bin/echo binary? In my experience, spawning subprocesses to run external programs (like /bin/echo) is one of the things that commonly slows down shell scripts. – Gordon Davisson Aug 06 '17 at 04:16
  • Third most portable: use the `#!/usr/bin/env zsh` shebang and use zsh extensions, which cover basically everything that bash4 can do, and more. If you use bash4 features, the script won't work on bash3 systems, and it won't be immediately obvious to the user why it fails. With the `zsh` shebang, the dependency is clear. Moreover, the Z shell has had its features for a long time, so an old version will probably work fine. If you have an old system with bash3, it will be easier to install (possibly old) zsh than to install bash4. Also note that macOS only has bash3 and zsh in the base system. – michau Jul 05 '19 at 13:02
5

In the ./configure script which prepares the TXR language for building, I wrote the following prologue for better portability. The script will bootstrap itself even if #!/bin/sh is a non-POSIX-conforming old Bourne Shell. (I build every release on a Solaris 10 VM).

#!/bin/sh

# use your own variable name instead of txr_shell;
# adjust to taste: search for your favorite shells

if test x$txr_shell = x ; then
  for shell in /bin/bash /usr/bin/bash /usr/xpg4/bin/sh ; do
    if test -x $shell ; then
       txr_shell=$shell
       break
    fi
  done
  if test x$txr_shell = x ; then
    echo "No known POSIX shell found: falling back on /bin/sh, which may not work"
    txr_shell=/bin/sh
  fi
  export txr_shell
  exec $txr_shell $0 ${@+"$@"}
fi

# rest of the script here, executing in upgraded shell

The idea here is that we find a better shell than the one we are running under, and re-execute the script using that shell. The txr_shell environment variable is set before the re-execution, so the re-executed script knows it is the recursive instance.

(In my script, the txr_shell variable is also subsequently used, for exactly two purposes: firstly it is printed as part of an informative message in the output of the script. Secondly, it is installed as the SHELL variable in the Makefile, so that make will use this shell too for executing recipes.)

On a system where /bin/sh is dash, you can see that the above logic will find /bin/bash and re-execute the script with that.

On a Solaris 10 box, the /usr/xpg4/bin/sh will kick in if no Bash is found.

The prologue is written in a conservative shell dialect, using test for file existence tests, and the ${@+"$@"} trick for expanding arguments catering to some broken old shells (which would simply be "$@" if we were in a POSIX conforming shell).
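
To see what the trick guards against, here's a tiny illustration (cnt is a throwaway function):

cnt() { echo $#; }
set --            # clear the positional parameters
cnt "$@"          # POSIX shells pass zero arguments and print 0; some ancient
                  # Bourne shells passed a single empty argument and printed 1
cnt ${@+"$@"}     # prints 0 everywhere: with no positional parameters the
                  # expansion vanishes entirely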

Kaz
  • One wouldn't need the `x` hackery if proper quoting were in use, since the situations that mandated that idiom involved now-deprecated `test` invocations combining multiple tests with `-a` or `-o`. – Charles Duffy Jul 31 '17 at 22:56
  • @CharlesDuffy Indeed; the `test x$whatever` that I'm perpetrating there looks like an onion in the varnish. If we can't trust the broken old shell to do quoting, then the final `${@+"$@"}` attempt is pointless. – Kaz Jul 31 '17 at 22:57
3

All variations of the Bourne shell language are objectively terrible in comparison to modern scripting languages like Perl, Python, Ruby, node.js, and even (arguably) Tcl. If you have to do anything even a little bit complicated, you will be happier in the long run if you use one of the above instead of a shell script.

The one and only advantage the shell language still has over those newer languages is that something calling itself /bin/sh is guaranteed to exist on anything that purports to be Unix. However, that something may not even be POSIX-compliant; many of the legacy proprietary Unixes froze the language implemented by /bin/sh and the utilities in the default PATH prior to the changes demanded by Unix95 (yes, Unix95, twenty years ago and counting). There might be a set of Unix95 (or even POSIX.1-2001, if you're lucky) tools in a directory not on the default PATH (e.g. /usr/xpg4/bin), but they aren't guaranteed to exist.

However, the basics of Perl are more likely to be present on an arbitrarily selected Unix installation than Bash is. (By "the basics of Perl" I mean /usr/bin/perl exists and is some, possibly quite old, version of Perl 5, and if you're lucky the set of modules that shipped with that version of the interpreter are also available.)

Therefore:

If you are writing something that has to work everywhere that purports to be Unix (such as a "configure" script), you need to use #! /bin/sh, and you need to not use any extensions whatsoever. Nowadays I would write POSIX.1-2001-compliant shell in this circumstance, but I would be prepared to patch out POSIXisms if someone asked for support for rusty iron.

But if you are not writing something that has to work everywhere, then the moment you are tempted to use any Bashism at all, you should stop and rewrite the entire thing in a better scripting language instead. Your future self will thank you.

(So when is it appropriate to use Bash extensions? To first order: never. To second order: only to extend the Bash interactive environment — e.g. to provide smart tab-completion and fancy prompts.)
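
For example, in ~/.bashrc (the `deploy` command and its subcommands here are hypothetical):

# Simple word-list tab-completion for a hypothetical "deploy" command:
complete -W "staging production rollback" deploy
# A fancier prompt using bash's prompt escapes (\u user, \h host, \w directory):
PS1='\u@\h:\w\$ '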

zwol
  • The other day I had to write a script that did something like `find ... -print0 |while IFS="" read -rd "" file; do ...; done |tar c --null -T -`. It wouldn't be possible to make it work correctly without bashisms (`read -d ""`) and GNU extensions (`find -print0`, `tar --null`). Reimplementing that line in Perl would be much longer and clumsier. This is the kind of situation when using bashisms and other non-POSIX extensions is the Right Thing to do. – michau Jul 05 '19 at 14:57
  • @michau It might not qualify as a one-liner anymore, depending on what goes in those ellipses, but I think you're not taking the "rewrite the _entire thing_ in a better scripting language" as literally as I meant it. My way looks something like `perl -MFile::Find -e 'our @tarcmd = ("tar", "c"); find(sub { return unless ...; ...; push @tarcmd, $File::Find::name }, "."); exec @tarcmd` Notice that not only do you not need any bashisms, you don't need any GNU tar extensions. (If command line length is a concern or you want the scan and tar to run in parallel, then you do need `tar --null -T`.) – zwol Jul 05 '19 at 23:43
  • The thing is, `find` returned over 50,000 filenames in my scenario, so your script is likely to hit the argument length limit. A correct implementation that doesn't use non-POSIX extensions would have to build the tar output incrementally, or perhaps use Archive::Tar::Stream in the Perl code. Of course, it can be done, but in this case the Bash script is much quicker to write and has less boilerplate, and therefore less room for bugs. – michau Jul 06 '19 at 00:28
  • BTW, I could use Perl instead of Bash in a quick and concise way: `find ... -print0 |perl -0ne '... print ...' |tar c --null -T -`. But the questions are: 1) I'm using `find -print0` and `tar --null` anyway, so what's the point of avoiding the bashism `read -d ""`, which is exactly the same kind of thing (a GNU extension for handling null separator that, unlike Perl, may become a POSIX thing someday)? 2) Is Perl really more widespread than Bash? I've been using Docker images recently, and a lot of them have Bash but no Perl. New scripts are more likely to be run there, than on ancient Unices. – michau Jul 06 '19 at 06:27
  • I would argue that, for the sake of a gentle learning curve, intuitive concise code, a rich out-of-the-box API, and decent performance, either PHP or Perl is the way to go for an embedded scripting language intended to do small/administrative/cron/one-time tasks. – Jack G Jun 29 '20 at 16:41