How can function definition be part of pipe sequence in POSIX shell grammar?

The POSIX shell grammar at http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_10_02 says:

pipe_sequence    :                             command
                 | pipe_sequence '|' linebreak command
                 ;
command          : simple_command
                 | compound_command
                 | compound_command redirect_list
                 | function_definition
                 ;

which means that a function definition can be a term in a pipe sequence. How is this possible? A function definition has no standard input or output, and it is not a command that can be executed. Only a function call, which is a simple command, can be executed.

Added after the first comment and answer:

If we split off function_definition from command here, and add it as another alternative wherever else command appears, then yes, we are complicating the grammar a little.

But the payoff is much more important: the implementation of such a shell is much easier.

Because if you allow function definitions in a pipe, you have to deal with questions such as what is the scope of the function, and in what environment does it run. I don't believe such questions are in fact answered in the standard at all.

Which is worse: a little more complexity in the grammar, or much more work and complexity for the implementer? If the former, then is this not a case of "tail wagging the dog"?

user322908

Posted 2016-06-29T15:32:14.667

Reputation: 739

It seems to work in bash, although it does nothing: ls | f () { sed 's/^/=/;s/$/=/'; }. – choroba – 2016-06-29T15:50:02.033

Answers

It's actually easier for the implementer to not have to worry about this. When running a pipeline, each component runs in its own subshell (except maybe the first in bash, or the last in ksh88/ksh93 if the command is a native one). Thus a function definition in the middle of a pipeline would be defined for the shell instance running that component of the pipe, but would not be visible outside it. This is all automatic, based on the semantics of pipelines.
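A minimal sketch of that behaviour, assuming a shell such as bash or dash that runs the first component of a pipeline in a subshell environment. POSIX permits, as an extension, running pipeline commands in the current environment, so other shells may report something different. The function name greet_from_pipe is made up for illustration.

```shell
#!/bin/sh
# Define a function as the first component of a pipeline.
# "greet_from_pipe" is a hypothetical name used only for this sketch.
greet_from_pipe() { echo hello; } | cat

# Check whether the definition survived in the parent shell.
if command -v greet_from_pipe >/dev/null 2>&1; then
    result="visible"
else
    result="not visible"
fi
echo "greet_from_pipe is $result in the parent shell"
```

If the definition really was made in a subshell, the parent shell never sees it and the script should report that the function is not visible.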

If you wanted to prevent function definitions (or alias definitions, or silly commands such as cd...) inside a pipeline then you've complicated the implementation :-)
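The same reasoning can be tried with cd. This is only a sketch, assuming a shell (bash and dash, for example) that runs the first pipeline component in a subshell environment; the variable names before and after are illustrative.

```shell
#!/bin/sh
before=$(pwd)

# cd is the first component of a pipeline here, so in shells that use a
# subshell environment for it, the directory change is lost afterwards.
cd / | cat

after=$(pwd)
if [ "$before" = "$after" ]; then
    echo "working directory unchanged: $after"
else
    echo "working directory changed to: $after"
fi
```

No special case in the shell is needed to get this result; it follows from the pipeline component having its own environment.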

Stephen Harris

Posted 2016-06-29T15:32:14.667

Reputation: 198

Pipe commands aren't executed in subshells, at least in Bash. They are direct child processes of the shell. – Daniel B – 2016-06-29T18:09:57.573

Bash forks; the child process is another instance of this bash shell; variables and other changes (functions, directory changes, etc.) set in that shell are not propagated back to the parent. For the intent of this question, that can be treated as if it were a subshell. FWIW the manpage says "Each command in a pipeline is executed as a separate process (i.e., in a subshell)." :-) – Stephen Harris – 2016-06-29T18:19:49.640

Only Bash builtins (like function definitions) will be executed in a subshell (well, maybe). Everything else is already a separate process. – Daniel B – 2016-06-29T18:22:03.337

@DanielB my question was really about the POSIX standard... where does the standard say whether a term of the pipeline executes in a separate process, subshell, or whatever? – user322908 – 2016-06-29T18:27:07.927

@user322908 http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_09_01_01 – Daniel B – 2016-06-29T18:31:38.233

@DanielB I am sorry to be such a pain... but I still don't see, in the link you posted, whether terms in a pipeline execute in a separate process/subshell. – user322908 – 2016-06-29T21:17:56.870

Section 2.12: "Additionally, each command of a multi-command pipeline is in a subshell environment" – Stephen Harris – 2016-06-29T21:20:18.283

Why wouldn’t it be possible? Is it pointless? Definitely. But it works:

$ function asdf { echo "bla"; } | hexdump -C; echo EOF
EOF

Similarly:

$ function asdf { echo "bla"; } | asdf | hexdump -C; echo EOF
-bash: asdf: command not found
EOF

Defining a function is a “command” like any other. It doesn’t have any output and doesn’t take any input, though. You could even do a variable assignment. Pointless again, of course, but not an error.
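A small sketch of the variable-assignment case, assuming a shell like bash or dash where the first pipeline component runs in a subshell environment (POSIX allows shells to do otherwise as an extension). The variable name x is illustrative.

```shell
#!/bin/sh
x=outer

# A bare assignment is a valid simple command, so the grammar allows it
# as a pipeline component. It produces no output for cat to read.
x=inner | cat

# In shells that ran the assignment in a subshell environment,
# the parent's value is untouched.
echo "x is still: $x"
```

The pipeline is accepted without error, and under the stated assumption the assignment has no lasting effect on the parent shell.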

The “why” is probably: KISS. You wouldn’t want to pollute your grammar with needless complexity.

Update: Upon further examination I found out that Bash doesn't even bother running the pipe commands after a function definition.

Daniel B

Posted 2016-06-29T15:32:14.667

Reputation: 40 502

I think I did not explain myself fully in the original question. I edited the question to make it more to the point. Your answer is enlightening, thank you, but please look at my edit. – user322908 – 2016-06-29T17:00:39.343

You say the implementation is much easier, but is it? By having fewer special cases, you need less code. Less code means fewer headaches and also fewer bugs. Functions in Bash also don't have a scope. It's not object-oriented. There's only the environment at the time a function is executed, of course. – Daniel B – 2016-06-29T18:03:54.843