Command substitution

In computing, command substitution is a facility that allows a command to be run and its output to be pasted back on the command line as arguments to another command. Command substitution first appeared in the Bourne shell, [1] introduced with Version 7 Unix in 1979, and has remained a characteristic of all later Unix shells. The feature has since been adopted in other programming languages as well, including Perl, PHP, Ruby and Microsoft's PowerShell under Windows. It also appears in Microsoft's CMD.EXE in the FOR command and the ( ) command.

Syntax and semantics

Shells typically implement command substitution by creating a child process to run the first command with its standard output piped back to the shell, which reads that output, parsing it into words separated by whitespace. Because the shell can't know it has all the output from the child until the pipe closes or the child dies, it waits until then before it starts another child process to run the second command.
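
The word splitting described above can be observed with a minimal sketch in POSIX shell. The helper names count_args and emit are made up for illustration, and the example uses the $( ... ) notation for command substitution. It suggests that the substituted output reaches the outer command as separate whitespace-delimited arguments unless the substitution is quoted.

```shell
#!/bin/sh
# count_args is a hypothetical helper: it reports how many arguments it received.
count_args() {
    echo "$#"
}

# emit is a hypothetical inner command: it prints three words on two lines.
emit() {
    echo "one two"
    echo "three"
}

# Unquoted: the shell reads emit's output, splits it at spaces and newlines,
# and passes each resulting word to count_args as a separate argument.
unquoted=$(count_args $(emit))

# Quoted: splitting is suppressed, so the whole output is a single argument.
quoted=$(count_args "$(emit)")

echo "unquoted: $unquoted argument(s)"
echo "quoted: $quoted argument(s)"
```

With the default field separators, the unquoted form yields three arguments and the quoted form yields one.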

This C shell example shows how one might search for all the C files containing the string malloc using fgrep and then edit any that are found using the vi editor. The syntactical notation shown here, ` ... `, using backquotes as delimiters, is the original style and is supported by all the common Unix shells.

#!/bin/csh
vi `fgrep -l malloc *.c`

Objections have been raised to both the syntax (how it is typed) and the semantics (how it works).

Although backquotes are easy to type, an important consideration for an interactive command processor, the syntax has been criticized as awkward to nest (that is, to put one command substitution inside another) because the left and right delimiters are the same. [2] The KornShell (ksh) [3] solved this with an alternative notation, $( ... ), borrowing from the notational style used for variable substitution. Today, most Unix shells support this syntax. Microsoft's PowerShell also uses this notation, with the same semantics.

#!/bin/bash
vi $(fgrep -l malloc *.c)
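
The nesting difference between the two notations can be illustrated with a short sketch. The path /usr/local/bin/tool is a made-up example and does not need to exist, since dirname and basename only manipulate strings.

```shell
#!/bin/sh
# Nested substitution with backquotes: the inner backquotes must be escaped
# with backslashes so the shell can tell them apart from the outer pair.
old_style=`basename \`dirname /usr/local/bin/tool\``

# The same nesting with $( ... ): the opening and closing delimiters differ,
# so the inner substitution needs no escaping.
new_style=$(basename $(dirname /usr/local/bin/tool))

echo "$old_style"
echo "$new_style"
```

Both forms print bin, but the backquote version needs an extra level of escaping for each additional level of nesting.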

The semantics, breaking the output into words at whitespace, has also been criticized. It worked well on early Unix systems, where filenames rarely contained spaces, but it works poorly on modern Windows and Linux systems, where filenames often contain spaces. [4] In either of the previous examples, if any of the filenames matched by the *.c wildcard contains a space, that filename will be broken into two or more separate arguments to vi, which is clearly not what was intended. Hamilton C shell solved this with a double backquote notation, `` ... ``, that parses into words only at line breaks. [5]
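
The problem can be reproduced with the following sketch, which also shows one way a POSIX shell can be made to split only at line breaks. This is an analogous technique using the IFS variable, not Hamilton C shell's implementation. The file names and the count_args helper are made up for illustration, grep -F stands in for fgrep, and mktemp -d is assumed to be available although it is not specified by POSIX.

```shell
#!/bin/sh
# Work in a scratch directory so nothing outside it is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two C files mention malloc; one has a space in its name.
echo 'p = malloc(10);' > 'my file.c'
echo 'q = malloc(20);' > plain.c

# Hypothetical helper: reports how many arguments it received.
count_args() { echo "$#"; }

# Default behaviour: the output is split at every space and newline,
# so 'my file.c' becomes two arguments and the total is three.
split_default=$(count_args $(grep -F -l malloc *.c))

# Restricting IFS to a newline makes the shell split only at line breaks,
# which keeps 'my file.c' in one piece. Names containing newlines still break.
old_ifs=$IFS
IFS='
'
split_lines=$(count_args $(grep -F -l malloc *.c))
IFS=$old_ifs

echo "default splitting: $split_default arguments"
echo "newline-only splitting: $split_lines arguments"

# Clean up the scratch directory.
cd / && rm -rf "$workdir"
```

Changing IFS affects every later expansion in the script, which is why the sketch saves and restores the previous value.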

This is an example of command substitution using the () operator in PowerShell:

$MyVariable=(ls)
echo $MyVariable

Expression substitution

A related facility, expression substitution, is found in the languages Common Lisp and Scheme, invoked by using the comma-at operator in an expression marked with the backquote (or "quasiquote") operator, and in ABC, by using an expression enclosed between backquotes inside a text display (string literal). For example, the ABC command WRITE '2 + 2 = `2+2`' produces the output 2 + 2 = 4.

References

  1. Dahdah, Howard. "The A-Z of Programming Languages: Bourne shell, or sh, An in-depth interview with Steve Bourne, creator of the Bourne shell, or sh", Computerworld, March 5, 2009.
  2. Unix Power Tools: 45.31 Nested Command Substitution.
  3. Rosenblatt, Bill; Arnold Robbins (2002). Learning the Korn Shell (2 ed.). O'Reilly Media, Inc. p. 127. ISBN 978-0-596-00195-7. Retrieved 2010-07-20. "The syntax of command substitution is: $(Unix command) The command inside the parenthesis is run, and anything the command writes to standard output (and to standard error) is returned as the value of the expression."
  4. Johnson, Chris (2009), "8", Pro Bash Programming: Scripting the Linux Shell, New York, NY: Springer-Verlag New York, Inc., p. 84, ISBN 9781430219989, retrieved December 19, 2014. "File names containing spaces are an abomination, but they are so common nowadays that scripts must take their possibility (or should I say inevitability?) into account. ... The result of command substitution is subject to word splitting"
  5. Hamilton C shell User guide: I/O redirection: Command substitution, Hamilton Laboratories, archived from the original on December 19, 2014, retrieved December 19, 2014