Getopts

getopts
Developer(s) Various open-source and commercial developers
Initial release 1986
Operating system Unix, Unix-like, IBM i
Type Command

getopts is a built-in Unix shell command for parsing command-line arguments. It is designed to process command line arguments that follow the POSIX Utility Syntax Guidelines, based on the C interface of getopt.

History

The predecessor to getopts was the external program getopt by Unix System Laboratories.

The original getopt had several problems: it could not handle whitespace or shell metacharacters in arguments, and there was no ability to disable the output of error messages. [1]

getopts was first introduced in 1986 in the Bourne shell shipped with Unix SVR3. It uses the shell's own variables, OPTIND and OPTARG, to track the index of the next argument to be processed and the current option-argument, and returns the option name in a shell variable. [2] Earlier versions of the Bourne shell did not have getopts.

In 1995, getopts was included in the Single UNIX Specification version 1 / X/Open Portability Guidelines Issue 4. [3] As a result, getopts is now available in shells including the Bourne shell, KornShell, Almquist shell, Bash and Zsh. [4]

The getopts command has also been ported to the IBM i operating system. [5]

The use of getopt was partially revived, mainly owing to an enhanced implementation in util-linux. This version, based on the BSD getopt, not only fixed the two complaints about the old getopt but also introduced support for GNU-style long options and optional option-arguments, features that getopts lacks. [6] The various BSD distributions, however, kept the old implementation. [1]

Usage

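The usage synopses of getopt and getopts are similar to that of their C sibling:

    getopt optstring [parameters]
    getopts optstring varname [parameters]

How the two commands are used, however, differs considerably: getopt prints a normalized argument string once, which the script then re-parses, whereas getopts is called repeatedly in a loop and sets a shell variable on each call.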
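As a rough illustration of the getopts calling convention, the sketch below drives getopts from a loop and prints what it reports through OPTARG, then uses OPTIND to drop the parsed options. The function name parse_demo and the option letters are assumptions made for this example, not part of any standard interface.

```shell
#!/bin/bash
# Hypothetical helper: parse "-a VALUE" and "-v", then report the operands.
parse_demo() {
  OPTIND=1                     # reset so the function can be called more than once
  while getopts 'a:v' opt; do  # each call yields one option in $opt
    case $opt in
      a) printf 'option a, argument %s\n' "$OPTARG" ;;
      v) printf 'option v\n' ;;
      *) return 1 ;;           # unknown option or missing argument
    esac
  done
  shift $((OPTIND - 1))        # OPTIND points at the first unparsed argument
  printf 'operands: %s\n' "$*"
}

parse_demo -v -a Unix file1 file2
```

The loop ends when getopts reaches the first argument that is not an option, which is why the operands are still available after the shift.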

Enhancements

In various getopts

In spring 2004 (Solaris 10 beta development), the libc implementation for getopt() was enhanced to support long options. As a result, this new feature was also available in the built-in command getopts of the Bourne Shell. This is triggered by parenthesized suffixes in the optstring specifying long aliases. [7]

KornShell and Zsh both have an extension for long arguments. The former is defined as in Solaris, [8] while the latter is implemented via a separate zparseopts command. [9]

KornShell additionally implements optstring extensions for options beginning with + instead of -. [8]

In Linux getopt

An alternative to getopts is the enhanced Linux version of getopt, an external command-line program.

The Linux enhanced version of getopt has the extra safety of getopts plus more advanced features. It supports long option names (e.g. --help) and the options do not have to appear before all the operands (e.g. command operand1 operand2 -a operand3 -b is permitted by the Linux enhanced version of getopt but does not work with getopts). It also supports escaping metacharacters for shells (like tcsh and POSIX sh) and optional arguments. [6]
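The limitation on mixing operands with options can be seen with getopts alone. The following is a minimal sketch, assuming a hypothetical helper named count_parsed_options and the option letters a and b; it only counts how many options getopts recognizes before it stops.

```shell
#!/bin/bash
# Hypothetical helper: count the options getopts accepts before it stops.
count_parsed_options() {
  OPTIND=1
  parsed=0
  while getopts 'ab' opt; do
    parsed=$((parsed + 1))
  done
  shift $((OPTIND - 1))
  printf '%d option(s) parsed; left over: %s\n' "$parsed" "$*"
}

# Operands first: getopts stops at operand1 and never sees -a or -b.
count_parsed_options operand1 operand2 -a operand3 -b
# Options first: both are parsed.
count_parsed_options -a -b operand1
```

In the first call every argument, including -a and -b, is left over as an operand, so a script relying on getopts would have to require that options come first.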

Comparison

Feature                                   POSIX getopts   Solaris/ksh getopts  Unix/BSD getopt  Linux getopt
Splits options for easy parsing           Yes             Yes                  Yes              Yes
Allows suppressing error messages         Yes             Yes                  No               Yes
Safe with whitespace and metacharacters   Yes             Yes                  No               Yes
Allows operands to be mixed with options  No              Yes                  No               Yes
Supports long options                     Emulation       Yes                  No               Yes
Optional arguments                        Error handling  Error handling       No               Yes

Examples

Suppose we are building a Wikipedia downloader in bash that takes three options and zero extra arguments:

wpdown -a article name -l [language] -v

When possible, we allow the following long arguments:

    -a   --article
    -l   --language, --lang
    -v   --verbose

For clarity, no help text is included, and we assume there is a program that downloads any webpage. In addition, all programs are of the form:

    #!/bin/bash
    VERBOSE=0
    ARTICLE=''
    LANG=en

    # [EXAMPLE HERE]

    if ((VERBOSE > 2)); then
        printf '%s\n' 'Non-option arguments:'
        printf '%q ' "${remaining[@]}"
    fi

    if ((VERBOSE > 1)); then
        printf 'Downloading %s:%s\n' "$LANG" "$ARTICLE"
    fi

    if [[ ! $ARTICLE ]]; then
        printf '%s\n' "No articles!" >&2
        exit 1
    fi

    save_webpage "https://${LANG}.wikipedia.org/wiki/${ARTICLE}"

Using old getopt

The old getopt does not support optional arguments:

    # parse everything; if it fails we bail
    args=`getopt 'a:l:v' $*` || exit
    # now we have the sanitized args... replace the original with it
    set -- $args

    while true; do
        case $1 in
            (-v) ((VERBOSE++)); shift;;
            (-a) ARTICLE=$2; shift 2;;
            (-l) LANG=$2; shift 2;;
            (--) shift; break;;
            (*)  exit 1;;    # error
        esac
    done

    remaining=("$@")

This script will also break with any article title with a space or a shell metacharacter (like ? or *) in it.
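The breakage comes from the unquoted expansions ($* and $args), which the shell splits on whitespace and subjects to filename expansion. The sketch below does not call getopt at all; it only imitates that unquoted re-expansion step, using a hypothetical helper named count_after_unquoted_expansion, to show how an argument boundary is lost.

```shell
#!/bin/bash
# Hypothetical helper: count the words left after the arguments are
# flattened and re-expanded unquoted, as the old-getopt idiom does.
count_after_unquoted_expansion() {
  flat="$*"       # all arguments joined into one string
  set -- $flat    # unquoted on purpose: split on whitespace, globs expanded
  echo "$#"
}

# Two arguments go in ("-a" and "Unix shell"), but three words come out.
count_after_unquoted_expansion -a 'Unix shell'
```

With a title such as "Unix shell", the loop in the old-getopt script would therefore see only "Unix" as the article name and treat "shell" as a separate word.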

Using getopts

getopts gives the script the look and feel of the C interface, although in POSIX optional arguments are still absent:

    #!/bin/sh
    while getopts ':a:l:v' opt; do
        case $opt in
            (v) VERBOSE=$((VERBOSE + 1));;    # POSIX arithmetic; ((VERBOSE++)) is a bash extension
            (a) ARTICLE=$OPTARG;;
            (l) LANG=$OPTARG;;
            (:) # "optional arguments" (missing option-argument handling)
                case $OPTARG in
                    (a) exit 1;;    # error, according to our syntax
                    (l) :;;         # acceptable but does nothing
                esac;;
        esac
    done

    shift $((OPTIND-1))
    # remaining is "$@"

Since we are no longer operating on the shell's positional parameters directly, we no longer need to shift them within the loop. However, a final shift is required to remove the parsed options and leave the remaining arguments.

It is fairly simple to emulate long option support of flags by treating --fast as an argument fast to an option -. That is, -: is added to the optstring, and - is added as a case for opt, within which OPTARG is evaluated for a match to fast. Supporting long options with an argument is more tedious, but is possible when the options and arguments are delineated by =. [10]
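The emulation described above might look roughly like the following sketch. The function name parse_long and the long option names --verbose and --article are assumptions chosen to match the running example; the `-:` entry in the optstring is what makes getopts hand over the text after `--` in OPTARG.

```shell
#!/bin/bash
# Hypothetical helper: accepts -v / --verbose and -a ARG / --article=ARG.
parse_long() {
  OPTIND=1
  verbose=0
  article=''
  while getopts ':a:v-:' opt; do
    case $opt in
      v) verbose=$((verbose + 1)) ;;
      a) article=$OPTARG ;;
      -) # "--name" arrives as option "-" with OPTARG "name" or "name=value"
         case $OPTARG in
           verbose)   verbose=$((verbose + 1)) ;;
           article=*) article=${OPTARG#*=} ;;   # keep the text after the first "="
           *)         printf 'unknown option --%s\n' "$OPTARG" >&2; return 1 ;;
         esac ;;
      *) return 1 ;;   # unknown short option or missing argument
    esac
  done
  printf 'verbose=%d article=%s\n' "$verbose" "$article"
}

parse_long --verbose --article=Getopts -v
```

Because getopts only sees a single word after `--`, a long option with a space-separated argument (--article Getopts) is not handled by this sketch; only the `=` form is.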

Using Linux getopt

Linux getopt escapes its output and an "eval" command is needed to have the shell interpret it. The rest is unchanged:

    #!/bin/bash

    # We use "${@}" instead of "${*}" to preserve argument-boundary information
    ARGS=$(getopt --options 'a:l::v' --longoptions 'article:,lang::,language::,verbose' -- "${@}") || exit
    eval "set -- ${ARGS}"

    while true; do
        case "${1}" in
            (-v | --verbose)
                ((VERBOSE++))
                shift
            ;;
            (-a | --article)
                ARTICLE="${2}"
                shift 2
            ;;
            (-l | --lang | --language)
                # handle optional: getopt normalizes it into an empty string
                if [ -n "${2}" ]; then
                    LANG="${2}"
                fi
                shift 2
            ;;
            (--)
                shift
                break
            ;;
            (*)
                exit 1    # error
            ;;
        esac
    done

    # named "remaining" to match the array printed by the surrounding script
    remaining=("${@}")


References

  1. getopt(1). FreeBSD General Commands Manual.
  2. Mascheck, Sven. "The Traditional Bourne Shell Family". Retrieved 2010-12-01.
  3. "getopts". The Open Group (POSIX 2018).
  4. "Bash Reference Manual".
  5. IBM. "IBM System i Version 7.2 Programming Qshell" (PDF). IBM. Retrieved 2020-09-05.
  6. getopt(1). Linux General Commands Manual.
  7. "getopt(3)". Oracle Solaris 11.2 Information Library.
  8. "ksh getopts -- parse options from shell script command line". www.mkssoftware.com.
  9. zshmodules(1). Linux General Commands Manual.
  10. "A simple CLI parser in Bash".