Original author(s) | Ken Thompson (AT&T Bell Laboratories) |
---|---|
Developer(s) | Various open-source and commercial developers |
Initial release | November 3, 1971 |
Written in | C |
Operating system | Multics, Unix, Unix-like, V, Plan 9, Inferno, MSX-DOS, IBM i |
Platform | Cross-platform |
Type | Command |
License | coreutils: GPLv3+; Plan 9: MIT License |
In computing, sort is a standard command-line program of Unix and Unix-like operating systems that prints the lines of its input, or the concatenation of all files listed in its argument list, in sorted order. Sorting is done based on one or more sort keys extracted from each line of input. By default, the entire line is taken as the sort key. Blank space is the default field separator. The command supports a number of command-line options that can vary by implementation. For instance, the "-r" flag reverses the sort order.
A sort command that invokes a general sort facility was first implemented within Multics. [1] Later, it appeared in Version 1 Unix. This version was originally written by Ken Thompson at AT&T Bell Laboratories. By Version 4 Thompson had modified it to use pipes, but sort retained an option to name the output file because it was used to sort a file in place. In Version 5, Thompson invented "-" to represent standard input. [2]
The version of sort bundled in GNU coreutils was written by Mike Haertel and Paul Eggert. [3] This implementation employs the merge sort algorithm.
Similar commands are available on many other operating systems; for example, a sort command is part of ASCII's MSX-DOS2 Tools for MSX-DOS version 2. [4]
The sort command has also been ported to the IBM i operating system. [5]
sort [OPTION]... [FILE]...
With no FILE, or when FILE is -, the command reads from standard input.
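The two ways of reading standard input can be sketched as follows. The file name and sample lines are hypothetical, `mktemp -d` is assumed to be available, and `LC_ALL=C` is set only to make the ordering predictable across locales.

```shell
# Hypothetical sample data; the file name and contents are illustrative only.
tmpdir=$(mktemp -d)
printf 'pear\napple\n' > "$tmpdir/fruit.txt"

# No FILE operand: sort reads standard input.
from_stdin=$(printf 'cherry\nbanana\n' | LC_ALL=C sort)

# "-" stands for standard input, so it can be mixed with named files.
merged=$(printf 'cherry\nbanana\n' | LC_ALL=C sort "$tmpdir/fruit.txt" -)

printf '%s\n' "$from_stdin"   # banana, cherry
printf '%s\n' "$merged"       # apple, banana, cherry, pear
rm -rf "$tmpdir"
```

In the second call the lines of the named file and of standard input are concatenated before sorting, which matches the description of sort's input given in the introduction.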
Name | Description | Unix | Plan 9 | Inferno | FreeBSD | Linux | MSX-DOS | IBM i |
---|---|---|---|---|---|---|---|---|
-b, --ignore-leading-blanks | Ignores leading blanks. | Yes | Yes | No | Yes | Yes | No | Yes |
-c | Check that input file is sorted. | No | Yes | No | Yes | Yes | No | Yes |
-C | Like -c, but does not report the first bad line. | No | No | No | Yes | Yes | No | No |
-d, --dictionary-order | Considers only blanks and alphanumeric characters. | Yes | Yes | No | Yes | Yes | No | Yes |
-f, --ignore-case | Fold lower case to upper case characters. | Yes | Yes | No | Yes | Yes | No | Yes |
-g, --general-numeric-sort, --sort=general-numeric | Compares according to general numerical value. | Yes | Yes | No | Yes | Yes | No | No |
-h, --human-numeric-sort, --sort=human-numeric | Compare human readable numbers (e.g., 2K 1G). | Yes | No | No | Yes | Yes | No | No |
-i, --ignore-nonprinting | Considers only printable characters. | Yes | Yes | No | Yes | Yes | No | Yes |
-k, --key=POS1[,POS2] | Start a key at POS1 (origin 1), end it at POS2 (default end of line) | No | No | No | Yes | Yes | No | No |
-m | Merge only; input files are assumed to be presorted. | No | Yes | No | Yes | Yes | No | Yes |
-M, --month-sort, --sort=month | Compares (unknown) < 'JAN' < ... < 'DEC'. | Yes | Yes | No | Yes | Yes | No | No |
-n, --numeric-sort, --sort=numeric | Compares according to string numerical value. | Yes | Yes | Yes | Yes | Yes | No | Yes |
-oOUTPUT | Uses OUTPUT file instead of standard output. | No | Yes | No | Yes | Yes | No | Yes |
-r, --reverse | Reverses the result of comparisons. | Yes | Yes | Yes | Yes | Yes | No | Yes |
-R, --random-sort, --sort=random | Shuffles, but groups identical keys. See also: shuf | Yes | No | No | Yes | Yes | No | No |
-s | Stabilizes sort by disabling last-resort comparison. | No | No | No | Yes | Yes | No | No |
-Ssize, --buffer-size=size | Use size for the maximum size of the memory buffer. | No | No | No | Yes | No | No | No |
-tx | 'Tab character' separating fields is x. | No | Yes | No | No | Yes | No | Yes |
-tchar, --field-separator=char | Uses char instead of non-blank to blank transition. | No | No | No | Yes | Yes | No | No |
-Tdir, --temporary-directory=dir | Uses dir for temporaries. | No | Yes | No | Yes | Yes | No | No |
-u, --unique | Unique processing to suppress all but one in each set of lines having equal keys. | No | Yes | No | Yes | Yes | No | Yes |
-V, --version-sort | Natural sort of (version) numbers within text | No | No | No | Yes | Yes | No | No |
-w | Like -i, but ignore only tabs and spaces. | No | Yes | No | No | No | No | No |
-z, --zero-terminated | End lines with 0 byte, not newline | No | No | No | Yes | Yes | No | No |
--help | Display help and exit | No | No | No | Yes | Yes | No | No |
--version | Output version information and exit | No | No | No | Yes | Yes | No | No |
/R | Reverses the result of comparisons. | No | No | No | No | No | Yes | No |
/S | Specify the number of digits to determine how many digits of each line should be judged. | No | No | No | No | No | Yes | No |
/A | Sort by ASCII code. | No | No | No | No | No | Yes | No |
/H | Include hidden files when using wild cards. | No | No | No | No | No | Yes | No |
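As a minimal sketch of two of the options in the table, the following uses -u to suppress lines with equal keys and -c to test whether input is already sorted. The input lines are made up for illustration, and `LC_ALL=C` is an assumption used to fix the collation order.

```shell
# Illustrative input, not taken from the article.
input=$(printf 'b\na\nb\nc\n')

# -u keeps one line from each set of lines with equal keys.
unique=$(printf '%s\n' "$input" | LC_ALL=C sort -u)

# -c only checks the order: exit status 0 if sorted, non-zero otherwise.
if printf '%s\n' "$input" | LC_ALL=C sort -c 2>/dev/null; then
    check_unsorted=sorted
else
    check_unsorted=unsorted
fi
if printf '%s\n' "$unique" | LC_ALL=C sort -c 2>/dev/null; then
    check_sorted=sorted
else
    check_sorted=unsorted
fi

printf '%s\n' "$unique"                          # a, b, c
printf '%s %s\n' "$check_unsorted" "$check_sorted"
```

Because -c reports its result through the exit status, it fits naturally into an `if` test and produces no sorted output of its own.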
$ cat phonebook
Smith, Brett 555-4321
Doe, John 555-1234
Doe, Jane 555-3214
Avery, Cory 555-4132
Fogarty, Suzie 555-2314
$ sort phonebook
Avery, Cory 555-4132
Doe, Jane 555-3214
Doe, John 555-1234
Fogarty, Suzie 555-2314
Smith, Brett 555-4321
The -n option makes the program sort according to numerical value. The du command produces output that starts with a number, the file size, so its output can be piped to sort to produce a list of files sorted by (ascending) file size:
$ du /bin/* | sort -n
4 /bin/domainname
24 /bin/ls
102 /bin/sh
304 /bin/csh
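The effect of -n is easiest to see next to the default comparison. This sketch feeds the same size values as the du listing above to sort with and without -n; `LC_ALL=C` is an assumption made so that the default comparison is a plain byte comparison.

```shell
# The size/path pairs from the du example, deliberately out of order.
sizes=$(printf '102 /bin/sh\n24 /bin/ls\n4 /bin/domainname\n304 /bin/csh\n')

# Default comparison: character by character, so "102" sorts before "24".
lexical=$(printf '%s\n' "$sizes" | LC_ALL=C sort)

# -n compares the leading number by value.
numeric=$(printf '%s\n' "$sizes" | LC_ALL=C sort -n)

printf '%s\n' "$lexical"   # 102, 24, 304, 4
printf '%s\n' "$numeric"   # 4, 24, 102, 304
```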
The find command with the -ls option prints file sizes in the seventh field, so a list of the LaTeX files sorted by file size is produced by:
$ find . -name "*.tex" -ls | sort -k 7n
Use the -k option to sort on a certain column. For example, use "-k 2" to sort on the second column. In old versions of sort, the +1 option made the program sort on the second column of data (+2 for the third, etc.). This usage is deprecated.
$ cat zipcode
Adam 12345
Bob 34567
Joe 56789
Sam 45678
Wendy 23456
$ sort -k 2n zipcode
Adam 12345
Wendy 23456
Bob 34567
Sam 45678
Joe 56789
The -k m,n option lets you sort on a key that is potentially composed of multiple fields (start at column m, end at column n):
$ cat quota
fred 2000
bob 1000
an 1000
chad 1000
don 1500
eric 500
$ sort -k2,2n -k1,1 quota
eric 500
an 1000
bob 1000
chad 1000
don 1500
fred 2000
Here the primary sort uses column 2. -k2,2n specifies a key that starts and ends at column 2 and is compared numerically. If -k2 were used instead, the sort key would begin at column 2 and extend to the end of the line, spanning all the fields in between. -k1,1 breaks ties using the value in column 1, compared alphabetically by default. Note that an, bob, and chad have the same quota and are sorted alphabetically in the final output.
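The difference between a bounded key (-k2,2) and an open-ended one (-k2) can be sketched with a three-column input. The data is hypothetical, the keys are compared as strings rather than numerically so that the trailing column can affect the result, and `LC_ALL=C` is assumed for a predictable order.

```shell
# Hypothetical three-column data: name, quota, group.
data=$(printf 'zoe 1000 b\namy 1000 c\n')

# Key is column 2 only; the tie is broken by column 1.
bounded=$(printf '%s\n' "$data" | LC_ALL=C sort -k2,2 -k1,1)

# Key runs from column 2 to the end of the line, so column 3
# is compared before the -k1,1 tie-breaker is ever consulted.
unbounded=$(printf '%s\n' "$data" | LC_ALL=C sort -k2 -k1,1)

printf '%s\n' "$bounded"     # amy first, then zoe
printf '%s\n' "$unbounded"   # zoe first, then amy
```

With the bounded key both lines tie on "1000" and the names decide the order. With the open-ended key, "1000 b" sorts before "1000 c", so the name column never gets a say.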
A file with pipe-delimited fields can be sorted by naming the delimiter with -t:
$ sort -t'|' -k2,2 -k1,1 zipcode
Adam|12345
Wendy|23456
Bob|34567
Sam|45678
Joe|56789
Sorting a file with tab separated values requires a tab character to be specified as the column delimiter. This illustration uses the shell's dollar-quote notation [6] [7] to specify the tab as a C escape sequence.
$ sort -k2,2 -t $'\t' phonebook
Doe, John 555-1234
Fogarty, Suzie 555-2314
Doe, Jane 555-3214
Avery, Cory 555-4132
Smith, Brett 555-4321
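In shells that lack the dollar-quote notation, one possible alternative is to capture a tab with printf and pass it to -t. The sketch below uses a few phonebook entries with an assumed tab between name and number; the variable names are illustrative and `LC_ALL=C` is set only for a predictable order.

```shell
# Capture a literal tab without relying on $'...' quoting.
tab=$(printf '\t')

# Tab-separated sample rows (name, then phone number).
rows=$(printf 'Doe, John\t555-1234\nAvery, Cory\t555-4132\nDoe, Jane\t555-3214\n')

# With -t set to the tab, field 2 is the whole phone number.
by_number=$(printf '%s\n' "$rows" | LC_ALL=C sort -t "$tab" -k2,2)

printf '%s\n' "$by_number"   # Doe John, then Doe Jane, then Avery Cory
```

Without -t, the default blank separator would split "Doe," and "John" into separate fields, and field 2 would be the first name rather than the phone number.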
The -r option just reverses the order of the sort:
$ sort -rk 2n zipcode
Joe 56789
Sam 45678
Bob 34567
Wendy 23456
Adam 12345
The GNU implementation has a -R (--random-sort) option based on hashing; this is not a full random shuffle because it will sort identical lines together. A true random sort is provided by the Unix utility shuf.
The GNU implementation has a -V (--version-sort) option which is a natural sort of (version) numbers within text. Two text strings that are to be compared are split into blocks of letters and blocks of digits. Blocks of letters are compared alpha-numerically, and blocks of digits are compared numerically (i.e., skipping leading zeros, more digits means larger, otherwise the leftmost digits that differ determine the result). Blocks are compared left-to-right and the first non-equal block in that loop decides which text is larger. This happens to work for IP addresses, Debian package version strings and similar tasks where numbers of variable length are embedded in strings.
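The block-wise comparison described above can be illustrated with a handful of IP addresses. This sketch assumes a sort that supports -V, such as the GNU implementation; the addresses are made up, and `LC_ALL=C` is set so that the plain comparison is byte-wise.

```shell
# Illustrative addresses with digit blocks of different lengths.
addrs=$(printf '10.0.0.10\n10.0.0.2\n192.168.1.1\n9.255.0.1\n')

# Plain comparison: "10.0.0.10" sorts before "10.0.0.2", and "9..." comes last.
plain=$(printf '%s\n' "$addrs" | LC_ALL=C sort)

# Version sort (assumes GNU-style -V): digit blocks are compared by value.
versioned=$(printf '%s\n' "$addrs" | LC_ALL=C sort -V)

printf '%s\n' "$plain"
printf '%s\n' "$versioned"   # 9.255.0.1, 10.0.0.2, 10.0.0.10, 192.168.1.1
```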
[6] "Words of the form $'string' are treated specially. The word expands to string, with backslash-escaped characters replaced as specified by the ANSI C standard."
[7] "The $'...' string literal syntax was added to ksh93 to solve the problem of entering special characters in scripts. It uses ANSI-C rules to translate the string between the '...'."