Tr (Unix)

tr
Original author(s) Douglas McIlroy
(AT&T Bell Laboratories)
Developer(s) Various open-source and commercial developers
Initial release November 1973
Written in C
Operating system Unix, Unix-like, Plan 9, Inferno, OS-9, MSX-DOS, IBM i
Platform Cross-platform
Type Command
License coreutils: GPLv3+
Plan 9: MIT License

tr is a command in Unix, Plan 9, Inferno, and Unix-like operating systems. Its name is an abbreviation of translate or transliterate, reflecting its function of replacing or removing specific characters in its input.


Overview

The utility reads a byte stream from its standard input and writes the result to the standard output. As arguments, it takes two sets of characters (generally of the same length), and replaces occurrences of the characters in the first set with the corresponding elements from the second set. For example,

tr 'abcd' 'jkmn'

maps all characters a to j, b to k, c to m, and d to n.
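Because tr reads only from standard input, it is usually fed through a pipe or a redirection. A minimal sketch of the mapping above, using an arbitrary sample string and a variable name (mapped) chosen only for illustration:

```shell
# Pipe an illustrative string through tr and capture the result.
mapped=$(printf 'a bad cab\n' | tr 'abcd' 'jkmn')
printf '%s\n' "$mapped"
# prints: j kjn mjk
```

Characters that do not appear in the first set, such as the spaces here, pass through unchanged.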

The character set may be abbreviated by using character ranges. The previous example could be written:

tr 'a-d' 'jkmn'

In POSIX-compliant versions of tr, the set represented by a character range depends on the locale's collating order, so it is safer to avoid character ranges in scripts that might be executed in a locale different from the one in which they were written. Ranges can often be replaced with POSIX character classes such as [:alpha:].
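As a sketch of the class-based alternative, the following converts lowercase letters to uppercase without writing a range such as 'a-z'. The sample text and the variable name (upper) are illustrative only:

```shell
# Uppercase an illustrative string using character classes instead of ranges.
upper=$(printf 'hello, world\n' | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$upper"
# prints: HELLO, WORLD
```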

The -s flag causes tr to squeeze each run of a repeated character from the specified set into a single occurrence in its output. For example,

tr -s '\n'

replaces sequences of one or more newline characters with a single newline.
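A minimal sketch of the squeeze, applied to made-up input that contains two blank lines (the variable name is illustrative):

```shell
# Collapse the run of three newlines in the sample input into one.
squeezed=$(printf 'first\n\n\nsecond\n' | tr -s '\n')
printf '%s\n' "$squeezed"
# prints:
# first
# second
```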

The -d flag causes tr to delete every occurrence of the characters in the specified set from its input. In this case, only a single character set argument is used. The following command removes carriage return characters.

tr -d '\r'
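One common use is stripping the carriage returns from DOS-style line endings. A sketch with sample input generated by printf (the text and variable name are illustrative):

```shell
# Build two lines ending in CRLF, then delete every carriage return.
clean=$(printf 'line one\r\nline two\r\n' | tr -d '\r')
printf '%s\n' "$clean"
# prints:
# line one
# line two
```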

The -c flag takes the complement of the first set of characters, so the operation applies to every character not listed in it. The invocation

tr -cd '[:alnum:]'

therefore removes all non-alphanumeric characters.
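A sketch of that invocation on an arbitrary sample string (the variable name is illustrative):

```shell
# Keep only letters and digits; every other byte is deleted.
kept=$(printf 'user_42@example.com\n' | tr -cd '[:alnum:]')
printf '%s\n' "$kept"
# prints: user42examplecom
```

The newline at the end of the input is not alphanumeric either, so tr deletes it as well; the newline in the printed result comes from the second printf.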

Implementations

The original version of tr was written by Douglas McIlroy and was introduced in Version 4 Unix. [1]

The version of tr bundled in GNU coreutils was written by Jim Meyering. [2] The command is available as a separate package for Microsoft Windows as part of the UnxUtils collection of native Win32 ports of common GNU Unix-like utilities. [3] It is also available in the OS-9 shell. [4] A tr command is also part of ASCII's MSX-DOS2 Tools for MSX-DOS version 2. [5] The tr command has also been ported to the IBM i operating system. [6]

Most versions of tr, including GNU tr and classic Unix tr, operate on single-byte characters and are not Unicode compliant. An exception is the Heirloom Toolchest implementation, which provides basic Unicode support.
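The practical consequence can be sketched by counting the bytes in one non-ASCII character. The example assumes UTF-8, where "é" is encoded as the two bytes 0xC3 0xA9; it shows only the byte count, because what a given tr does with those bytes varies by implementation:

```shell
# "é" in UTF-8, written as octal escapes so the script itself stays ASCII.
# wc -c counts bytes; tr -d ' ' strips the padding some wc versions add.
nbytes=$(printf '\303\251' | wc -c | tr -d ' ')
printf '%s\n' "$nbytes"
# prints: 2
```

A byte-oriented tr sees two separate units here, so a set containing such a character cannot be assumed to match it as a whole.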

Ruby's String#tr method and Perl's built-in tr operator work analogously. [7] [8] Tcl's string map command is more general in that it maps strings to strings, whereas tr maps characters to characters. [9]

References

  1. McIlroy, M. D. (1987). A Research Unix reader: annotated excerpts from the Programmer's Manual, 1971–1986 (PDF) (Technical report). Computing Science. AT&T Bell Laboratories. 139.
  2. "Tr(1): Translate/Delete char - Linux man page".
  3. "Native Win32 ports of some GNU utilities". unxutils.sourceforge.net.
  4. Paul S. Dayan (1992). The OS-9 Guru - 1 : The Facts. Galactic Industrial Limited. ISBN 0-9519228-0-7.
  5. ASCII Corporation. MSX-DOS2 Tools User's Manual.
  6. IBM. "IBM System i Version 7.2 Programming Qshell" (PDF). Retrieved 2020-09-05.
  7. "tr (String) - APIdock". APIdock. Retrieved 12 August 2015.
  8. "tr - perldoc.perl.org". perldoc.perl.org. Retrieved 12 August 2015.
  9. "Tcl Built-In Commands - string manual page". Retrieved 12 August 2015.