Sum (Unix)

sum
Original author(s) Ken Thompson
Developer(s) AT&T Bell Laboratories
Initial release November 3, 1971
Operating system Unix, Unix-like, Inferno
Platform Cross-platform
Type Command
License coreutils: GPLv3+

sum is a legacy utility available on some Unix and Unix-like operating systems. For each file given as an argument, it outputs a 16-bit checksum and the number of blocks the file occupies on disk. [1] Two different checksum algorithms are in use. POSIX abandoned sum in favor of cksum.

Overview

The sum program is now mainly of historical interest. It is not part of POSIX. Two algorithms are typically available: a BSD checksum and a SYSV checksum. Both are weaker than the already weak 32-bit CRC used by cksum. [2]

The default algorithm on FreeBSD and GNU implementations is the BSD checksum. Switching between the two algorithms is done via command line options. [2] [1]

The two commonly used algorithms are as follows.

The BSD sum, -r in GNU sum and -o1 in FreeBSD cksum:
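A minimal Python sketch of the BSD checksum follows. It assumes the commonly documented behavior: the running 16-bit checksum is rotated right by one bit before each input byte is added, and the result is truncated to 16 bits. The function name `bsd_sum` is hypothetical and is not part of any sum implementation.

```python
def bsd_sum(data: bytes) -> int:
    """Return a 16-bit BSD-style checksum of data (illustrative sketch)."""
    checksum = 0
    for byte in data:
        # Rotate the 16-bit checksum right by one bit:
        # the lowest bit wraps around to become the highest bit.
        checksum = (checksum >> 1) | ((checksum & 1) << 15)
        # Add the next byte and keep only the low 16 bits.
        checksum = (checksum + byte) & 0xFFFF
    return checksum


print(bsd_sum(b"Hello\n"))  # 36978
print(bsd_sum(b"Hello"))    # 8401
```

The two printed values agree with the checksums shown in the usage examples for `echo Hello` and `printf Hello`.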

The above algorithm appeared in Seventh Edition Unix.

The System V sum, -s in GNU sum and -o2 in FreeBSD cksum:
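A minimal Python sketch of the System V checksum follows. It assumes the behavior of the GNU implementation: all input bytes are added into a 32-bit accumulator, and the result is then folded into 16 bits in two steps. The function name `sysv_sum` is hypothetical.

```python
def sysv_sum(data: bytes) -> int:
    """Return a 16-bit System V-style checksum of data (illustrative sketch)."""
    # Sum all bytes, emulating a 32-bit unsigned accumulator.
    total = sum(data) & 0xFFFFFFFF
    # First fold: add the high 16 bits to the low 16 bits.
    folded = (total & 0xFFFF) + (total >> 16)
    # Second fold: absorb any carry produced by the first fold.
    return (folded & 0xFFFF) + (folded >> 16)


print(sysv_sum(b"Hello\n"))  # 510
```

For short inputs the result is simply the sum of the byte values, which is why `Hello` followed by a newline yields 510.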

Syntax

The sum utility is invoked from the command line according to the following syntax:

sum [OPTION]... [FILE]...

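with the possible option parameters in GNU sum being:

-r  use the BSD algorithm (the default) and 1 KiB blocks
-s, --sysv  use the System V algorithm and 512-byte blocks
--help  display help and exit
--version  output version information and exit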

When no file parameter is given, or when FILE is -, the standard input is used as input file.

Example of use:

$ echo Hello > testfile
$ sum testfile
36978     1

Example of -s use in GNU sum:

$ echo Hello > testfile
$ sum -s testfile
510 1 testfile

Example of using standard input, -r and printf to avoid newline:

$ printf Hello | sum -r
08401     1
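The output layout in the examples above can be reproduced with a short Python sketch. It assumes, based on those example outputs and on the GNU implementation, that the BSD form prints a zero-padded five-digit checksum followed by a block count in 1024-byte units, while the System V form prints unpadded numbers with a block count in 512-byte units followed by the file name. All function names here are hypothetical.

```python
def blocks(size: int, block_size: int) -> int:
    """Number of blocks needed to hold size bytes, rounded up."""
    return -(-size // block_size)


def format_bsd(checksum: int, size: int) -> str:
    # Zero-padded 5-digit checksum, then the block count right-aligned in 5 columns.
    return f"{checksum:05d} {blocks(size, 1024):5d}"


def format_sysv(checksum: int, size: int, name: str) -> str:
    # Plain checksum, block count in 512-byte units, then the file name.
    return f"{checksum} {blocks(size, 512)} {name}"


print(format_bsd(36978, 6))             # 36978     1
print(format_bsd(8401, 5))              # 08401     1
print(format_sysv(510, 6, "testfile"))  # 510 1 testfile
```

The zero padding explains why the checksum of `Hello` without a trailing newline is displayed as 08401 rather than 8401.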

References

  1. sum(1), manual pages from GNU coreutils
  2. sum(1), FreeBSD General Commands Manual