Md5sum

Original author(s) Ulrich Drepper, Scott Miller, David Madore [1]
Operating system Unix and Unix-like
Platform Cross-platform
Type Command
License coreutils: GNU GPL v3

md5sum is a computer program that calculates and verifies 128-bit MD5 hashes, as described in RFC 1321. The MD5 hash functions as a compact digital fingerprint of a file. As with all such hashing algorithms, there is theoretically an unlimited number of files that will have any given MD5 hash. However, it is very unlikely that any two non-identical files in the real world will have the same MD5 hash, unless they have been specifically created to have the same hash. [2]

The underlying MD5 algorithm is no longer deemed secure. Thus, while md5sum is well suited to identifying known files in situations that are not security related, it should not be relied on if there is a chance that files have been purposefully and maliciously tampered with. In that case, a newer hashing tool such as sha256sum is recommended.
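As a rough sketch of the recommendation above, sha256sum can be used in much the same way as md5sum. The scratch directory and the file names sample.txt and sample.sha256 are illustrative assumptions, not part of the original article.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
printf 'hello\n' > sample.txt

# Write the SHA-256 digest and file name to a checksum file.
sha256sum sample.txt > sample.sha256

# The digest is the first field; SHA-256 yields 64 hexadecimal digits.
digest=$(cut -d ' ' -f 1 sample.sha256)
echo "SHA-256 digest has ${#digest} hex digits"

# Verify the file against the stored digest (exit status 0 on success).
sha256sum -c sample.sha256
check_status=$?
```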

md5sum is used to verify the integrity of files, as virtually any change to a file will cause its MD5 hash to change. Most commonly, md5sum is used to verify that a file has not changed as a result of a faulty file transfer, a disk error or non-malicious meddling. The md5sum program is included in most Unix-like operating systems and in compatibility layers such as Cygwin.

The original C code was written by Ulrich Drepper and extracted from a 2001 release of glibc. [3]

Examples

All of the following files are assumed to be in the current directory.

Create MD5 hash file hash.md5

$ md5sum filetohashA.txt filetohashB.txt filetohashC.txt > hash.md5

File produced

File contains hash and filename pairs:

$ cat hash.md5
595f44fec1e92a71d3e9e77456ba80d1  filetohashA.txt
71f920fa275127a7b60fa4d4d41432a3  filetohashB.txt
43c191bf6d6c3f263a8cd0efd4a058ab  filetohashC.txt

Please note:

  • Each line must separate the MD5 value from the file name with either two spaces or a space followed by an asterisk (the second space indicates text mode, the asterisk binary mode). Otherwise, the following error results: no properly formatted MD5 checksum lines found. Many programs do not distinguish between the two modes, but some utilities do.
  • The file must also use UNIX line endings; otherwise this warning appears: md5sum: WARNING: x listed files could not be read. If the file is DOS/Windows formatted, dos2unix will convert it quickly.

Check MD5

$ md5sum -c hash.md5
filetohashA.txt: OK
filetohashB.txt: OK
filetohashC.txt: OK

Check single MD5

$ echo 'D43F2404CA13E22594E5C8B04D3BBB81  filetohashA.txt' | md5sum -c
filetohashA.txt: OK

On non-GNU systems

md5sum is specific to systems that use GNU coreutils or a clone such as BusyBox. On FreeBSD and OpenBSD the utilities are called md5, sha1, sha256, and sha512. These versions offer slightly different options and features. Additionally, FreeBSD offers the "SKEIN" family of message digests. [4]

References

  1. https://linux.die.net/man/1/md5sum
  2. "HowToMD5SUM - Community Help Wiki". help.ubuntu.com. Retrieved 2018-11-02.
  3. "md5sum: Compute MD5 Checksums". rdrr.io. Retrieved 2018-11-02.
  4. md5(1). FreeBSD General Commands Manual.