Diff3

Last updated
diff3
Developer(s) AT&T Bell Laboratories
Initial releaseJanuary 1979;43 years ago (1979-01)
Operating system Unix and Unix-like
Type Command

diff3 is a Unix utility to compare three files and show any differences among them. diff3 can also merge files, implementing a three-way merge.

Contents

History and implementations

diff3 originally appeared in Version 7 Unix of 1979. A very similar version was found in Berkeley Software Distribution (BSD) and was inherited into OpenBSD and probably Solaris. [1] [2]

On most modern Unix-like systems, even BSD-derived ones like FreeBSD and NetBSD, the diff3 provided is GNU's version in diffutils (formerly GNU Tools). This version has the ability to merge files on its own without relying on ed. It internally invokes diff to do the comparison, but can use any other compatible tool too. [3]

The command is available as a separate package for Microsoft Windows as part of the UnxUtils collection of native Win32 ports of common GNU Unix-like utilities. [4]

"Diff3" has also become a generic name for the three-way-merge algorithm, specifically one based on reconciling two different diffs stemming from the first source.

Merging from a common ancestor

When two people have made changes to copies of the same file, diff3 can produce a merged output that contains both sets of changes together with warnings about conflicts. diff3 can merge three or more sets of changes to a file by merging two change sets at a time.

diff3 can incorporate changes from two modified versions into a common preceding version. This enables users to merge the sets of changes represented by the two newer files. This can be enabled using a command like this: diff3 mine older yours.

This is like subtracting the file older from the file yours and adding the result to the file mine, or as merging into mine the changes that would turn older into yours. This merging is well-defined as long as mine and older match in the neighborhood of each such change. This fails to be true when all three input files differ or when only older differs; this is a conflict. When all three input files differ, the conflict is an overlap.

diff3 has several methods to handle overlaps and conflicts. It can omit overlaps or conflicts, or select only overlaps, or mark conflicts with special <<<<<<< and >>>>>>> lines.

diff3 can output the merge results as an ed script that can be applied to the first file to yield the merged output. However, directly generating the merged output bypasses some problems with ed.

See also

Related Research Articles

troff, short for "typesetter roff", is the major component of a document processing system developed by AT&T Corporation for the Unix operating system. troff and the related nroff were both developed from the original roff.

In computing, the utility diff is a data comparison tool that computes and displays the differences between the contents of files. Unlike edit distance notions used for other purposes, diff is line-oriented rather than character-oriented, but it is like Levenshtein distance in that it tries to determine the smallest set of deletions and insertions to create one file from the other. The utility displays the changes in one of several standard formats, such that both humans or computers can parse the changes, and use them for patching.

In computing, tar is a computer software utility for collecting many files into one archive file, often referred to as a tarball, for distribution or backup purposes. The name is derived from "tape archive", as it was originally developed to write data to sequential I/O devices with no file system of their own. The archive data sets created by tar contain various file system parameters, such as name, timestamps, ownership, file-access permissions, and directory organization.

ls

In computing, ls is a command to list computer files in Unix and Unix-like operating systems. ls is specified by POSIX and the Single UNIX Specification. When invoked without any arguments, ls lists the files in the current working directory. The command is also available in the EFI shell. In other environments, such as DOS, OS/2, and Microsoft Windows, similar functionality is provided by the dir command. The numerical computing environments MATLAB and GNU Octave include an ls function with similar functionality.

In software development, Make is a build automation tool that automatically builds executable programs and libraries from source code by reading files called Makefiles which specify how to derive the target program. Though integrated development environments and language-specific compiler features can also be used to manage a build process, Make remains widely used, especially in Unix and Unix-like operating systems.

comm Standard UNIX utility for comparing files

The comm command in the Unix family of computer operating systems is a utility that is used to compare two files for common and distinct lines. comm is specified in the POSIX standard. It has been widely available on Unix-like operating systems since the mid to late 1980s.

patch (Unix)

The computer tool patch is a Unix program that updates text files according to instructions contained in a separate file, called a patch file. The patch file is a text file that consists of a list of differences and is produced by running the related diff program with the original and updated file as arguments. Updating files with patch is often referred to as applying the patch or simply patching the files.

xargs is a command on Unix and most Unix-like operating systems used to build and execute commands from standard input. It converts input from standard input into arguments to a command.

dd is a command-line utility for Unix, Unix-like operating systems and beyond, the primary purpose of which is to convert and copy files.

tr (Unix) Command in Unix-like operating systems

tr is a command in Unix, Plan 9, Inferno, and Unix-like operating systems. It is an abbreviation of translate or transliterate, indicating its operation of replacing or removing specific characters in its input data set.

pax (command)

pax is an archiving utility available for various operating systems and defined since 1995. Rather than sort out the incompatible options that have crept up between tar and cpio, along with their implementations across various versions of Unix, the IEEE designed a new archive utility that could support various archive formats with useful options from both archivers. The pax command is available on Unix and Unix-like operating systems and on IBM i, Microsoft Windows NT, and Windows 2000.

paste is a Unix command line utility which is used to join files horizontally by outputting lines consisting of the sequentially corresponding lines of each file specified, separated by tabs, to the standard output. It is effectively the horizontal equivalent to the utility cat command which operates on the vertical plane of two or more files.

cmp (Unix)

In computing, cmp is a command-line utility on Unix and Unix-like operating systems that compares two files of any type and writes the results to the standard output. By default, cmp is silent if the files are the same; if they differ, the byte and line number at which the first difference occurred is reported. The command is also available in the OS-9 shell.

Source Code Control System (SCCS) is a version control system designed to track changes in source code and other text files during the development of a piece of software. This allows the user to retrieve any of the previous versions of the original source code and the changes which are stored. It was originally developed at Bell Labs beginning in late 1972 by Marc Rochkind for an IBM System/370 computer running OS/360.

init UNIX system component

In Unix-based computer operating systems, init is the first process started during booting of the computer system. Init is a daemon process that continues running until the system is shut down. It is the direct or indirect ancestor of all other processes and automatically adopts all orphaned processes. Init is started by the kernel during the booting process; a kernel panic will occur if the kernel is unable to start it. Init is typically assigned process identifier 1.

tail is a program available on Unix, Unix-like systems, FreeDOS and MSX-DOS used to display the tail end of a text file or piped data.

yes (Unix) Unix command

yes is a command on Unix and Unix-like operating systems, which outputs an affirmative response, or a user-defined string of text, continuously until killed.

The tsort program is a command line utility on Unix and Unix-like platforms, that performs a topological sort on its input. As of 2017, it is part of the POSIX.1 standard.

The fmt command in Unix, Plan 9, Inferno, and Unix-like operating systems is used to format natural language text for humans to read.

sum is a legacy utility available on some Unix and Unix-like operating systems. This utility outputs the checksum of each argument file, as well as the number of blocks they take on disk.

References

  1. diff3(1)    OpenBSD General Commands Manual
  2. "diff3 - man pages section 1: User Commands". Solaris 11.3 documentation.
  3. diff3(1)    Linux General Commands Manual
  4. "Native Win32 ports of some GNU utilities". unxutils.sourceforge.net.