Fork (software development)


A timeline chart showing the evolution of Linux distributions, with each split in the diagram being called "a fork"

In software engineering, a project fork happens when developers take a copy of source code from one software package and start independent development on it, creating a distinct and separate piece of software. The term often implies not merely a development branch, but also a split in the developer community; as such, it is a form of schism. [1] Common reasons for forking include differing user preferences and stagnant or discontinued development of the original software.


Free and open-source software is, by definition, software that may be forked from the original development team without prior permission and without violating copyright law. However, licensed forks of proprietary software (e.g. Unix) also happen.

Etymology

The word "fork" has been used to mean "to divide in branches, go separate ways" as early as the 14th century. [2] In the software environment, the word evokes the fork system call, which causes a running process to split itself into two (almost) identical copies that (typically) diverge to perform different tasks. [3]
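As a rough illustration of the system call that the term evokes, the following sketch uses Python's `os.fork` on a Unix-like system. The function name `fork_and_report` and the message text are hypothetical, chosen only for this example. After the call, parent and child run the same code but take different paths depending on the return value.

```python
import os


def fork_and_report():
    """Fork the current process; return (child_message, child_exit_code).

    Hypothetical helper for illustration; requires a Unix-like system.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()  # returns 0 in the child, the child's PID in the parent
    if pid == 0:
        # Child: an almost identical copy that now diverges from the parent.
        os.close(read_fd)
        os.write(write_fd, b"hello from child")
        os.close(write_fd)
        os._exit(0)  # leave immediately without running parent cleanup code
    # Parent: read what the child wrote, then collect its exit status.
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as reader:
        message = reader.read()
    _, status = os.waitpid(pid, 0)
    return message.decode(), os.WEXITSTATUS(status)


if __name__ == "__main__":
    print(fork_and_report())
```

The pipe is used only so that the parent can observe what the child did; the split itself is the single `os.fork()` call.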

In the context of software development, "fork" was used in the sense of creating a revision control "branch" by Eric Allman as early as 1980, in the context of the Source Code Control System: [4]

Creating a branch "forks off" a version of the program.

The term was in use on Usenet by 1983 for the process of creating a subgroup to move topics of discussion to. [5]

"Fork" is not known to have been used in the sense of a community schism during the origins of Lucid Emacs (now XEmacs) (1991) or the Berkeley Software Distributions (BSDs) (1993–1994); Russ Nelson used the term "shattering" for this sort of fork in 1993, attributing it to John Gilmore. [6] However, "fork" was in use in the present sense by 1995 to describe the XEmacs split, [7] and was an understood usage in the GNU Project by 1996. [8]

Forking of free and open-source software

Free and open-source software may be legally forked without prior approval of those currently developing, managing, or distributing the software per both The Free Software Definition and The Open Source Definition: [9]

The freedom to distribute copies of your modified versions to others (freedom 3). By doing this, you can give the whole community a chance to benefit from your changes. Access to the source code is a precondition for this. (The Free Software Definition [10])

3. Derived Works: The license must allow modifications and derived works, and must allow them to be distributed under the same terms as the license of the original software. (The Open Source Definition [11])

In free software, forks often result from a schism over different goals or from personality clashes. In a fork, both parties start from nearly identical code bases, but typically only the larger group, or whoever controls the web site, retains the full original name and the associated user community. Thus, there is a reputation penalty associated with forking. [9] The relationship between the different teams can be cordial or very bitter. By contrast, a friendly fork or soft fork does not intend to compete, but aims to eventually merge with the original.

Eric S. Raymond, in his essay Homesteading the Noosphere, [12] stated that "The most important characteristic of a fork is that it spawns competing projects that cannot later exchange code, splitting the potential developer community". He notes in the Jargon File: [13]

Forking is considered a Bad Thing—not merely because it implies a lot of wasted effort in the future, but because forks tend to be accompanied by a great deal of strife and acrimony between the successor groups over issues of legitimacy, succession, and design direction. There is serious social pressure against forking. As a result, major forks (such as the Gnu-Emacs/XEmacs split, the fissioning of the 386BSD group into three daughter projects, and the short-lived GCC/EGCS split) are rare enough that they are remembered individually in hacker folklore.

David A. Wheeler notes [9] four possible outcomes of a fork, with examples:

  1. The death of the fork. This is by far the most common case. It is easy to declare a fork, but considerable effort to continue independent development and support.
  2. A re-merging of the fork (e.g., egcs becoming "blessed" as the new version of the GNU Compiler Collection).
  3. The death of the original (e.g., the X.Org Server succeeding and XFree86 dying).
  4. Successful branching, typically with differentiation (e.g., OpenBSD and NetBSD).

Distributed revision control (DVCS) tools have popularised a less emotive use of the term "fork", blurring the distinction with "branch". [14] With a DVCS such as Mercurial or Git, the normal way to contribute to a project is to first create a personal branch of the repository, independent of the main repository, and later seek to have your changes integrated with it. Sites such as GitHub, Bitbucket and Launchpad provide free DVCS hosting that expressly supports independent branches, so the technical, social and financial barriers to forking a source code repository are greatly reduced. GitHub uses "fork" as its term for this method of contributing to a project.
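The contribution workflow described above can be sketched with a toy model. This is not how Git or Mercurial store history; the `Repo` class, its methods, and the commit identifiers are all hypothetical and exist only to show the sequence of copying a repository, working independently, and integrating the changes later.

```python
from dataclasses import dataclass, field


@dataclass
class Repo:
    """Toy repository: an owner name and an ordered list of commit ids."""
    owner: str
    commits: list[str] = field(default_factory=list)

    def commit(self, commit_id: str) -> None:
        self.commits.append(commit_id)

    def fork(self, new_owner: str) -> "Repo":
        # A fork starts as a full, independent copy of the history.
        return Repo(new_owner, list(self.commits))

    def merge_from(self, other: "Repo") -> list[str]:
        """Integrate commits from `other` that this repo lacks; return them."""
        known = set(self.commits)
        new = [c for c in other.commits if c not in known]
        self.commits.extend(new)
        return new


upstream = Repo("project", ["a1", "b2"])
personal = upstream.fork("contributor")   # personal copy of the repository
personal.commit("c3")                     # work happens independently
upstream.commit("d4")                     # upstream moves on in the meantime
merged = upstream.merge_from(personal)    # changes accepted back upstream
print(merged)            # ['c3']
print(upstream.commits)  # ['a1', 'b2', 'd4', 'c3']
```

In this model a fork that never calls `merge_from` in either direction simply drifts apart, which corresponds to the older, schism sense of the word.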

Forks often restart version numbering from values typically used for initial releases, such as 0.0.1, 0.1, or 1.0, even if the original software had reached a later version such as 3.0, 4.0, or 5.0. An exception is sometimes made when the forked software is designed to be a drop-in replacement for the original project, e.g. MariaDB for MySQL [15] or LibreOffice for OpenOffice.org.

The BSD licenses permit forks to become proprietary software, and copyleft proponents say that commercial incentives thus make proprietisation almost inevitable. (Copyleft licenses can, however, be circumvented via dual-licensing with a proprietary grant in the form of a Contributor License Agreement.) Examples of proprietary forks include macOS (based on the proprietary NeXTSTEP and the open-source FreeBSD), Cedega and CrossOver (proprietary forks of Wine, though CrossOver tracks Wine and contributes considerably), EnterpriseDB (a fork of PostgreSQL, adding Oracle compatibility features [16]), Fujitsu Supported PostgreSQL with its proprietary ESM storage system, [17] and Netezza's [18] proprietary highly scalable derivative of PostgreSQL. Some of these vendors contribute changes back to the community project, while others keep their changes as a competitive advantage.

Forking proprietary software

In proprietary software, the copyright is usually held by the employing entity, not by the individual software developers. Proprietary code is thus more commonly forked when the owner needs to develop two or more versions, such as a windowed version and a command line version, or versions for differing operating systems, such as a word processor for IBM PC compatible machines and Macintosh computers. Generally, such internal forks will concentrate on having the same look, feel, data format, and behavior between platforms so that a user familiar with one can also be productive or share documents generated on the other. This is almost always an economic decision to generate a greater market share and thus pay back the associated extra development costs created by the fork.

A notable case of proprietary forking not of this kind is the many varieties of proprietary Unix, almost all derived from AT&T Unix under license and all called "Unix", but increasingly mutually incompatible. [19] See Unix wars.

References

  1. "Schism", with its connotations, is a common usage, e.g.
  2. Entry 'fork' in Online Etymology Dictionary Archived 25 May 2012 at the Wayback Machine
  3. "The term fork is derived from the POSIX standard for operating systems: the system call used so that a process generates a copy of itself is called fork()." Robles, Gregorio; González-Barahona, Jesús M. (2012). A Comprehensive Study of Software Forks: Dates, Reasons and Outcomes (PDF). OSS 2012 The Eighth International Conference on Open Source Systems. doi:10.1007/978-3-642-33442-9_1. Archived (PDF) from the original on 2 December 2013. Retrieved 20 October 2012.
  4. Allman, Eric. "An Introduction to the Source Code Control System." Archived 6 November 2014 at the Wayback Machine Project Ingres, University of California at Berkeley, 1980.
  5. Can somebody fork off a "net.philosophy"? (John Gilmore, net.misc, 18 January 1983)
  6. Shattering — good or bad? (Russell Nelson, gnu.misc.discuss, 1 October 1993)
  7. Re: Hey Franz: 32K Windows SUCK!!!!! (Bill Dubuque, cu.cs.macl.info, 21 September 1995)
  8. Lignux? (Marcus G. Daniels, gnu.misc.discuss, 7 June 1996)
  9. Why Open Source Software / Free Software (OSS/FS, FLOSS, or FOSS)? Look at the Numbers!: Forking Archived 5 April 2006 at the Wayback Machine (David A. Wheeler)
  10. Stallman, Richard. "The Free Software Definition". Free Software Foundation. Archived from the original on 14 October 2013. Retrieved 15 October 2013.
  11. "The Open Source Definition". The Open Source Initiative. 7 July 2006. Archived from the original on 15 October 2013. Retrieved 15 October 2013.
  12. Raymond, Eric S. (15 August 2002). "Promiscuous Theory, Puritan Practice". catb.org. Archived from the original on 6 October 2006.
  13. Forked Archived 8 November 2011 at the Wayback Machine (Jargon File; first added to v4.2.2 Archived 14 January 2012 at the Wayback Machine, 20 August 2000)
  14. e.g.Willis, Nathan (15 January 2015). "An "open governance" fork of Node.js". LWN.net. Archived from the original on 21 April 2015. Retrieved 15 January 2015. Forks are a natural part of the open development model—so much so that GitHub famously plasters a "fork your own copy" button on almost every page. See also Nyman, Linus (2015). Understanding Code Forking in Open Source Software (PhD). Hanken School of Economics. p. 57. hdl:10138/153135. Where practitioners have previously had rather narrow definitions of a fork, [...] the term now appears to be used much more broadly. Actions that would traditionally have been called a branch, a new distribution, code fragmentation, a pseudo-fork, etc. may all now be called forks by some developers. This appears to be in no insignificant part due to the broad definition and use of the term fork by GitHub.
  15. Forked a project, where do my version numbers start? Archived 26 August 2011 at the Wayback Machine
  16. EnterpriseDB Archived 13 November 2006 at the Wayback Machine
  17. Fujitsu Supported PostgreSQL Archived 20 August 2006 at the Wayback Machine
  18. Netezza Archived 13 November 2006 at the Wayback Machine
  19. Fear of forking Archived 17 December 2012 at the Wayback Machine – An essay about forking in free software projects, by Rick Moen