Software repository


A software repository, or "repo" for short, is a storage location from which software packages may be retrieved and installed on a computer. Repositories often also hold metadata about the packages they store. A package manager installed on the local machine can typically use a repository to install or update local software.

Package manager: software tools for handling software packages

A package manager or package-management system is a collection of software tools that automates the process of installing, upgrading, configuring, and removing computer programs for a computer's operating system in a consistent manner.



Many software publishers and other organizations maintain servers on the Internet for this purpose, either free of charge or for a subscription fee. Repositories may serve only particular programs, such as CPAN for the Perl programming language, or an entire operating system. Operators of such repositories typically provide a package management system: tools intended to search for, install, and otherwise manipulate software packages from the repositories. For example, many Linux distributions use the Advanced Package Tool (APT), commonly found in Debian-based distributions, or yum, found in Red Hat-based distributions. There are also multiple independent package management systems, such as pacman, used in Arch Linux, and equo, found in Sabayon Linux.
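As an illustration of how a package management system can use repository metadata, the following minimal Python sketch works out an installation order from a package index. The index contents, the package names, and the `install_order` function are hypothetical and do not reflect the format used by APT, yum, pacman, or any other real tool.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical repository metadata: package name -> version and direct dependencies.
REPO_INDEX = {
    "editor": {"version": "2.1", "depends": ["libui", "libtext"]},
    "libui": {"version": "1.4", "depends": ["libcore"]},
    "libtext": {"version": "0.9", "depends": ["libcore"]},
    "libcore": {"version": "3.0", "depends": []},
}


def install_order(package, index):
    """Return the package and its transitive dependencies, dependencies first."""
    graph = {}
    pending = [package]
    while pending:
        name = pending.pop()
        if name in graph:
            continue  # already visited
        if name not in index:
            raise KeyError(f"package not found in repository: {name}")
        graph[name] = set(index[name]["depends"])
        pending.extend(graph[name])
    # TopologicalSorter treats the mapped values as predecessors,
    # so every dependency is emitted before the package that needs it.
    return list(TopologicalSorter(graph).static_order())


order = install_order("editor", REPO_INDEX)
print(order)
```

In this sketch `libcore` is always installed first and `editor` last; the relative order of `libui` and `libtext` is not significant.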

Internet: Global system of connected computer networks

The Internet is the global system of interconnected computer networks that use the Internet protocol suite (TCP/IP) to link devices worldwide. It is a network of networks that consists of private, public, academic, business, and government networks of local to global scope, linked by a broad array of electronic, wireless, and optical networking technologies. The Internet carries a vast range of information resources and services, such as the inter-linked hypertext documents and applications of the World Wide Web (WWW), electronic mail, telephony, and file sharing.

The Comprehensive Perl Archive Network (CPAN) is a repository of over 250,000 software modules and accompanying documentation for 39,000 distributions, written in the Perl programming language by over 12,000 contributors. CPAN can denote either the archive network or the Perl program that acts as an interface to the network and as an automated software installer. Most software on CPAN is free and open source software. CPAN was conceived in 1993 and has been active online since October 1995. It is based on the CTAN model and began as a place to unify the structure of scattered Perl archives.

Perl: interpreted programming language

Perl is a family of two high-level, general-purpose, interpreted, dynamic programming languages. "Perl" usually refers to Perl 5, but it may also refer to its redesigned "sister language", Perl 6.

As software repositories are designed to include useful packages, major repositories are designed to be malware-free. If a computer is configured to use a digitally signed repository from a reputable vendor, coupled with an appropriate permissions system, the threat of malware to that system is significantly reduced. As a side effect, many systems that have these capabilities do not require anti-malware software such as anti-virus software. [1]
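One part of using a signed repository can be sketched in Python: checking a downloaded package against a digest listed in trusted metadata. This is a simplified stand-in, not a full signature scheme. It assumes the metadata itself has already been authenticated (for example by a public-key signature check, which is omitted here), and the package name and contents are hypothetical.

```python
import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify_package(data: bytes, expected_sha256: str) -> bool:
    """Check downloaded bytes against the digest from trusted metadata."""
    # compare_digest performs a constant-time comparison of the two strings.
    return hmac.compare_digest(sha256_hex(data), expected_sha256.lower())


# Hypothetical package and metadata; the dict stands in for signed repository metadata.
package_bytes = b"example package contents"
trusted_metadata = {"example-pkg": sha256_hex(package_bytes)}

ok = verify_package(package_bytes, trusted_metadata["example-pkg"])
tampered = verify_package(package_bytes + b"!", trusted_metadata["example-pkg"])
print(ok, tampered)
```

A package manager following this pattern would refuse to install any package whose digest does not match.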

MalwareMustDie: organization

MalwareMustDie is a white-hat security research workgroup launched in August 2012. It is a registered nonprofit organization that serves as a forum for IT professionals and security researchers working together to reduce malware infection on the Internet. The group is known for its malware analysis blog and maintains a list of the Linux malware research and botnet analyses it has completed. The team communicates information about malware in general and advocates for better detection of Linux malware.

Most file systems have methods to assign permissions or access rights to specific users and groups of users. These permissions control the ability of the users to view, change, navigate, and execute the contents of the file system.

Antivirus software: computer software to defend against malicious computer viruses

Antivirus software, or anti-virus software, also known as anti-malware, is a computer program used to prevent, detect, and remove malware.

Most major Linux distributions have many repositories around the world that mirror the main repository.
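A client can take advantage of mirrors by trying them in turn until one responds. The sketch below shows that fallback logic only; the mirror URLs, the file path, and the `fake_fetch` function are hypothetical, and a real client would pass in a function that performs an actual download.

```python
from typing import Callable, Iterable


def fetch_from_mirrors(path: str, mirrors: Iterable[str],
                       fetch: Callable[[str], bytes]) -> bytes:
    """Try each mirror in order and return the first successful download."""
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as exc:  # ConnectionError and timeouts are OSError subclasses
            errors.append(f"{url}: {exc}")
    raise OSError("all mirrors failed: " + "; ".join(errors))


def fake_fetch(url: str) -> bytes:
    """Hypothetical downloader used for illustration: the first mirror is down."""
    if url.startswith("https://mirror-a.example"):
        raise ConnectionError("unreachable")
    return b"data from " + url.encode()


MIRRORS = ["https://mirror-a.example/repo/", "https://mirror-b.example/repo"]
result = fetch_from_mirrors("pool/pkg.tar", MIRRORS, fake_fetch)
print(result)
```

Because every mirror carries the same content, combining this fallback with a digest check on the downloaded file lets a client use mirrors it does not fully trust.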

Linux distribution: Operating system based on the Linux kernel

A Linux distribution is an operating system made from a software collection, which is based upon the Linux kernel and, often, a package management system. Linux users usually obtain their operating system by downloading one of the Linux distributions, which are available for a wide variety of systems ranging from embedded devices and personal computers to powerful supercomputers.

Package management system vs. package development process

A package management system is different from a package development process.

A software package development process is a system for developing software packages. Packages make it easier to reuse and share code, e.g., via a software repository. A formal system for package checking can help expose bugs, thereby potentially making it easier to produce trustworthy software. This in turn can help improve productivity for people who produce and use software, as part of a software development process or software development methodology.

A typical use of a package management system is to facilitate the integration of code from possibly different sources into a coherent stand-alone operating unit. Thus, a package management system might be used to produce a distribution of Linux, possibly a distribution tailored to a specific restricted application.

A package development process, by contrast, is used to manage the co-development of code and documentation of a collection of functions or routines with a common theme, producing thereby a package of software functions that typically will not be complete and usable by themselves. A good package development process will help users conform to good documentation and coding practices, integrating some level of unit testing. The table below provides examples of package development processes.

Selected repositories

The following table lists a few languages with repositories for contributed software. The "Autochecks" column describes the routine checks done.

Very few people are able to test their software under multiple operating systems, with different versions of the core code, and with the other contributed packages they may use. For R, the Comprehensive R Archive Network (CRAN) runs such tests routinely. To see why this is valuable, suppose Sally contributes a package A. Sally runs only the current version of the software under one version of Microsoft Windows, and has tested her package only in that environment. At more or less regular intervals, CRAN tests Sally's contribution under a dozen combinations of operating systems and versions of the core R language software. If one of them generates an error, she receives that error message. With luck, the message may suffice for her to fix the error, even if she cannot replicate it with the hardware and software she has. Next, suppose John contributes to the repository a package B that uses package A. Package B passes all the tests and is made available to users. Later, Sally submits an improved version of A which, unfortunately, breaks B. The autochecks make it possible to provide information to John so he can fix the problem.
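The last step of this example, working out which packages need rechecking when A changes, can be sketched as a reverse-dependency lookup. This is an illustration only and not CRAN's actual implementation; packages C and D and the function name are hypothetical additions to the Sally and John example.

```python
# Direct dependencies: B uses A (as in the example); C and D are hypothetical extras.
DEPENDS = {"A": [], "B": ["A"], "C": ["B"], "D": []}


def reverse_dependencies(changed, depends):
    """Return the packages that directly or indirectly use `changed`."""
    # Invert the dependency map: package -> set of packages that use it.
    used_by = {name: set() for name in depends}
    for pkg, deps in depends.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(pkg)

    affected = set()
    stack = [changed]
    while stack:
        for user in used_by.get(stack.pop(), ()):
            if user not in affected:
                affected.add(user)
                stack.append(user)
    return sorted(affected)


to_recheck = reverse_dependencies("A", DEPENDS)
print(to_recheck)
```

When a new version of A arrives, a repository following this approach would rerun the checks for B and C and report any failures to their maintainers, while D is left alone.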

This example exposes both a strength and a weakness in the R contributed-package system: CRAN supports this kind of automated testing of contributed packages, but packages contributed to CRAN need not specify the versions of other contributed packages that they use. Procedures for requesting specific versions of packages exist, but contributors might not use those procedures.
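To show what specifying versions of other packages can look like, here is a minimal Python sketch of a version-constraint check. The constraint syntax, the version numbers, and the function names are assumptions made for illustration; this is not R's own mechanism, and it handles only purely numeric, dot-separated versions.

```python
import operator

# Supported comparison operators in the hypothetical 'OP VERSION' syntax.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.4' into (1, 4) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, constraints):
    """Return True if `version` meets every constraint such as '>= 1.0'."""
    for constraint in constraints:
        op, required = constraint.split()
        if not OPS[op](parse_version(version), parse_version(required)):
            return False
    return True


# Hypothetical declaration: package B works with A from 1.0 up to, but not including, 2.0.
b_requires_a = [">= 1.0", "< 2.0"]

old_a_ok = satisfies("1.4", b_requires_a)
new_a_ok = satisfies("2.0", b_requires_a)
print(old_a_ok, new_a_ok)
```

With such a declaration in place, a repository or installer could detect that a new major version of A falls outside what B was tested against, before B breaks for users.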

Beyond this, a repository such as CRAN that runs regular checks of contributed packages effectively provides an extensive, if ad hoc, test suite for development versions of the core language. If Sally (in the example above) gets an error message she does not understand or thinks is inappropriate, especially from a development version of the language, she can ask the language's core development team for help, and with R contributors often do. In this way, the repository can contribute to improving the quality of the core language software.

Language / purpose | Package development process | Repository | Install methods | Collaborative development platform | Autochecks
Haskell Common Architecture for Building Applications and Libraries [2] Hackage
Java Maven [3]
Julia [4]
Common Lisp Quicklisp [5]
.NET NuGet NuGet [6]
Node.js NPM [7]
Perl CPAN PPM [8]
PHP PEAR, Composer PECL, Packagist
Python Setuptools PyPI pip, EasyInstall, PyPM, Anaconda
R R CMD check process [9] [10] CRAN [11] install.packages [12] R-Forge [13] Roughly weekly on 12 platforms or combinations of different versions of R (devel, prerel, patched, release) with up to 7 different operating systems (different versions of Linux, Windows, and Mac).
Ruby RubyGems Ruby Application Archive RubyForge
Rust Cargo [14] Crates [15] Cargo [14]

(Parts of this table were copied from a "List of Top Repositories by Programming Language" on Stack Overflow. [16])

Many other programming languages, among them C, C++, and Fortran, do not possess a central software repository with universal scope. Notable repositories with limited scope include:

Repository managers

Software to manage repositories (repository managers) includes:

See also

Related Research Articles

APT (Package Manager): Free software package management system

Advanced Package Tool, or APT, is a free-software user interface that works with core libraries to handle the installation and removal of software on Debian, Ubuntu, and related Linux distributions. APT simplifies the process of managing software on Unix-like computer systems by automating the retrieval, configuration and installation of software packages, either from precompiled files or by compiling source code.

Arch Linux is a Linux distribution for computers based on x86-64 architectures.

Dependency hell is a colloquial term for the frustration of some software users who have installed software packages which have dependencies on specific versions of other software packages.

Apache Maven: build automation tool used primarily for Java projects

Maven is a build automation tool used primarily for Java projects.

LAMP (software bundle): software bundle

LAMP is an archetypal model of web service stacks, named as an acronym of the names of its original four open-source components: the Linux operating system, the Apache HTTP Server, the MySQL relational database management system (RDBMS), and the PHP programming language. The LAMP components are largely interchangeable and not limited to the original selection. As a solution stack, LAMP is suitable for building dynamic web sites and web applications.

RubyGems is a package manager for the Ruby programming language that provides a standard format for distributing Ruby programs and libraries, a tool designed to easily manage the installation of gems, and a server for distributing them. It was created by Chad Fowler, Jim Weirich, David Alan Black, Paul Brannan and Richard Kilmer during RubyConf 2004.

Conary (package manager)

Conary is a free software package management system created by rPath and distributed under the terms of the Apache License Version 2.0. It was relicensed from the GPLv3 in 2013. It focuses on installing packages through automated dependency resolution against distributed online repositories, and providing a concise and easy-to-use Python-based description language to specify how to build a package. It is used by Foresight Linux and rPath Linux.

Apache Ivy is a transitive package manager. It is a sub-project of the Apache Ant project, with which Ivy works to resolve project dependencies. An external XML file defines project dependencies and lists the resources necessary to build a project. Ivy then resolves and downloads resources from an artifact repository: either a private repository or one publicly available on the Internet.

Apache Buildr

Buildr is an open-source build system mainly intended to build Java applications. It gives developers a full-blown scripting language (Ruby) for writing their build scripts, something usually missing in XML-based build environments such as Apache Ant or Apache Maven.

npm (software): Node Package Manager

npm is a package manager for the JavaScript programming language. It is the default package manager for the JavaScript runtime environment Node.js. It consists of a command line client, also called npm, and an online database of public and paid-for private packages, called the npm registry. The registry is accessed via the client, and the available packages can be browsed and searched via the npm website. The package manager and the registry are managed by npm, Inc.

LuaRocks is a package manager for the Lua programming language that provides a standard format for distributing Lua modules, a tool designed to easily manage the installation of rocks, and a server for distributing them. While not included with the Lua distribution, it has been called the "de facto package manager for community-contributed Lua modules".

Joinup is a collaboration platform created by the European Commission. It is funded by the European Union via its Interoperability Solutions for Public Administrations Programme.

Anaconda (Python distribution): package manager, environment manager, and Python (and related packages) distribution

Anaconda is a free and open-source distribution of the Python and R programming languages for scientific computing that aims to simplify package management and deployment. Package versions are managed by the package management system conda. The Anaconda distribution is used by over 15 million users and includes more than 1500 popular data-science packages suitable for Windows, Linux, and macOS.

A binary repository manager is a software tool designed to optimize the download and storage of binary files used and produced in software development. It centralizes the management of all the binary artifacts generated and used by the organization to overcome the complexity arising from the diversity of binary artifact types, their position in the overall workflow and the dependencies between them.


ProGet is a package management system designed by the Inedo software company. It allows users to host and manage personal or enterprise-wide packages, applications, and components. It was originally designed as a private NuGet manager and symbol and source server. Since 2015, ProGet has expanded its support, added enterprise-grade features, and been targeted to fit into a DevOps methodology. Enterprises use ProGet to “package applications and components” with the aim of ensuring software is built only once and deployed consistently across environments.


  1. "Coping with Computer Viruses". itmWEB. Archived October 14, 2007, at the Wayback Machine.
  2. "The Haskell Cabal | Overview". Retrieved 2019-03-25.
  3. "Maven – Welcome to Apache Maven". Retrieved 2019-03-25.
  4. "Julia Package Listing". Retrieved 2019-03-25.
  5. "Quicklisp beta". Retrieved 2019-03-25.
  6. karann-msft. "NuGet Package Manager UI Reference". Retrieved 2019-03-25.
  7. "npm". Retrieved 2019-03-25.
  8. "Installing Perl Modules -". Retrieved 2019-03-25.
  9. Leisch, Friedrich. "Creating R Packages: A Tutorial" (PDF).
  10. Graves, Spencer B.; Dorai-Raj, Sundar. "Creating R Packages, Using CRAN, R-Forge, And Local R Archive Networks And Subversion (SVN) Repositories" (PDF).
  11. "The Comprehensive R Archive Network". Retrieved 2019-03-25.
  12. "R Installation and Administration". Retrieved 2019-03-25.
  13. "R-Forge: Welcome". Retrieved 2019-03-25.
  14. "The Cargo Book". Documentation. Rust Programming Language. Retrieved 2019-08-26.
  15. "Rust Package Registry". Retrieved 2019-08-26.
  16. "List of Top Repositories by Programming Language". Stack Overflow. Retrieved 2010-04-14.
  17. "Apache Archiva: The Build Artifact Repository Manager". The Apache Software Foundation. Retrieved 2013-04-17. Apache Archiva[...] is an extensible repository management software that helps taking care of your own personal or enterprise-wide build artifact repository.
  18. "ProGet". Inedo. Retrieved 2016-02-11. Consistency, continuity, compliance – all in one centralized universal package manager with ProGet.
  19. "Artifactory. Manage Your Binaries". JFrog. Retrieved 2014-10-20. As the first Binary Repository Management solution, Artifactory has changed the way binaries are controlled, stored and managed throughout the software release cycle.
  20. "MyGet: Hosted NuGet, NPM, Bower and Vsix". MyGet. Retrieved 2013-03-13. MyGet hosts thousands of NuGet, Bower and NPM repositories used by companies and individual developers worldwide. MyGet comes with built-in Build Services, and also provides friction-free integration with GitHub, BitBucket and Visual Studio Online.
  21. Canals, Armando (2018-06-25). "Continuous package publishing, part I: introduction to package management in CI/CD". [packagecloud] hosts private and public package repositories for many different package types and works seamlessly with different package managers.
  22. "Package Drone". Retrieved 2015-01-23. The idea is to have a workflow of Tycho Compile -> publish to repo -> Tycho Compile (using deployed artifacts). And some repository tools like cleanup, freezing, validation.
  23. "Nexus Repository Manager". Sonatype. Retrieved 2014-05-21. Nexus Pro gives you more information, more control, and better collaboration across your team than ever before. And it works with build tools like Ant, Ivy, Gradle, Maven, SBT and others. Use Nexus as the foundation for your complete Component Lifecycle Management approach.
  24. "Pulp | software repository management". Retrieved 2017-07-11.