Software repository

A software repository, or "repo" for short, is a storage location from which software packages may be retrieved and installed on a computer.

Overview

Many software publishers and other organizations maintain servers on the Internet for this purpose, either free of charge or for a subscription fee. Repositories may be dedicated to particular programs, such as CPAN for the Perl programming language, or serve an entire operating system. Operators of such repositories typically provide a package management system: tools to search for, install and otherwise manipulate software packages from the repositories. For example, many Linux distributions use the Advanced Packaging Tool (APT), commonly found in Debian-based distributions, or yum, found in Red Hat-based distributions. There are also multiple independent package management systems, such as pacman, used in Arch Linux, and equo, found in Sabayon Linux.

Because software repositories are intended to hold useful packages, major repositories are designed to be malware-free. Configuring a computer to use a digitally signed repository from a reputable vendor, combined with an appropriate permissions system, significantly reduces the threat of malware to that computer. As a side effect, many systems that have these capabilities do not require anti-malware software such as anti-virus software. [1]
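One part of what a signed repository makes possible can be sketched in a few lines. The example below assumes that the client already holds a repository index whose signature has been checked (that step is not shown) and that the index maps each package file name to a SHA-256 digest. The function names, the file name and the index layout are hypothetical and do not reproduce the format of any particular package manager.

```python
from __future__ import annotations

import hashlib
import hmac


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a hexadecimal string."""
    return hashlib.sha256(data).hexdigest()


def verify_package(name: str, payload: bytes, trusted_index: dict[str, str]) -> bool:
    """Accept a download only if its digest matches the trusted index.

    trusted_index is assumed to come from an index file whose digital
    signature was verified beforehand (hypothetical layout: name -> digest).
    """
    expected = trusted_index.get(name)
    if expected is None:
        # The repository never published this file, so refuse it.
        return False
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sha256_hex(payload), expected)


good_payload = b"example package contents"
trusted_index = {"hello-1.0.tar.gz": sha256_hex(good_payload)}

print(verify_package("hello-1.0.tar.gz", good_payload, trusted_index))  # True
print(verify_package("hello-1.0.tar.gz", b"tampered", trusted_index))   # False
print(verify_package("unknown.tar.gz", good_payload, trusted_index))    # False
```

A file that was altered in transit or on a compromised mirror fails the comparison, so it is never handed to the installer.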

Most major Linux distributions have many repositories around the world that mirror the main repository.
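How a client might take advantage of mirrors can be illustrated with a short sketch. It tries each mirror in turn and falls back to the next one when a download fails. The mirror URLs, the package path and the injected fetch function are hypothetical; a real client would pass a function that performs an HTTP request and would usually also verify what it downloaded.

```python
from typing import Callable, Iterable


def fetch_from_mirrors(path: str, mirrors: Iterable[str],
                       fetch: Callable[[str], bytes]) -> bytes:
    """Return the file at path from the first mirror that answers."""
    errors = []
    for base in mirrors:
        url = base.rstrip("/") + "/" + path.lstrip("/")
        try:
            return fetch(url)
        except OSError as exc:
            # Remember the failure and move on to the next mirror.
            errors.append(f"{url}: {exc}")
    raise OSError("all mirrors failed: " + "; ".join(errors))


def fake_fetch(url: str) -> bytes:
    """Stand-in for a network download: the first mirror is 'down'."""
    if url.startswith("https://mirror-a.example"):
        raise OSError("connection refused")
    return b"contents of " + url.encode()


MIRRORS = ["https://mirror-a.example/repo", "https://mirror-b.example/repo/"]
data = fetch_from_mirrors("pool/hello.pkg", MIRRORS, fake_fetch)
print(data.decode())  # contents of https://mirror-b.example/repo/pool/hello.pkg
```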

Package management system vs. package development process

A package management system is different from a package development process.

A package manager or package management system is a collection of software tools that automates the process of installing, upgrading, configuring, and removing computer programs for a computer's operating system in a consistent manner.

A software package development process is a system for developing software packages. Packages make it easier to reuse and share code, e.g., via a software repository. A formal system for package checking can help expose bugs, thereby potentially making it easier to produce trustworthy software. This in turn can help improve productivity for people who produce and use software, as part of a software development process or software development methodology.

A typical use of a package management system is to facilitate the integration of code from possibly different sources into a coherent stand-alone operating unit. Thus, a package management system might be used to produce a distribution of Linux, possibly a distribution tailored to a specific restricted application.
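Integrating packages from different sources requires, among other things, installing each package after the packages it depends on. The sketch below derives such an order with a topological sort from the Python standard library (Python 3.9 or later). The package names and the dependency table are made up for illustration, and real package managers also handle versions, conflicts and optional dependencies, which are ignored here.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical dependency table: package -> packages it needs first.
DEPENDS = {
    "app": {"libnet", "libui"},
    "libnet": {"libc"},
    "libui": {"libc"},
    "libc": set(),
}


def install_order(depends):
    """Return the packages so that every dependency precedes its users."""
    return list(TopologicalSorter(depends).static_order())


order = install_order(DEPENDS)
print(order)  # libc comes first and app last; the middle two may swap

try:
    install_order({"x": {"y"}, "y": {"x"}})
except CycleError:
    print("circular dependency: no valid install order")
```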

A package development process, by contrast, is used to manage the co-development of code and documentation for a collection of functions or routines with a common theme, thereby producing a package of software functions that typically is not complete and usable by itself. A good package development process helps users conform to good documentation and coding practices and integrates some level of unit testing. The table below provides examples of package development processes.

Selected repositories

The following table lists a few languages with repositories for contributed software. The "Autochecks" column describes the routine checks done.

Very few people have the ability to test their software under multiple operating systems with different versions of the core code and with other contributed packages they may use. For R, the Comprehensive R Archive Network (CRAN) runs tests routinely. To see how this is valuable, suppose Sally contributes a package A. Sally runs only the current version of the software under one version of Microsoft Windows, and has tested the package only in that environment. At more or less regular intervals, CRAN tests Sally's contribution under a dozen combinations of operating systems and versions of the core R language software. If one of them generates an error, she receives that error message. With luck, the message suffices to allow her to fix the error, even if she cannot replicate it with the hardware and software she has. Next, suppose John contributes to the repository a package B that uses package A. Package B passes all the tests and is made available to users. Later, Sally submits an improved version of A, which, unfortunately, breaks B. The autochecks make it possible to provide information to John so he can fix the problem.
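The bookkeeping behind the Sally and John example can be sketched as follows: when a package is updated, every package that uses it, directly or indirectly, is a candidate for rechecking. The table of packages is invented, the decision to follow indirect users is an assumption of this sketch, and nothing here describes the tooling CRAN actually runs.

```python
from collections import deque

# Hypothetical table: package -> packages it uses.
USES = {
    "A": set(),
    "B": {"A"},   # John's package B uses Sally's package A
    "C": {"B"},   # C uses B, so it depends on A indirectly
    "D": set(),   # unrelated package
}


def packages_to_recheck(updated, uses):
    """Return every package that directly or indirectly uses `updated`."""
    # Invert the table: package -> packages that use it.
    used_by = {name: set() for name in uses}
    for pkg, deps in uses.items():
        for dep in deps:
            used_by.setdefault(dep, set()).add(pkg)

    # Breadth-first walk over the inverted table.
    seen = set()
    queue = deque([updated])
    while queue:
        current = queue.popleft()
        for dependent in used_by.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


print(sorted(packages_to_recheck("A", USES)))  # ['B', 'C']
```

If a recheck of B fails after the update to A, the maintainer of B can be told which upstream change triggered the failure.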

This example exposes both a strength and a weakness in the R contributed-package system: CRAN supports this kind of automated testing of contributed packages, but packages contributed to CRAN need not specify the versions of other contributed packages that they use. Procedures for requesting specific versions of packages exist, but contributors might not use those procedures.
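What a declared version range would let a tool detect can be shown with a small sketch. The constraint format used here (an operator, a space and a dotted version number) is hypothetical and is not the syntax of R or of any other packaging system.

```python
import operator

# Comparison operators accepted in the hypothetical constraint format.
OPS = {
    ">=": operator.ge,
    "<=": operator.le,
    "==": operator.eq,
    ">": operator.gt,
    "<": operator.lt,
}


def parse_version(text):
    """Turn '1.10.2' into (1, 10, 2) so that parts compare numerically."""
    return tuple(int(part) for part in text.split("."))


def satisfies(installed, constraint):
    """Check an installed version against a constraint such as '>= 1.2.0'."""
    op_text, wanted = constraint.split()
    return OPS[op_text](parse_version(installed), parse_version(wanted))


def broken_requirements(installed, requirements):
    """List (name, constraint, installed version) for each unmet requirement."""
    problems = []
    for name, constraint in requirements.items():
        have = installed.get(name)
        if have is None or not satisfies(have, constraint):
            problems.append((name, constraint, have))
    return problems


# Package B declared that it works only with versions of A below 2.0.0.
INSTALLED = {"A": "2.0.0"}
B_REQUIRES = {"A": "< 2.0.0"}
print(broken_requirements(INSTALLED, B_REQUIRES))  # [('A', '< 2.0.0', '2.0.0')]
```

With such a declaration the incompatibility is visible before B is run; without it, the problem surfaces only when a check or a user hits the failure.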

Beyond this, a repository such as CRAN that runs regular checks of contributed packages provides an extensive, if ad hoc, test suite for development versions of the core language. If Sally (in the example above) gets an error message she does not understand or thinks is inappropriate, especially from a development version of the language, she can ask the language's core development team for help, as R contributors often do. In this way, the repository can contribute to improving the quality of the core language software.

| Language / purpose | Package development process | Repository | How to install | Collaborative development platform | Autochecks |
|---|---|---|---|---|---|
| Haskell | Common Architecture for Building Applications and Libraries (CABAL) | Hackage | | | |
| Java | | Maven | | | |
| Julia | | | | | |
| Common Lisp | | Quicklisp | | | |
| .NET | NuGet | NuGet | | | |
| Node.js | | NPM | | | |
| Perl | | CPAN | PPM | | |
| PHP | PEAR, Composer | PECL, Packagist | | | |
| Python | Setuptools | PyPI | pip, EasyInstall, PyPM, Anaconda | | |
| R | R CMD check process [2] [3] | CRAN | install.packages | R-Forge | Roughly weekly on 12 platforms or combinations of different versions of R (devel, prerel, patched, release) with up to 7 different operating systems (different versions of Linux, Windows, and Mac). |
| Ruby | RubyGems | Ruby Application Archive | | RubyForge | |
| TeX, LaTeX | | CTAN | | | |

(Parts of this table were copied from a "List of Top Repositories by Programming Language" on Stack Overflow. [4])

Many other programming languages, among them C, C++, and Fortran, do not possess a central software repository with universal scope. Notable repositories with limited scope include:

Repository managers

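Software to manage repositories (repository managers) includes Apache Archiva, [5] ProGet, [6] Artifactory, [7] MyGet, [8] packagecloud, [9] Package Drone, [10] Nexus Repository Manager [11] and Pulp. [12]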

References

  1. "Coping with Computer Viruses". itmWEB. Archived October 14, 2007, at the Wayback Machine.
  2. Leisch, Friedrich. "Creating R Packages: A Tutorial" (PDF).
  3. Graves, Spencer B.; Dorai-Raj, Sundar. "Creating R Packages, Using CRAN, R-Forge, And Local R Archive Networks And Subversion (SVN) Repositories" (PDF).
  4. "List of Top Repositories by Programming Language". Stack Overflow. Retrieved 2010-04-14.
  5. "Apache Archiva: The Build Artifact Repository Manager". The Apache Software Foundation. Retrieved 2013-04-17. Apache Archiva[...] is an extensible repository management software that helps taking care of your own personal or enterprise-wide build artifact repository.
  6. "ProGet". Inedo. Retrieved 2016-02-11. Consistency, continuity, compliance – all in one centralized universal package manager with ProGet.
  7. "Artifactory. Manage Your Binaries". JFrog. Retrieved 2014-10-20. As the first Binary Repository Management solution, Artifactory has changed the way binaries are controlled, stored and managed throughout the software release cycle.
  8. "MyGet: Hosted NuGet, NPM, Bower and Vsix". MyGet. Retrieved 2013-03-13. MyGet hosts thousands of NuGet, Bower and NPM repositories used by companies and individual developers worldwide. MyGet comes with built-in Build Services, and also provides friction-free integration with GitHub, BitBucket and Visual Studio Online.
  9. Canals, Armando (2018-06-25). "Continuous package publishing, part I: introduction to package management in CI/CD". circleci.com. [packagecloud] hosts private and public package repositories for many different package types and works seamlessly with different package managers.
  10. "Package Drone". Retrieved 2015-01-23. The idea is to have a workflow of Tycho Compile -> publish to repo -> Tycho Compile (using deployed artifacts). And some repository tools like cleanup, freezing, validation.
  11. "Nexus Repository Manager". Sonatype. Retrieved 2014-05-21. Nexus Pro gives you more information, more control, and better collaboration across your team than ever before. And it works with build tools like Ant, Ivy, Gradle, Maven, SBT and others. Use Nexus as the foundation for your complete Component Lifecycle Management approach.
  12. "Pulp | software repository management". pulpproject.org. Retrieved 2017-07-11.