CPAN

The Comprehensive Perl Archive Network (CPAN) is a repository of over 250,000 software modules and accompanying documentation for 39,000 distributions, written in the Perl programming language by over 12,000 contributors. [1] CPAN can denote either the archive network or the Perl program that acts as an interface to the network and as an automated software installer (somewhat like a package manager). Most software on CPAN is free and open source software. [2]

History

CPAN was conceived in 1993 and has been active online since October 1995. [3] It is based on the CTAN model and began as a place to unify the structure of scattered Perl archives. [4]

Role

Like many programming languages, Perl has mechanisms for using external libraries of code, so that a single file can hold common routines shared by several programs. Perl calls these modules. Perl modules are typically installed in one of several directories whose paths are compiled into the Perl interpreter when it is built; on Unix-like operating systems, common paths include /usr/lib/perl5, /usr/local/lib/perl5, and several of their subdirectories.

Perl comes with a small set of core modules. Some of these perform bootstrapping tasks, such as ExtUtils::MakeMaker, [5] which is used to create Makefiles for building and installing other extension modules; others, like List::Util, [6] are merely commonly used.

CPAN's main purpose is to help programmers locate modules and programs not included in the Perl standard distribution. Its structure is decentralized. Authors maintain and improve their own modules. Forking, and creating competing modules for the same task or purpose, is common. There is a third-party bug tracking system that is automatically set up for any uploaded distribution, but authors may opt to use a different bug tracking system such as GitHub. Similarly, though GitHub is a popular location to store the source for distributions, it may be stored anywhere the author prefers, or may not be publicly accessible at all. Maintainers may grant permissions to others to maintain or take over their modules, and admins may grant permissions to those wishing to take over abandoned modules. Previous versions of updated distributions are retained on CPAN until deleted by the uploader, and a secondary mirror network called BackPAN retains distributions even if they are deleted from CPAN. [7] The complete history of the CPAN and all its modules is also available as the GitPAN project, [8] which makes it easy to view the full history of every module and to maintain forks. CPAN is also used to distribute new versions of Perl, as well as related projects, such as Parrot and Raku.

Structure

Files on the CPAN are referred to as distributions. A distribution may consist of one or more modules, documentation files, or programs packaged in a common archiving format, such as a gzipped tar archive or a ZIP file. Distributions will often contain installation scripts (usually called Makefile.PL or Build.PL) and test scripts which can be run to verify the contents of the distribution are functioning properly. New distributions are uploaded to the Perl Authors Upload Server, or PAUSE (see the section Uploading distributions with PAUSE).

In 2003, distributions started to include metadata files, called META.yml, indicating the distribution's name, version, dependencies, and other useful information; however, not all distributions contain metadata. When metadata is not present in a distribution, PAUSE's software will try to analyze the code in the distribution to look for the same information; this is not necessarily very reliable. In 2010, version 2 of this specification was created [9] to be used via a new file called META.json, with the YAML format file often also included for backward compatibility.
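As a rough illustration of the kind of information such a metadata file carries, the following Python sketch reads a small META.json-style document and lists its name, version, and runtime prerequisites. The sample distribution name, versions, and prerequisites are hypothetical, and the nesting of the prereqs field (by phase, then by relationship) is an assumption about the version 2 layout rather than something stated above.

```python
import json

# Hypothetical metadata for illustration only; not taken from a real distribution.
SAMPLE_META_JSON = """
{
  "name": "Example-Dist",
  "version": "0.01",
  "meta-spec": {"version": 2},
  "prereqs": {
    "runtime": {"requires": {"List::Util": "1.45", "perl": "5.008"}}
  }
}
"""


def summarize_meta(text):
    """Extract the name, version and runtime prerequisites from metadata text."""
    meta = json.loads(text)
    # Assumed layout: prereqs -> phase ("runtime") -> relationship ("requires").
    runtime = meta.get("prereqs", {}).get("runtime", {}).get("requires", {})
    return {
        "name": meta["name"],
        "version": meta["version"],
        "runtime_requires": dict(runtime),
    }


summary = summarize_meta(SAMPLE_META_JSON)
print(summary["name"], summary["version"])
for module, minimum in sorted(summary["runtime_requires"].items()):
    print("  requires", module, ">=", minimum)
```

A tool that finds no such file would have to fall back on inspecting the code itself, which is the less reliable path described above.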

With thousands of distributions, CPAN needs to be structured to be useful. Authors often place their modules in the natural hierarchy of Perl module names (such as Apache::DBI or Lingua::EN::Inflect) according to purpose or domain, though this is not enforced.

CPAN module distributions usually have names in the form of CGI-Application-3.1 (where the :: used in the module's name has been replaced with a dash, and the version number has been appended to the name), but this is only a convention; many prominent distributions break the convention, especially those that contain multiple modules. Security restrictions prevent an uploaded file from ever being replaced by another file with the same name, so virtually all distribution names do include a version number.
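The naming convention can be expressed in a few lines. This Python sketch is only an illustration of the convention; the function name distribution_name is hypothetical and not part of any CPAN tool.

```python
def distribution_name(module, version):
    """Build a conventional distribution name from a module name and a version.

    Each "::" in the module name becomes a dash and the version is appended.
    """
    return module.replace("::", "-") + "-" + version


print(distribution_name("CGI::Application", "3.1"))  # CGI-Application-3.1
```

Because this is only a convention, the reverse mapping from a distribution name to the modules it contains cannot be derived reliably from the name alone.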

Components

The distribution infrastructure of CPAN consists of its worldwide network of more than 250 mirrors in more than 60 countries. [10] Each full mirror hosts around 31 gigabytes of data. [11]

Most mirrors update themselves hourly, daily or bidaily from the CPAN master site. [12] Some sites are major FTP servers which mirror lots of other software, but others are simply servers owned by companies that use Perl heavily. There are at least two mirrors on every continent except Antarctica.

Several search engines have been written to help Perl programmers sort through the CPAN. The official search.cpan.org included textual search, a browsable index of modules, and extracted copies of all distributions then on the CPAN. On 16 May 2018, the Perl Foundation announced that search.cpan.org would be shut down on 29 June 2018 (after 19 years of operation), due to its aging codebase and maintenance burden. Users were to be transitioned and redirected to the third-party alternative MetaCPAN. [13] [14]

CPAN Testers is a group of volunteers who download and test distributions as they are uploaded to CPAN. This lets authors have their modules tested on many platforms and environments that they would otherwise not have access to, which helps promote portability as well as a degree of quality. Smoke testers send reports, which are then collated and used for a variety of presentation websites, including the main reports site, statistics and dependencies.

Uploading distributions with PAUSE

Authors can upload new distributions to the CPAN through the Perl Authors Upload Server (PAUSE). To do so, they must request a PAUSE account.

Once registered, they may use a web interface at pause.perl.org, or an FTP interface, to upload files to their directory and delete them. Modules in the upload will only be indexed as canonical if the module name has not been used before (granting first-come permission to the uploader), or if the uploader has permission for that name, and if the module is a higher version than any existing entry. [15] Permissions can be managed through PAUSE's web interface.
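The indexing rule can be sketched as a small decision function. This Python example is an illustration of the rule as stated above, not PAUSE's actual implementation: the function names, the author IDs, and the data structures are hypothetical, and versions are compared as dotted integers, which is a simplification that does not match Perl's own version comparison rules.

```python
def parse_version(text):
    """Simplified version parsing: dotted integers only (an assumption)."""
    return tuple(int(part) for part in text.split("."))


def should_index(module, version, uploader, index, permissions):
    """Decide whether an uploaded module would be indexed as canonical.

    index maps module names to the currently indexed version string;
    permissions maps module names to the set of authors allowed to upload them.
    """
    if module not in permissions:
        # Name never used before: the uploader gets it on a first-come basis.
        return True
    if uploader not in permissions[module]:
        return False
    current = index.get(module)
    # The upload must be a higher version than any existing entry.
    return current is None or parse_version(version) > parse_version(current)


permissions = {"Foo::Bar": {"ALICE"}}
index = {"Foo::Bar": "1.2"}
print(should_index("Foo::Bar", "1.3", "ALICE", index, permissions))  # True
print(should_index("Foo::Bar", "1.3", "BOB", index, permissions))    # False
```

The function only reports the decision; recording the new first-come permission or the new indexed version is left out to keep the sketch short.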

CPAN.pm, CPANPLUS, and cpanminus

There is also a Perl core module named CPAN; it is usually differentiated from the repository itself by using the name CPAN.pm. CPAN.pm is mainly an interactive shell which can be used to search for, download, and install distributions. A command called cpan is also provided in the Perl core, and is the usual way of running CPAN.pm. After a short configuration process and mirror selection, it uses tools available on the user's computer to automatically download, unpack, compile, test, and install modules. It is also capable of updating itself.

An effort to replace CPAN.pm with something cleaner and more modern resulted in the CPANPLUS (or CPAN++) set of modules. CPANPLUS separates the back-end work of downloading, compiling, and installing modules from the interactive shell used to issue commands. It also supports several advanced features, such as cryptographic signature checking and test result reporting. Finally, CPANPLUS can uninstall a distribution. CPANPLUS was added to the Perl core in version 5.10.0, and removed from it in version 5.20.0.

A smaller, leaner alternative to these CPAN installers, called cpanminus, was later developed. cpanminus was designed to have a much smaller memory footprint, as is often required in limited-memory environments, and to be usable as a standalone script so that it can even install itself, requiring only the expected set of core Perl modules to be available. It is also available from CPAN as the module App::cpanminus, which installs the cpanm script. It does not maintain or rely on a persistent configuration, but is configured only by the environment and command-line options. cpanminus does not have an interactive shell component. It recognizes the cpanfile format for specifying prerequisites, which is useful in ad hoc Perl projects that may not be designed for CPAN installation. cpanminus can also uninstall distributions.

Each of these modules can check a distribution's dependencies and recursively install any prerequisites, either automatically or with individual user approval. Each supports FTP and HTTP and can work through firewalls and proxies.
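Recursive prerequisite installation amounts to a depth-first walk of the dependency graph, installing each prerequisite before the distribution that needs it. The Python sketch below shows that idea under stated assumptions: the distribution names and the PREREQS table are hypothetical, and the code is not how CPAN.pm, CPANPLUS, or cpanminus are actually implemented.

```python
def install_order(target, prereqs):
    """Return distributions in an order where prerequisites come first.

    prereqs maps each distribution name to the list of names it depends on.
    """
    order = []
    seen = set()

    def visit(dist):
        if dist in seen:
            # Already scheduled (this also stops loops on circular dependencies).
            return
        seen.add(dist)
        for dependency in prereqs.get(dist, []):
            visit(dependency)
        order.append(dist)

    visit(target)
    return order


# Hypothetical dependency table for illustration.
PREREQS = {
    "App-Example": ["Lib-A", "Lib-B"],
    "Lib-A": ["Lib-C"],
    "Lib-B": ["Lib-C"],
    "Lib-C": [],
}

print(install_order("App-Example", PREREQS))
# ['Lib-C', 'Lib-A', 'Lib-B', 'App-Example']
```

A real installer would also compare required versions against what is already installed and, depending on configuration, ask the user before each step.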

Influence

Experienced Perl programmers often comment that half of Perl's power is in the CPAN. It has been called Perl's killer app. [16] It is roughly equivalent to Composer for PHP; the PyPI (Python Package Index) repository for Python; RubyGems for Ruby; CRAN for R; npm for Node.js; LuaRocks for Lua; Maven for Java; and Hackage for Haskell. CPAN's use of arbitrated namespaces, a testing regime and a well-defined documentation style make it unique.

Given its importance to the Perl developer community, the CPAN both shapes and is shaped by Perl's culture. Its "self-appointed master librarian", Jarkko Hietaniemi, often takes part in the April Fools' Day jokes; on 1 April 2002 the site was temporarily renamed CJAN, where the "J" stood for "Java". In 2003, the www.cpan.org domain name was redirected to Matt's Script Archive, a site infamous in the Perl community for having badly written code. [17] [18] [19]

Some of the distributions on the CPAN are distributed as jokes. The Acme:: hierarchy is reserved for joke modules; for instance, Acme::Don't adds a don't function that doesn't run the code given to it (to complement the do built-in, which does). Even outside the Acme:: hierarchy, some modules are still written largely for amusement; one example is Lingua::Romana::Perligata, which can be used to write Perl programs in a subset of Latin.

In 2005, a group of Perl developers who also had an interest in JavaScript got together to create JSAN, the JavaScript Archive Network. The JSAN is a near-direct port of the CPAN infrastructure for use with the JavaScript language, which for most of its lifespan did not have a cohesive "community".

In 2008, after a chance meeting with CPAN admin Adam Kennedy at the Open Source Developers Conference, Linux kernel developer Rusty Russell created the CCAN, the Comprehensive C Archive Network. The CCAN is a direct port of the CPAN architecture for use with the C language.

CRAN, the Comprehensive R Archive Network, is a set of mirrors hosting the R programming language distribution(s), documentation, and contributed extensions. [20]

References

  1. "CPAN front page". Retrieved 27 January 2016.
  2. "How are Perl and the CPAN modules licensed?". Most, though not all, modules on CPAN are licensed under the GNU General Public License (GPL) or the Artistic license...
  3. "The Timeline of Perl and its Culture".
  4. "Grokking the CPAN" (PDF). I propose that we cooperate to create a unified structure, much like the CTAN project which has managed to create a collection of canonical sites for TeX
  5. "ExtUtils::MakeMaker - Create a module Makefile - Perldoc Browser". perldoc.perl.org. Retrieved 18 November 2020.
  6. "List::Util - A selection of general-utility list subroutines - Perldoc Browser". perldoc.perl.org. Retrieved 18 November 2020.
  7. "BackPAN". Retrieved 20 December 2019.
  8. "What is Gitpan?". GitHub. 2 December 2015. Retrieved 16 November 2016.
  9. "CPAN::Meta::History". Retrieved 20 December 2019.
  10. "CPAN Mirror Network". Retrieved 16 November 2016.
  11. "How to mirror CPAN". CPAN.org. Retrieved 15 November 2016.
  12. "CPAN Status and Statistics". Retrieved 9 May 2010.
  13. "The end of an era: Saying goodbye to search.cpan.org". log.perl.org. Retrieved 22 May 2018.
  14. "Saying goodbye to search.cpan.org". perl.com. Retrieved 26 June 2018.
  15. "PAUSE Operating Model". GitHub. Retrieved 20 December 2019.
  16. "Re: Killer Apps in PERL". Retrieved 24 February 2013.
  17. "Elements of Programming with Perl". 12 October 2000. Retrieved 25 April 2013.
  18. "Exploit this formmail.pl for fun and, well, fun". 7 August 2001. Retrieved 25 April 2013.
  19. "Matt's Script Archive Strikes Again!". 4 July 2001. Retrieved 25 April 2013.
  20. "What is CRAN?". Retrieved 20 December 2019.