Package development process

Last updated September 01, 2024

A software package development process is a system for developing software packages. Such packages are used to reuse and share code, e.g., via a software repository. A package development process includes a formal system for package checking that usually exposes bugs, thereby potentially making it easier to produce trustworthy software (Chambers' prime directive).^[1] It may also include a standard for documentation, thereby making it easier for new users to learn how to use it.

Discussion

In this context, a package is a collection of functions written for use in a single language such as Python or R. It may also be bundled with documentation. For many programming languages, there are software repositories where people share such packages.

For example, a Python package combines documentation, code and initial set up and possibly examples that could be used as unit tests in a single file with a "py" extension.

By contrast, an R package has documentation with examples in files separate from the code, possibly bundled with other material such as sample data sets and introductory vignettes. The source code for an R package is contained in a directory with a master "description" file and separate subdirectories for documentation, code, or data sets suitable for unit or regression testing.^[2] A formal package compilation process ^[3]^[4] checks for errors of various types. This includes checking for syntax errors on both the documentation markup language and the code, as well as comparing the arguments between documentation and code. Examples in the documentation are tested and produce error messages if they fail. This can be used as a primitive form of unit testing; more formal unit tests and regression testing can be included. This can improve software development productivity by making it easier to find bugs as the code is being developed. In addition, the documentation makes it easier to share code with others. It also makes it easier for a developer to use code written months or even years earlier. Routine checks are made of packages contributed to the Comprehensive R Archive Network (CRAN) and under development in the companion open-source collaborative development web site, R-Forge. These checks compile the packages repeatedly on different platforms under different versions of the core R language. The results are made available to package maintainers. In this way, package contributors become aware of problems they might otherwise never encounter themselves, because they otherwise would not have easy access to those alternative test results.

An interesting research question would be to compare the quality of contributions to different software repositories and try to relate that to features of the language and accompanying package development process. This could include trying to compare the rate of growth of contributed software to the degree of formality and enforcement of standards for documentation, testing, and coding.

Related Research Articles

Computer programming or coding is the composition of sequences of instructions, called programs, that computers can follow to perform tasks. It involves designing and implementing algorithms, step-by-step specifications of procedures, by writing code in one or more programming languages. Programmers typically use high-level programming languages that are more easily intelligible to humans than machine code, which is directly executed by the central processing unit. Proficient programming usually requires expertise in several different subjects, including knowledge of the application domain, details of programming languages and generic code libraries, specialized algorithms, and formal logic.

<span class="mw-page-title-main">Software testing</span> Checking software against a standard

Software testing is the act of checking whether software satisfies expectations.

Design by contract (DbC), also known as contract programming, programming by contract and design-by-contract programming, is an approach for designing software.

Unit testing, a.k.a. component or module testing, is a form of software testing by which isolated source code is tested to validate expected behavior.

A programming tool or software development tool is a computer program that software developers use to create, debug, maintain, or otherwise support other programs and applications. The term usually refers to relatively simple programs, that can be combined to accomplish a task, much as one might use multiple hands to fix a physical object. The most basic tools are a source code editor and a compiler or interpreter, which are used ubiquitously and continuously. Other tools are used more or less depending on the language, development methodology, and individual engineer, often used for a discrete task, like a debugger or profiler. Tools may be discrete programs, executed separately – often from the command line – or may be parts of a single large program, called an integrated development environment (IDE). In many cases, particularly for simpler use, simple ad hoc techniques are used instead of a tool, such as print debugging instead of using a debugger, manual timing instead of a profiler, or tracking bugs in a text file or spreadsheet instead of a bug tracking system.

Maven is a build automation tool used primarily for Java projects. Maven can also be used to build and manage projects written in C#, Ruby, Scala, and other languages. The Maven project is hosted by The Apache Software Foundation, where it was formerly part of the Jakarta Project.

eric is a free integrated development environment (IDE) used for computer programming. Since it is a full featured IDE, it provides by default all necessary tools needed for the writing of code and for the professional management of a software project.

The following tables provide a comparison of numerical analysis software.

A software repository, or repo for short, is a storage location for software packages. Often a table of contents is also stored, along with metadata. A software repository is typically managed by source or version control, or repository managers. Package managers allow automatically installing and updating repositories, sometimes called "packages".

The Wing Python IDE is a family of integrated development environments (IDEs) from Wingware created specifically for the Python programming language with support for editing, testing, debugging, inspecting/browsing, and error-checking Python code.

The Python Package Index, abbreviated as PyPI and also known as the Cheese Shop, is the official third-party software repository for Python. It is analogous to the CPAN repository for Perl and to the CRAN repository for R. PyPI is run by the Python Software Foundation, a charity. Some package managers, including pip, use PyPI as the default source for packages and their dependencies.

Software construction is a software engineering discipline. It is the detailed creation of working meaningful software through a combination of coding, verification, unit testing, integration testing, and debugging. It is linked to all the other software engineering disciplines, most strongly to software design and software testing.

<span class="mw-page-title-main">Knitr</span> Report generation engine with R

knitr is a software engine for dynamic report generation with R. It is a package in the programming language R that enables integration of R code into LaTeX, LyX, HTML, Markdown, AsciiDoc, and reStructuredText documents. The purpose of knitr is to allow reproducible research in R through the means of literate programming. It is licensed under the GNU General Public License.

mlpack is a free, open-source and header-only software library for machine learning and artificial intelligence written in C++, built on top of the Armadillo library and the ensmallen numerical optimization library. mlpack has an emphasis on scalability, speed, and ease-of-use. Its aim is to make machine learning possible for novice users by means of a simple, consistent API, while simultaneously exploiting C++ language features to provide maximum performance and maximum flexibility for expert users. mlpack has also a light deployment infrastructure with minimum dependencies, making it perfect for embedded systems and low resource devices. Its intended target users are scientists and engineers.

Anaconda is a distribution of the Python and R programming languages for scientific computing, that aims to simplify package management and deployment. The distribution includes data-science packages suitable for Windows, Linux, and macOS. It is developed and maintained by Anaconda, Inc., which was founded by Peter Wang and Travis Oliphant in 2012. As an Anaconda, Inc. product, it is also known as Anaconda Distribution or Anaconda Individual Edition, while other products from the company are Anaconda Team Edition and Anaconda Enterprise Edition, neither of which is free.

This article discusses a set of tactics useful in software testing. It is intended as a comprehensive list of tactical approaches to software quality assurance and general application of the test method.

R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN. The large number of packages available for R, and the ease of installing and using them, has been cited as a major factor driving the widespread adoption of the language in data science.

References

↑ Chambers, John M. (2008). Software for Data Analysis: Programming with R. Springer. ISBN 978-0-387-75935-7.
↑ Writing R Extensions.
↑ Leisch, Friedrich. "Creating R Packages: A Tutorial" (PDF).
↑ Graves, Spencer B.; Dorai-Raj, Sundar. "Creating R Packages, Using CRAN, R-Forge, And Local R Archive Networks And Subversion (SVN) Repositories" (PDF).

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Chambers, John M. (2008). Software for Data Analysis: Programming with R. Springer. ISBN 978-0-387-75935-7.

[2] Writing R Extensions.

[3] Leisch, Friedrich. "Creating R Packages: A Tutorial" (PDF).

[4] Graves, Spencer B.; Dorai-Raj, Sundar. "Creating R Packages, Using CRAN, R-Forge, And Local R Archive Networks And Subversion (SVN) Repositories" (PDF).

[1]

[2]

[3]

[4]

Package development process

Contents

Discussion

See also

Related Research Articles

References