Package development process

Last updated

A software package development process is a system for developing software packages. Packages are used to reuse and share code, e.g., via a software repository, a formal system for package checking that usually expose bugs, thereby potentially making it easier to produce trustworthy software (Chambers' prime directive). [1]

Contents

Discussion

In this context, a package is a collection of functions written for use in a single language such as Python or R, bundled with documentation. For many programming languages, there are software repositories where people share such packages.

For example, a Python package combines documentation, code and initial set up and possibly examples that could be used as unit tests in a single file with a "py" extension.

By contrast, an R package has documentation with examples in files separate from the code, possibly bundled with other material such as sample data sets and introductory vignettes. The source code for an R package is contained in a directory with a master "description" file and separate subdirectories for documentation, code, optional data sets suits for unit or regression testing, and perhaps others. [2] A formal package compilation process [3] [4] checks for errors of various types. This includes checking for syntax errors on both the documentation markup language and the code as well as comparing the arguments between documentation and code. Examples in the documentation are tested and produce error messages if they fail. This can be used as a primitive form of unit testing; more formal unit tests and regression testing can be included. This can improve software development productivity by making it easier to find bugs as the code is being developed. In addition, the documentation makes it easier to share code with others. It also makes it easier for a developer to use code written months or even years earlier. Routine checks are made of packages contributed to the Comprehensive R Archive Network (CRAN) and under development in the companion open-source collaborative development web site, R-Forge. These checks compile the packages repeatedly on different platforms under different versions of the core R language. The results are made available to package maintainers. In this way, package contributors become aware of problems they might otherwise never encounter themselves, because they otherwise would not have easy access to those alternative test results.

An interesting research question would be to compare the quality of contributions to different software repositories and try to relate that to features of the language and accompanying package development process. This could include trying to compare the rate of growth of contributed software to the degree of formality and enforcement of standards for documentation, testing, and coding.

See also

Related Research Articles

The Comprehensive Perl Archive Network (CPAN) is a repository of over 250,000 software modules and accompanying documentation for 39,000 distributions, written in the Perl programming language by over 12,000 contributors. CPAN can denote either the archive network or the Perl program that acts as an interface to the network and as an automated software installer. Most software on CPAN is free and open source software.

Software testing is the act of examining the artifacts and the behavior of the software under test by validation and verification. Software testing can also provide an objective, independent view of the software to allow the business to appreciate and understand the risks of software implementation. Test techniques include, but are not necessarily limited to:

In software engineering, version control is a class of systems responsible for managing changes to computer programs, documents, large web sites, or other collections of information. Version control is a component of software configuration management.

In computer programming, unit testing is a software testing method by which individual units of source code—sets of one or more computer program modules together with associated control data, usage procedures, and operating procedures—are tested to determine whether they are fit for use. It is a standard step in development and implementation approaches such as Agile.

A programming tool or software development tool is a computer program that software developers use to create, debug, maintain, or otherwise support other programs and applications. The term usually refers to relatively simple programs, that can be combined to accomplish a task, much as one might use multiple hands to fix a physical object. The most basic tools are a source code editor and a compiler or interpreter, which are used ubiquitously and continuously. Other tools are used more or less depending on the language, development methodology, and individual engineer, often used for a discrete task, like a debugger or profiler. Tools may be discrete programs, executed separately – often from the command line – or may be parts of a single large program, called an integrated development environment (IDE). In many cases, particularly for simpler use, simple ad hoc techniques are used instead of a tool, such as print debugging instead of using a debugger, manual timing instead of a profiler, or tracking bugs in a text file or spreadsheet instead of a bug tracking system.

<span class="mw-page-title-main">R (programming language)</span> Programming language for statistics

R is a programming language for statistical computing and graphics. Created by statisticians Ross Ihaka and Robert Gentleman, R is used for data mining, bioinformatics, and data analysis.

In software testing, test automation is the use of software separate from the software being tested to control the execution of tests and the comparison of actual outcomes with predicted outcomes. Test automation can automate some repetitive but necessary tasks in a formalized testing process already in place, or perform additional testing that would be difficult to do manually. Test automation is critical for continuous delivery and continuous testing.

<span class="mw-page-title-main">Continuous integration</span> Software development practice based on frequent submission of granular changes

In software engineering, continuous integration (CI) is the practice of merging all developers' working copies to a shared mainline several times a day. Nowadays it is typically implemented in such a way that it triggers an automated build with testing. Grady Booch first proposed the term CI in his 1991 method, although he did not advocate integrating several times a day. Extreme programming (XP) adopted the concept of CI and did advocate integrating more than once per day – perhaps as many as tens of times per day.

eric (software)

eric is a free integrated development environment (IDE) used for computer programming. Since it is a full featured IDE, it provides by default all necessary tools needed for the writing of code and for the professional management of a software project.

The following tables compare general and technical information for a number of statistical analysis packages.

The following tables provide a comparison of numerical analysis software.

A software repository, or repo for short, is a storage location for software packages. Often a table of contents is also stored, along with metadata. A software repository is typically managed by source or version control, or repository managers. Package managers allow automatically installing and updating repositories, sometimes called "packages".


The Wing Python IDE is a family of integrated development environments (IDEs) from Wingware created specifically for the Python programming language, with support for editing, testing, debugging, inspecting/browsing, and error checking Python code.

<span class="mw-page-title-main">Go (programming language)</span> Programming language

Go is a statically typed, compiled high-level programming language designed at Google by Robert Griesemer, Rob Pike, and Ken Thompson. It is syntactically similar to C, but also has memory safety, garbage collection, structural typing, and CSP-style concurrency. It is often referred to as Golang because of its former domain name, golang.org, but its proper name is Go.

<span class="mw-page-title-main">Sublime Text</span> Text editor

Sublime Text is a shareware text and source code editor available for Windows, macOS, and Linux. It natively supports many programming languages and markup languages. Users can customize it with themes and expand its functionality with plugins, typically community-built and maintained under free-software licenses. To facilitate plugins, Sublime Text features a Python API. The editor utilizes minimal interface and contains features for programmers including configurable syntax highlighting, code folding, search-and-replace supporting regular-expressions, terminal output window, and more. It is proprietary software, but a free evaluation version is available.

Software construction is a software engineering discipline. It is the detailed creation of working meaningful software through a combination of coding, verification, unit testing, integration testing, and debugging. It is linked to all the other software engineering disciplines, most strongly to software design and software testing.

<span class="mw-page-title-main">Anaconda (Python distribution)</span> Distribution of the Python and R languages for scientific computing

Anaconda is a distribution of the Python and R programming languages for scientific computing, that aims to simplify package management and deployment. The distribution includes data-science packages suitable for Windows, Linux, and macOS. It is developed and maintained by Anaconda, Inc., which was founded by Peter Wang and Travis Oliphant in 2012. As an Anaconda, Inc. product, it is also known as Anaconda Distribution or Anaconda Individual Edition, while other products from the company are Anaconda Team Edition and Anaconda Enterprise Edition, neither of which are free.

This article discusses a set of tactics useful in software testing. It is intended as a comprehensive list of tactical approaches to Software Quality Assurance (more widely colloquially known as Quality Assurance and general application of the test method.

References

  1. Chambers, John M. (2008). Software for Data Analysis: Programming with R. Springer. ISBN   978-0-387-75935-7.
  2. Writing R Extensions.
  3. Leisch, Friedrich. "Creating R Packages: A Tutorial" (PDF).
  4. Graves, Spencer B.; Dorai-Raj, Sundar. "Creating R Packages, Using CRAN, R-Forge, And Local R Archive Networks And Subversion (SVN) Repositories" (PDF).