Knitr

Last updated
knitr
Original author(s) Yihui Xie
Initial release17 January 2012 (2012-01-17)
Stable release
1.43 / 26 May 2023;7 months ago (2023-05-26)
Repository
Written in R
Type Cross-platform
License GNU GPL
Website yihui.org/knitr/

knitr is an engine for dynamic report generation with R. [1] [2] It is a package in the programming language R that enables integration of R code into LaTeX, LyX, HTML, Markdown, AsciiDoc, and reStructuredText documents. The purpose of knitr is to allow reproducible research in R through the means of literate programming. It is licensed under the GNU General Public License. [3]

Contents

knitr was inspired by Sweave and written with a different design for better modularization, so it is easier to maintain and extend. Sweave can be regarded as a subset of knitr in the sense that all features of Sweave are also available in knitr. Some of knitr's extensions include the R Markdown format [4] (used in reports published on RPubs [5] ), caching, TikZ graphics and support to other languages such as Python, Perl, C++, Shell scripts and CoffeeScript, and so on.

knitr is officially supported in the RStudio IDE for R, LyX, Emacs/ESS and the Architect IDE for data science.

Workflow of knitr

Knitr consists of standard e.g. Markdown document with R-code chunks integrated in the document. The code chunks can be regarded as R-scripts that

The implementation of logical conditions in R can provide text elements for the dynamic report depended on the statistical analysis. For example:

   The Wilcoxon Sign test was applied as statistical comparison of the average of two dependent samples above.     In this case, the calculated P-value was 0.56 and hence greater than the significance level (0.05 by default).    This implies that "H0: there is no difference between the results in data1 and data2" cannot be rejected. 

The text fragments are selected according to the script's results. In this example, if the P-value was lower than the significance level, different text fragments would be inserted in the dynamic report. In particular, the second sentence would swap "less" for "greater," and the third sentence would be replaced to reflect rejection of the null hypothesis. Using this workflow allows creating new reports simply by supplying new input data, ensuring the methodology is reproduced identically.

See also

Related Research Articles

<span class="mw-page-title-main">Literate programming</span> A programming approach of software development

Literate programming is a programming paradigm introduced in 1984 by Donald Knuth in which a computer program is given as an explanation of how it works in a natural language, such as English, interspersed (embedded) with snippets of macros and traditional source code, from which compilable source code can be generated. The approach is used in scientific computing and in data science routinely for reproducible research and open access purposes. Literate programming tools are used by millions of programmers today.

In computer science, a preprocessor is a program that processes its input data to produce output that is used as input in another program. The output is said to be a preprocessed form of the input data, which is often used by some subsequent programs like compilers. The amount and kind of processing done depends on the nature of the preprocessor; some preprocessors are only capable of performing relatively simple textual substitutions and macro expansions, while others have the power of full-fledged programming languages.

A lightweight markup language (LML), also termed a simple or humane markup language, is a markup language with simple, unobtrusive syntax. It is designed to be easy to write using any generic text editor and easy to read in its raw form. Lightweight markup languages are used in applications where it may be necessary to read the raw document as well as the final rendered output.

<span class="mw-page-title-main">Markdown</span> Plain text markup language

Markdown is a lightweight markup language for creating formatted text using a plain-text editor. John Gruber and Aaron Swartz created Markdown in 2004 as a markup language that is intended to be easy to read in its source code form. Markdown is widely used for blogging and instant messaging, and also used elsewhere in online forums, collaborative software, documentation pages, and readme files.

<span class="mw-page-title-main">Snippet (programming)</span> Small region of re-usable source code, machine code, or text

Snippet is a programming term for a small region of re-usable source code, machine code, or text. Ordinarily, these are formally defined operative units to incorporate into larger programming modules. Snippet management is a feature of some text editors, program source code editors, IDEs, and related software. It allows the user to avoid repetitive typing in the course of routine edit operations.

<span class="mw-page-title-main">Web template system</span> System in web publishing

A web template system in web publishing allows web designers and developers work with web templates to automatically generate custom web pages, such as the results from a search. This reuses static web page elements while defining dynamic elements based on web request parameters. Web templates support static content, providing basic structure and appearance. Developers can implement templates from content management systems, web application frameworks, and HTML editors.

<span class="mw-page-title-main">Template processor</span> Software designed to combine templates with a data model to produce result documents

A template processor is software designed to combine templates with a data model to produce result documents. The language that the templates are written in is known as a template language or templating language. For purposes of this article, a result document is any kind of formatted output, including documents, web pages, or source code, either in whole or in fragments. A template engine is ordinarily included as a part of a web template system or application framework, and may be used also as a preprocessor or filter.

TypeScript is a free and open-source high-level programming language developed by Microsoft that adds static typing with optional type annotations to JavaScript. It is designed for the development of large applications and transpiles to JavaScript. Because TypeScript is a superset of JavaScript, all JavaScript programs are syntactically valid TypeScript, but they can fail to type-check for safety reasons.

Sweave is a function in the statistical programming language R that enables integration of R code into LaTeX or LyX documents. The purpose is "to create dynamic reports, which can be updated automatically if data or analysis change".

<span class="mw-page-title-main">Org-mode</span> Open source mode for GNU Emacs

Org Mode is a mode for document editing, formatting, and organizing within the free software text editor GNU Emacs and its derivatives, designed for notes, planning, and authoring. The name is used to encompass plain text files that include simple marks to indicate levels of a hierarchy, and an editor with functions that can read the markup and manipulate hierarchy elements.

Ace is a standalone code editor written in JavaScript. The goal is to create a web-based code editor that matches and extends the features, usability, and performance of existing native editors such as TextMate, Vim, or Eclipse. It can be easily embedded in any web page and JavaScript application. Ace is developed as the primary editor for Cloud9 IDE and as the successor of the Mozilla Skywriter project.

The following outline is provided as an overview of and topical guide to the Perl programming language:

<span class="mw-page-title-main">RStudio</span> Integrated development environment for R

RStudio is an integrated development environment for R, a programming language for statistical computing and graphics. It is available in two formats: RStudio Desktop is a regular desktop application while RStudio Server runs on a remote server and allows accessing RStudio using a web browser. The RStudio IDE is a product of Posit PBC.

<span class="mw-page-title-main">Julia (programming language)</span> Dynamic programming language

Julia is a high-level, general-purpose dynamic programming language, most commonly used for numerical analysis and computational science. Distinctive aspects of Julia's design include a type system with parametric polymorphism and the use of multiple dispatch as a core programming paradigm, efficient garbage collection, and a just-in-time (JIT) compiler.

<span class="mw-page-title-main">Atom (text editor)</span> Free and open-source text and source code editor

Atom was a free and open-source text and source code editor for macOS, Linux, and Windows with support for plug-ins written in JavaScript, and embedded Git Control. Developed by GitHub, Atom was released on June 25, 2015.

<span class="mw-page-title-main">KDE Gear</span> Set of applications and supporting libraries

The KDE Gear is a set of applications and supporting libraries that are developed by the KDE community, primarily used on Linux-based operating systems but mostly multiplatform, and released on a common release schedule.

Yihui Xie is a Chinese statistician, data scientist and software engineer who formerly worked for RStudio. He is the principal author of the open-source software package Knitr for data analysis in the R programming language, and has also written the book Dynamic Documents with R and knitr.

<span class="mw-page-title-main">Notebook interface</span> Programming tool blending code and documents

A notebook interface or computational notebook is a virtual notebook environment used for literate programming, a method of writing computer programs. Some notebooks are WYSIWYG environments including executable calculations embedded in formatted documents; others separate calculations and text into separate sections. Notebooks share some goals and features with spreadsheets and word processors but go beyond their limited data models.

R packages are extensions to the R statistical programming language. R packages contain code, data, and documentation in a standardised collection format that can be installed by users of R, typically via a centralised software repository such as CRAN. The large number of packages available for R, and the ease of installing and using them, has been cited as a major factor driving the widespread adoption of the language in data science.

References

  1. Xie, Yihui (2015). Dynamic Documents with R and knitr, 2nd Edition. Chapman & Hall/CRC. ISBN   9781498716963.
  2. Xie, Yihui. "knitr: A General-Purpose Tool for Dynamic Report Generation in R" (PDF).
  3. "Knitr: A General-Purpose Package for Dynamic Report Generation in R". 29 September 2021.
  4. RStudio, Inc. "R Markdown — Dynamic Documents for R".
  5. RStudio, Inc. "Easy web publishing from R".