Sweave

Sweave is a function in the statistical programming language R that enables integration of R code into LaTeX or LyX documents. The purpose is "to create dynamic reports, which can be updated automatically if data or analysis change". [1]

The data analysis is performed when the report is compiled rather than beforehand: the Sweave source is first processed by Sweave (i.e., essentially by R), and the resulting file is then compiled with LaTeX. This makes it easier for the author to keep reports up to date.
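
As an illustration of the two-step compilation described above, a minimal Sweave source file might look like the following sketch. The file name `report.Rnw`, the chunk label `summary`, and the data values are hypothetical and chosen only for this example.

```latex
% report.Rnw (hypothetical file name): LaTeX text with embedded R code chunks
\documentclass{article}
\begin{document}

% A code chunk starts with <<label, options>>= and ends with @
<<summary, echo=TRUE>>=
x <- c(2, 4, 6, 8)   # made-up values standing in for a real data set
mean(x)
@

% \Sexpr{} inserts the value of an R expression into the running text
The sample mean is \Sexpr{mean(x)}.

\end{document}
```

Assuming this file is in R's working directory, calling `Sweave("report.Rnw")` in R evaluates the chunk and writes `report.tex`, which can then be compiled with LaTeX, for example via `tools::texi2pdf("report.tex")`. If the values in `x` change, rerunning both steps updates the reported mean without any manual editing of the text.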

The Sweave files, together with any external R files sourced from them and the data files, contain all the information needed to trace every step of the data analysis. Sweave therefore has the potential to make research more transparent and reproducible to others. [2] This holds only to the extent that the author makes the data and the R and Sweave code available. If the author publishes only the resulting PDF document or printed versions of it, a report created using Sweave is no more transparent or reproducible than the same report created with other statistical and text preparation software.

References

  1. Leisch, Friedrich (2002). "Sweave, Part I: Mixing R and LaTeX: A short introduction to the Sweave file format and corresponding R functions" (PDF). R News. 2 (3): 28–31. Retrieved 22 January 2012.
  2. Pineda-Krch, Mario (17 January 2011). "The Joy of Sweave – A Beginner's Guide to Reproducible Research with Sweave" (PDF). Retrieved 22 January 2012.