| Tidyverse | |
|---|---|
| Initial release | September 15, 2016 [1] [2] |
| Stable release | |
| Repository | github |
| Written in | R |
| Type | Package collection |
| License | MIT |
| Website | www |
The tidyverse is a collection of open-source packages for the R programming language, introduced by Hadley Wickham [4] and his team, that "share an underlying design philosophy, grammar, and data structures" of tidy data. [5] Characteristic features of tidyverse packages include extensive use of non-standard evaluation and the encouragement of piping. [6] [7] [8]
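The piping style mentioned above can be illustrated with a minimal sketch. It is written in plain Python as a stand-in and is not tidyverse code: the `pipe` helper, the step functions, and the toy data are all hypothetical names invented for this illustration. In R, the same left-to-right chaining is written with the `%>%` operator from magrittr or the native `|>` operator, and the three steps are only loosely analogous to the dplyr verbs `filter()`, `mutate()`, and `summarise()`.

```python
from functools import reduce


def pipe(value, *steps):
    """Pass `value` through each step in order, left to right."""
    return reduce(lambda acc, step: step(acc), steps, value)


# Hypothetical toy data: one dict per observation, one key per variable.
rows = [
    {"species": "adelie", "mass_g": 3700},
    {"species": "adelie", "mass_g": 3800},
    {"species": "gentoo", "mass_g": 5000},
    {"species": "gentoo", "mass_g": 5200},
]


def keep_heavier_than(threshold):
    """Return a step that keeps rows whose mass exceeds `threshold` grams."""
    return lambda data: [r for r in data if r["mass_g"] > threshold]


def add_mass_kg(data):
    """Return new rows with an extra `mass_kg` variable."""
    return [{**r, "mass_kg": r["mass_g"] / 1000} for r in data]


def mean_mass_by_species(data):
    """Collapse the rows to one mean `mass_kg` value per species."""
    groups = {}
    for r in data:
        groups.setdefault(r["species"], []).append(r["mass_kg"])
    return {species: sum(v) / len(v) for species, v in groups.items()}


# Each step receives the output of the previous one, so the code reads
# in the same order in which the data is processed.
result = pipe(rows, keep_heavier_than(3750), add_mass_kg, mean_mass_by_species)
print(result)
```

Without the helper, the same computation would be written inside-out as `mean_mass_by_species(add_mass_kg(keep_heavier_than(3750)(rows)))`. Avoiding that kind of nesting is what piping is meant to achieve.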
As of November 2018, the tidyverse package and some of its individual packages comprised 5 of the top 10 most downloaded R packages. [9] The tidyverse is the subject of multiple books and papers. [10] [11] [12] [13] In 2019, a paper describing the ecosystem was published in the Journal of Open Source Software. [14]
Its syntax has been referred to as "supremely readable", [15] and some have argued that the tidyverse is an effective way to introduce complete beginners to programming, because it allows students to begin doing data processing tasks quickly. [16] [17] Moreover, some practitioners have pointed out that data processing tasks are intuitively easier to chain together with the tidyverse than with pandas, Python's equivalent data processing package. [18] There is also an active R community around the tidyverse. For example, the TidyTuesday social data project, organised by the Data Science Learning Community (DSLC), [19] releases varied real-world datasets each week so that community members can participate, share, practise, and learn to work with data more easily. [20] Critics of the tidyverse have argued that it promotes tools that are harder to teach and learn than their built-in base R equivalents and that are too dissimilar to some programming languages. [21] [22]
More generally, the tidyverse principles encourage a universe of streamlined packages, which in principle helps alleviate dependency issues and maintain compatibility with current and future features. [23] An example of such a tidyverse-principled approach is the pharmaverse, a collection of R packages for clinical reporting in the pharmaceutical industry. [24]
The core tidyverse packages, which provide functionality to model, transform, and visualize data, include: [25]
Additional packages assist the core collection. [26] Other packages based on the tidy data principles are regularly developed, such as tidytext [27] for text analysis, tidymodels [28] for machine learning, or tidyquant [29] for financial operations.