Statistical thinking

Statistical thinking is a tool for analyzing processes, describing phenomena in relatively simple terms while also conveying the level of uncertainty surrounding them. [1] It is worth noting that "statistical thinking" is not the same as "quantitative literacy", although the two overlap in interpreting numbers and data visualizations. [2]

Statistical thinking relates processes and statistics, and is based on the following principles: all work occurs in a system of interconnected processes; variation exists in all processes; and understanding and reducing variation are keys to success.

History

"The great body of physical science ... [is] only accessible and only thinkable to those who have had a sound training in mathematical analysis, and the time may not be very remote when it will be understood that for complete initiation as an efficient citizen ... it is necessary to be able to compute, to think in averages and maxima and minima, as it is now to be able to read and write."

H.G. Wells, Mankind in the Making

W. Edwards Deming promoted the concepts of statistical thinking, using two powerful experiments:

1. The Red Bead experiment, [3] in which workers are tasked with running a more or less random procedure, yet the lowest "performing" workers are fired. The experiment demonstrates how the natural variability in a process can dwarf the contribution of individual workers' talent.

2. The Funnel experiment, which demonstrates that adjusting a stable process in reaction to its natural variability ("tampering") increases variation rather than reducing it.

The take-home message from the experiments is that before management adjusts a process—such as by firing seemingly underperforming employees, or by making physical changes to an apparatus—they should consider all sources of variation in the process that led to the performance outcome.
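A minimal Python sketch can illustrate the lesson of the Red Bead experiment; the numbers and worker names below are hypothetical, not Deming's original protocol. Each worker draws the same number of beads from the same lot, so any differences in their "defect counts" are due entirely to chance:

```python
import random

random.seed(1)

# Hypothetical setup: a lot of 4,000 beads, 20% of them red ("defects").
# Six workers each scoop 50 beads per day; skill plays no role in the outcome.
LOT = [1] * 800 + [0] * 3200          # 1 = red bead (defect), 0 = white bead
WORKERS = ["Ann", "Ben", "Cat", "Dan", "Eve", "Fay"]

def daily_defects():
    """Red beads in one worker's scoop of 50 beads, determined purely by chance."""
    return sum(random.sample(LOT, 50))

for day in range(1, 5):
    counts = {worker: daily_defects() for worker in WORKERS}
    worst = max(counts, key=counts.get)
    print(f"Day {day}: {counts} -> 'worst performer': {worst}")

# The "worst performer" changes from day to day even though all workers follow
# the same procedure: the variation belongs to the process, not the people.
```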

Nigel Marriott breaks down the evolution of statistical thinking. [4]

Benchmarks

Statistical thinking is thought to help in different contexts, such as the courtroom, [5] biology labs, [6] and children growing up surrounded by data. [2]

The American Statistical Association (ASA) has laid out what it means to be "statistically educated", including a set of concepts it recommends students understand. [2]

Statistical thinking is a recognized method used as part of Six Sigma methodologies.

Related Research Articles

Analysis of variance (ANOVA) is a collection of statistical models and their associated estimation procedures used to analyze the differences among means. ANOVA was developed by the statistician Ronald Fisher. It is based on the law of total variance, in which the observed variance in a particular variable is partitioned into components attributable to different sources of variation. In its simplest form, ANOVA provides a statistical test of whether two or more population means are equal, thereby generalizing the t-test beyond two means.
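For illustration only, a one-way ANOVA on made-up data can be run with SciPy's f_oneway function, which returns the F statistic and the p-value for the null hypothesis that all group means are equal:

```python
from scipy.stats import f_oneway

# Made-up measurements for three groups (for example, three treatments).
group_a = [23.1, 24.5, 22.8, 25.0, 23.9]
group_b = [26.4, 27.0, 25.8, 26.9, 27.3]
group_c = [23.5, 24.0, 24.8, 23.2, 24.4]

f_stat, p_value = f_oneway(group_a, group_b, group_c)
print(f"F = {f_stat:.2f}, p = {p_value:.4f}")
# A small p-value suggests that at least one group mean differs from the others.
```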

Biostatistics is a branch of statistics that applies statistical methods to a wide range of topics in biology. It encompasses the design of biological experiments, the collection and analysis of data from those experiments and the interpretation of the results.

Statistics – Study of the collection, analysis, interpretation, and presentation of data

Statistics is the discipline that concerns the collection, organization, analysis, interpretation, and presentation of data. In applying statistics to a scientific, industrial, or social problem, it is conventional to begin with a statistical population or a statistical model to be studied. Populations can be diverse groups of people or objects such as "all people living in a country" or "every atom composing a crystal". Statistics deals with every aspect of data, including the planning of data collection in terms of the design of surveys and experiments.

Statistical inference – Process of using data analysis to infer properties of an underlying distribution of probability

Statistical inference is the process of using data analysis to infer properties of an underlying distribution of probability. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.
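As a sketch of the idea (using simulated data rather than a real study), a population mean can be inferred from a sample by pairing a point estimate with an approximate 95% confidence interval:

```python
import math
import random

random.seed(7)

# Simulated sample of 40 observations from a population whose mean is unknown
# to the analyst.
sample = [random.gauss(100, 15) for _ in range(40)]

n = len(sample)
mean = sum(sample) / n
sd = math.sqrt(sum((x - mean) ** 2 for x in sample) / (n - 1))   # sample standard deviation
se = sd / math.sqrt(n)                                           # standard error of the mean

# Approximate 95% confidence interval using the normal critical value 1.96.
low, high = mean - 1.96 * se, mean + 1.96 * se
print(f"estimate = {mean:.1f}, 95% CI = ({low:.1f}, {high:.1f})")
```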

Experiment – Scientific procedure performed to validate a hypothesis

An experiment is a procedure carried out to support or refute a hypothesis, or determine the efficacy or likelihood of something previously untried. Experiments provide insight into cause-and-effect by demonstrating what outcome occurs when a particular factor is manipulated. Experiments vary greatly in goal and scale but always rely on repeatable procedure and logical analysis of the results. There also exist natural experimental studies.

W. Edwards Deming – American engineer and statistician (1900–1993)

William Edwards Deming was an American business theorist, composer, economist, industrial engineer, management consultant, statistician, and writer. Educated initially as an electrical engineer and later specializing in mathematical physics, he helped develop the sampling techniques still used by the United States Census Bureau and the Bureau of Labor Statistics. He is also known as the father of the quality movement and was hugely influential in post-WWII Japan, credited with revolutionizing Japan's industry and making it one of the most dominant economies in the world. He is best known for his theories of management.

Sampling (statistics) – Selection of data points in statistics

In statistics, quality assurance, and survey methodology, sampling is the selection of a subset or a statistical sample of individuals from within a statistical population to estimate characteristics of the whole population. Statisticians attempt to collect samples that are representative of the population. Sampling has lower costs and faster data collection compared to recording data from the entire population, and thus, it can provide insights in cases where it is infeasible to measure an entire population.
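A small sketch with a synthetic population (the figures are invented) shows why sampling is attractive: a simple random sample of a few hundred units can estimate a characteristic of a million-unit population without a full census:

```python
import random

random.seed(3)

# Synthetic population: one million household incomes (arbitrary units).
population = [random.lognormvariate(10, 0.5) for _ in range(1_000_000)]

# Simple random sample of 500 units.
sample = random.sample(population, 500)

pop_mean = sum(population) / len(population)
sample_mean = sum(sample) / len(sample)
print(f"population mean = {pop_mean:,.0f}, sample estimate = {sample_mean:,.0f}")
# The estimate is close to the true value while measuring only 0.05% of the units.
```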

Walter A. Shewhart – American statistician

Walter Andrew Shewhart was an American physicist, engineer and statistician, sometimes known as the father of statistical quality control and also known for the Shewhart cycle.

Common and special causes are the two distinct origins of variation in a process, as defined in the statistical thinking and methods of Walter A. Shewhart and W. Edwards Deming. Briefly, "common causes", also called natural patterns, are the usual, historical, quantifiable variation in a system, while "special causes" are unusual, not previously observed, non-quantifiable variation.

Control chart – Process control tool to determine if a manufacturing process is in a state of control

Control charts are graphical plots used in production control to determine whether quality and manufacturing processes are operating under stable conditions. Measurements are plotted over time, and abnormalities are judged from data that departs from the established trend or falls outside the control limit lines. Control charts are classified into the Shewhart individuals control chart and the cumulative sum (CUSUM) control chart (ISO 7870-4).
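As a simplified sketch (simulated measurements, and the sample standard deviation in place of the moving-range estimate that individuals charts normally use), control limits can be set at the process mean plus or minus three standard deviations, and points outside them flagged as possible special causes:

```python
import random
import statistics

random.seed(5)

# Simulated hourly measurements from a process; the final value contains a shift.
data = [random.gauss(50, 2) for _ in range(29)] + [62.0]

mean = statistics.mean(data)
sigma = statistics.stdev(data)     # simplified; real individuals charts estimate
                                   # sigma from the average moving range
ucl = mean + 3 * sigma             # upper control limit
lcl = mean - 3 * sigma             # lower control limit

for hour, value in enumerate(data, start=1):
    if value > ucl or value < lcl:
        print(f"hour {hour}: {value:.1f} is outside [{lcl:.1f}, {ucl:.1f}] "
              f"-> investigate a possible special cause")
```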

Taguchi methods are statistical methods, sometimes called robust design methods, developed by Genichi Taguchi to improve the quality of manufactured goods, and more recently also applied to engineering, biotechnology, marketing and advertising. Professional statisticians have welcomed the goals and improvements brought about by Taguchi methods, particularly by Taguchi's development of designs for studying variation, but have criticized the inefficiency of some of Taguchi's proposals.

Statistical process control (SPC) or statistical quality control (SQC) is the application of statistical methods to monitor and control the quality of a production process. This helps to ensure that the process operates efficiently, producing more specification-conforming products with less waste (rework or scrap). SPC can be applied to any process where the "conforming product" output can be measured. Key tools used in SPC include run charts, control charts, a focus on continuous improvement, and the design of experiments. An example of a process where SPC is applied is manufacturing lines.

Mathematical statistics – Branch of statistics

Mathematical statistics is the application of probability theory, a branch of mathematics, to statistics, as opposed to techniques for collecting statistical data. Specific mathematical techniques which are used for this include mathematical analysis, linear algebra, stochastic analysis, differential equations, and measure theory.

In the statistical theory of the design of experiments, blocking is the arranging of experimental units that are similar to one another into groups (blocks) based on one or more variables. These variables are chosen carefully to minimize the impact of their variability on the observed outcomes. There are different ways that blocking can be implemented, resulting in different confounding effects. However, the different methods share the same purpose: to control variability introduced by specific factors that could influence the outcome of an experiment. The roots of blocking originated with the statistician Ronald Fisher, following his development of ANOVA.
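A minimal sketch of how blocking might be set up (the fields, plots, and fertilizers are invented for illustration): experimental units are grouped into blocks by a nuisance variable, and treatments are randomized separately within each block:

```python
import random

random.seed(11)

# Hypothetical experiment: compare fertilizers A and B across three fields.
# Each field is a block of four plots; randomizing within blocks keeps
# field-to-field differences from biasing the A-versus-B comparison.
TREATMENTS = ["A", "A", "B", "B"]
blocks = {f"field_{i}": list(TREATMENTS) for i in range(1, 4)}

for block, assignment in blocks.items():
    random.shuffle(assignment)      # randomize treatments within this block only
    print(block, assignment)
```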

This glossary of statistics and probability is a list of definitions of terms and concepts used in the mathematical sciences of statistics and probability, their sub-disciplines, and related fields. For additional related terms, see Glossary of mathematics and Glossary of experimental design.

Analytic and enumerative statistical studies are two types of scientific studies: an enumerative study aims to estimate or describe characteristics of a fixed, existing population, while an analytic study aims to predict or improve the future performance of a process.

Replication (statistics) – Principle that variation can be better estimated with nonvarying repetition of conditions

In engineering, science, and statistics, replication is the process of repeating a study or experiment under the same or similar conditions to support the original claim, which is crucial for confirming the accuracy of the results as well as for identifying and correcting flaws in the original experiment. ASTM, in standard E1847, defines replication as "... the repetition of the set of all the treatment combinations to be compared in an experiment. Each of the repetitions is called a replicate."

Repeated measures design is a research design that involves multiple measures of the same variable taken on the same or matched subjects either under different conditions or over two or more time periods. For instance, repeated measurements are collected in a longitudinal study in which change over time is assessed.
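For illustration (the reaction times are made up), the simplest repeated-measures comparison, the same subjects measured under two conditions, can be analyzed as a paired t-test with SciPy:

```python
from scipy.stats import ttest_rel

# Made-up reaction times (ms) for the same five subjects under two conditions.
condition_1 = [312, 298, 305, 320, 290]
condition_2 = [301, 285, 299, 310, 282]

t_stat, p_value = ttest_rel(condition_1, condition_2)
print(f"t = {t_stat:.2f}, p = {p_value:.4f}")
# Pairing by subject removes between-subject variability from the comparison.
```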

In statistics education, informal inferential reasoning refers to the process of making a generalization based on data (samples) about a wider universe (population or process) while taking into account uncertainty, without using formal statistical procedures or methods.
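In the classroom such reasoning is often supported by simple simulation rather than formal tests; the sketch below (with an invented class survey) resamples the observed data to show how much a sample proportion could plausibly vary:

```python
import random

random.seed(2)

# Invented class survey: 18 of 30 students prefer option X.
responses = [1] * 18 + [0] * 12

# Resample with replacement many times to express, informally, how uncertain
# a generalization from this one sample to the wider population would be.
proportions = sorted(
    sum(random.choices(responses, k=len(responses))) / len(responses)
    for _ in range(1000)
)

print(f"observed proportion = {18 / 30:.2f}")
print(f"middle 95% of resampled proportions: "
      f"{proportions[25]:.2f} to {proportions[974]:.2f}")
```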

References

  1. Poldrack, Russell A. (2019-12-01). "Statistical Thinking for the 21st Century (Poldrack)". Statistics LibreTexts. Retrieved 2024-02-02.
  2. Wai, Jonathan (2014-08-05). "The case for starting statistics education in kindergarten". Quartz. Retrieved 2024-02-02.
  3. "Red Bead Experiment - The W. Edwards Deming Institute". deming.org/. Retrieved 2023-06-13.
  4. Marriott, Nigel (2014-12-01). "The Future of Statistical Thinking". Significance. 11 (5): 78–80. doi:10.1111/j.1740-9713.2014.00787.x. ISSN 1740-9705. OCLC 5706565999 – via Oxford Academic.
  5. Denis, Daniel J. (2017-04-24). "How statistical thinking should shape the courtroom". The Conversation. Retrieved 2024-02-02.
  6. Fay, David S.; Gerow, Ken (2018), "A biologist's guide to statistical thinking and analysis", WormBook: The Online Review of C. elegans Biology [Internet], WormBook, PMID 23908055, retrieved 2024-02-02.