Nonprobability sampling

Sampling is the use of a subset of a population to represent the whole population, or to inform about (social) processes that are meaningful beyond the particular cases, individuals, or sites studied. Probability sampling, or random sampling, is a sampling technique in which the probability of obtaining any particular sample can be calculated. Nonprobability sampling does not meet this criterion, and its techniques are not intended to support statistical inference from the sample to the general population. Researchers may nevertheless prefer nonprobability sampling when external validity is not of critical importance to the study's goals or purpose. Instead of statistical generalization, such sampling supports other modes of inquiry; for example, grounded theory can be produced through iterative nonprobability sampling until theoretical saturation is reached (Strauss and Corbin, 1990).
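The defining criterion above can be illustrated with a short simulation. All numbers below (a hypothetical population of 100 units, a sample size of 10) are assumptions for illustration: under simple random sampling every unit's inclusion probability is calculable in advance (here n/N = 0.10), whereas a convenience sample gives each unit an inclusion probability of 1 or 0, so no design-based inference applies.

```python
import random

random.seed(42)

population = list(range(100))  # hypothetical population of 100 units
n = 10                         # sample size
trials = 20_000

# Simple random sampling: the design gives every unit a calculable
# inclusion probability of n/N = 0.10, which we verify empirically.
counts = [0] * len(population)
for _ in range(trials):
    for unit in random.sample(population, n):
        counts[unit] += 1
inclusion = [c / trials for c in counts]  # each entry close to 0.10

# Convenience sampling ("the 10 units closest to hand"): each unit's
# inclusion probability is 1 or 0, not a calculable design probability.
convenience = population[:n]
```

Repeating the draw many times shows every unit appearing at roughly the design rate of 0.10; the convenience sample, by contrast, always contains the same units.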

Thus, one cannot make the same claims on the basis of a nonprobability sample as on the basis of a probability sample. The grounds for drawing generalizations (e.g., proposing new theory or policy) from studies based on nonprobability samples rest on the notions of "theoretical saturation" and "analytical generalization" (Yin, 2014) rather than on statistical generalization.

Researchers working with the notion of purposive sampling assert that while probability methods are suitable for large-scale studies concerned with representativeness, nonprobability approaches are more suitable for in-depth qualitative research in which the focus is often to understand complex social phenomena (e.g., Marshall 1996; Small 2009). One advantage of nonprobability sampling is its lower cost compared to probability sampling. Moreover, the in-depth analysis of a small-N purposive sample or a case study enables the "discovery" and identification of patterns and causal mechanisms without resting on time- and context-free assumptions.

Nonprobability sampling is often not appropriate in statistical quantitative research, though, as these assertions raise some questions: how can one understand a complex social phenomenon by drawing only the most convenient expressions of that phenomenon into consideration? What assumption of homogeneity in the world must one make to justify such assertions? By contrast, the view that research can only be based on statistical inference focuses on the problems of bias linked to nonprobability sampling and acknowledges only one situation in which a nonprobability sample can be appropriate: if one is interested only in the specific cases studied (for example, the Battle of Gettysburg), one does not need to draw a probability sample from similar cases (Lucas 2014a).

Nonprobability sampling is, however, widely used in qualitative research. Examples of nonprobability sampling include convenience sampling, quota sampling, purposive (judgmental) sampling, and snowball sampling.

Studies intended to use probability sampling sometimes end up using nonprobability samples because of characteristics of the sampling method. For example, using a sample of people in the paid labor force to analyze the effect of education on earnings is to use a nonprobability sample of persons who could be in the paid labor force. Because the education people obtain can affect their likelihood of being in the paid labor force, a sample drawn only from the paid labor force is a nonprobability sample for the question at issue, and results based on it are biased.
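A minimal simulation sketches this bias, under assumed (purely illustrative) numbers: education raises earnings by 2 units per year, and only people whose realized earnings exceed a reservation wage appear in the labor-force sample. Because selection depends on the outcome, the regression slope in the selected sample is attenuated relative to the true effect. (Note that selection on education alone would not bias the slope; it is selection correlated with the outcome that does.)

```python
import random

random.seed(0)

def ols_slope(x, y):
    """Ordinary least squares slope of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var = sum((a - mx) ** 2 for a in x)
    return cov / var

N = 50_000
# Assumed data-generating process (illustrative, not empirical):
# each year of education adds 2 units of earnings, plus noise.
educ = [random.gauss(12, 3) for _ in range(N)]
earnings = [2.0 * e + random.gauss(0, 5) for e in educ]

# Labor-force participation depends on earnings: only people whose
# potential earnings exceed a reservation wage enter the sample.
in_labor_force = [y > 24 for y in earnings]

full_slope = ols_slope(educ, earnings)  # close to the true value, 2.0
sel_educ = [e for e, w in zip(educ, in_labor_force) if w]
sel_earn = [y for y, w in zip(earnings, in_labor_force) if w]
biased_slope = ols_slope(sel_educ, sel_earn)  # attenuated, well below 2.0
```

Running the sketch, the full-population regression recovers the assumed effect, while the labor-force-only regression understates it, exactly the sense in which the selected sample is a nonprobability sample for this question.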

The statistical model one uses can also render the data a nonprobability sample. For example, Lucas (2014b) notes that several published studies that use multilevel modeling have been based on samples that are probability samples in general, but nonprobability samples for one or more of the levels of analysis in the study. Evidence indicates that in such cases the bias is poorly behaved, such that inferences from such analyses are unjustified.

These problems occur in the academic literature, but they may be more common in non-academic research. For example, in public opinion polling by private companies (or other organizations unable to require response), the sample can be self-selected rather than random. This often introduces an important type of error, self-selection bias, in which a potential participant's willingness to volunteer for the sample may be determined by characteristics such as submissiveness or availability. The samples in such surveys should be treated as nonprobability samples of the population, and the validity of the findings based on them is unknown and cannot be established.
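Self-selection bias of this kind is easy to demonstrate in a toy poll. The numbers here are assumptions for illustration: half the population supports a hypothetical measure, and supporters are posited to be three times as likely to volunteer a response as opponents.

```python
import random

random.seed(1)

N = 100_000
# Hypothetical referendum: half of the population is in favor.
supports = [random.random() < 0.50 for _ in range(N)]

# Self-selection: supporters are assumed (for illustration) to be three
# times as likely to volunteer a response as opponents.
responds = [random.random() < (0.30 if s else 0.10) for s in supports]

respondents = [s for s, r in zip(supports, responds) if r]
poll_estimate = sum(respondents) / len(respondents)  # close to 0.75
true_share = sum(supports) / N                       # close to 0.50
```

The self-selected poll reports roughly 75% support for a measure that 50% of the population favors, and nothing in the poll's own data reveals the size of the error.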

