Probability-proportional-to-size sampling

In survey methodology, probability-proportional-to-size (pps) sampling is a sampling process in which each element $i$ of the population (of size $N$) has some (independent) chance $p_i$ of being selected into the sample in a single draw. This $p_i$ is proportional to some known quantity $x_i$, so that $p_i = \frac{x_i}{\sum_{j=1}^{N} x_j}$. [1]:97 [2]

One case in which this arises, as developed by Hansen and Hurwitz in 1943, [3] is cluster sampling: when there are several clusters, each containing a different number of units that is known in advance, each cluster can be selected with a probability proportional to the number of units it contains. [4]:250 For example, with 3 clusters of 10, 20 and 30 units, the chance of selecting the first cluster is 1/6, the second 1/3, and the third 1/2.
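
The cluster example can be checked with a few lines of Python. This is only an illustrative sketch: the function name `pps_probabilities` is hypothetical, and exact fractions are used so that the results can be compared directly with the figures in the text.

```python
from fractions import Fraction

def pps_probabilities(sizes):
    """Return single-draw selection probabilities proportional to `sizes`.

    `sizes` is assumed to be a non-empty list of positive integers
    (for example, the number of units in each cluster).
    """
    if not sizes or any(s <= 0 for s in sizes):
        raise ValueError("sizes must be a non-empty list of positive integers")
    total = sum(sizes)
    # p_i = x_i / sum_j x_j, kept as an exact fraction
    return [Fraction(s, total) for s in sizes]

cluster_sizes = [10, 20, 30]
probs = pps_probabilities(cluster_sizes)
print(probs)  # [Fraction(1, 6), Fraction(1, 3), Fraction(1, 2)]
```

The probabilities sum to 1 by construction, since every size is divided by the same total.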

pps sampling results in a fixed sample size n (as opposed to Poisson sampling, which is similar but results in a random sample size with expectation n). When selecting items with replacement, the procedure is simply to draw one item at a time (like taking n draws from a multinomial distribution over N elements, each with its own selection probability). When sampling without replacement, the scheme can become more complex. [1]:93
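
The contrast between the two designs can be sketched as follows. The function names are hypothetical, and the Poisson variant assumes the common choice of inclusion probability n * x_i / sum(x), which the text above does not spell out; it is one way to get an expected sample size of n.

```python
import random

def pps_with_replacement(sizes, n, rng):
    """Draw n indices with replacement.

    Each draw picks index i with probability sizes[i] / sum(sizes),
    so the sample always contains exactly n draws (repeats allowed).
    """
    return rng.choices(range(len(sizes)), weights=sizes, k=n)

def poisson_sample(sizes, n, rng):
    """Include each index independently of the others.

    Assumption: inclusion probability n * sizes[i] / sum(sizes).
    The sample size is random, with expected value n.
    """
    total = sum(sizes)
    inclusion = [n * s / total for s in sizes]
    if any(p > 1 for p in inclusion):
        raise ValueError("an inclusion probability exceeds 1; reduce n")
    return [i for i, p in enumerate(inclusion) if rng.random() < p]

rng = random.Random(0)           # seeded only to make the run repeatable
sizes = [10, 20, 30, 40, 50, 50]
fixed = pps_with_replacement(sizes, 3, rng)   # always 3 draws
varying = poisson_sample(sizes, 3, rng)       # anywhere from 0 to 6 distinct indices
```

Running `poisson_sample` repeatedly gives samples of different lengths, whereas `pps_with_replacement` always returns exactly n indices, possibly with repeats.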

A related method is weighted reservoir sampling ("Weighted random sampling with a reservoir"), which offers an algorithm for drawing a weighted random sample of size m from a population of n weighted items, where m ≤ n, in one pass over a population whose size is not known in advance. [5]
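
A minimal sketch of one such one-pass scheme is given below, assuming the key-based approach associated with [5]: each item receives the key u^(1/w), with u uniform on [0, 1) and w its weight, and the m items with the largest keys are kept. The function name and the (item, weight) stream format are illustrative choices, not taken from the source.

```python
import heapq
import random

def weighted_reservoir_sample(stream, m, rng=None):
    """One-pass weighted sample of up to m items, without replacement.

    `stream` yields (item, weight) pairs with weight > 0; its length
    does not need to be known in advance.
    """
    if m <= 0:
        return []
    rng = rng or random.Random()
    heap = []  # min-heap of (key, counter, item); the smallest key sits at heap[0]
    for counter, (item, weight) in enumerate(stream):
        if weight <= 0:
            raise ValueError("weights must be positive")
        key = rng.random() ** (1.0 / weight)  # larger weights tend to give larger keys
        entry = (key, counter, item)          # counter breaks ties without comparing items
        if len(heap) < m:
            heapq.heappush(heap, entry)
        elif key > heap[0][0]:
            heapq.heapreplace(heap, entry)    # evict the current smallest key
    return [item for _, _, item in heap]

items = [("a", 1), ("b", 2), ("c", 3), ("d", 4)]
sample = weighted_reservoir_sample(iter(items), 2, random.Random(1))
```

Only m entries are held in memory at any time, and each new item costs at most O(log m) heap work. The returned items are distinct, so this is a without-replacement design rather than the with-replacement draws described earlier.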

Distribution and properties

If observations from some distribution F are sampled in a way that is proportional to their value, then the distribution of the values in that sample follows a length-biased distribution, with the following density function: $f_L(x) = \frac{x f(x)}{\mu}$, where $f$ is the density of F and $\mu = E[X]$ is its mean. [6]:2 [7]

Also:

Note that this assumes the pps sampling is done with replacement (or that the sample size is much smaller than the population size).
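
The effect of length bias can be illustrated on a small finite population. The sketch below relies on a standard property that is not stated in the text above: a single draw made with probability proportional to value has expected value E[X²]/E[X], which equals μ + σ²/μ. The function name and the example values are hypothetical.

```python
from fractions import Fraction

def size_biased_mean(values):
    """Expected value of one draw taken with probability proportional to value.

    `values` is assumed to be a list of positive integers.
    """
    total = sum(values)
    # sum_i p_i * x_i with p_i = x_i / sum_j x_j
    return sum(Fraction(v, total) * v for v in values)

values = [1, 2, 3]
n = len(values)
mu = Fraction(sum(values), n)                  # population mean: 2
var = sum((v - mu) ** 2 for v in values) / n   # population variance: 2/3

biased = size_biased_mean(values)
print(biased)          # 7/3
print(mu + var / mu)   # 7/3
```

The size-biased mean (7/3) exceeds the population mean (2) because larger values are more likely to be drawn.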

References

  1. Särndal, Carl-Erik; Swensson, Bengt; Wretman, Jan (1992). Model Assisted Survey Sampling. Springer. ISBN 978-0-387-97528-3.
  2. Skinner, Chris J. (2016). "Probability Proportional to Size (PPS) Sampling". Wiley StatsRef: Statistics Reference Online. pp. 1–5. doi:10.1002/9781118445112.stat03346.pub2. ISBN 978-1-118-44511-2.
  3. Hansen, Morris H.; Hurwitz, William N. (1943). "On the Theory of Sampling from Finite Populations". The Annals of Mathematical Statistics. 14 (4): 333–362. doi:10.1214/aoms/1177731356. JSTOR 2235923.
  4. Cochran, W. G. (1977). Sampling Techniques (3rd ed.). Nashville, TN: John Wiley & Sons. ISBN 978-0-471-16240-7.
  5. Efraimidis, Pavlos S.; Spirakis, Paul G. (March 2006). "Weighted random sampling with a reservoir". Information Processing Letters. 97 (5): 181–185. doi:10.1016/j.ipl.2005.11.003.
  6. Mustafa, Abdelfattah; Khan, M. I. (June 2022). "The length-biased power hazard rate distribution: Some properties and applications". Statistics in Transition. New Series. 23 (2): 1–16. doi:10.2478/stattrans-2022-0013. hdl:10419/266304.
  7. Lee, Kyeongjun (30 November 2024). "Estimation of length biased exponential distribution based on progressive hybrid censoring". Communications for Statistical Applications and Methods. 31 (6): 661–675. doi:10.29220/CSAM.2024.31.6.661.