Proportionator

The proportionator is the most efficient unbiased stereological method currently described for estimating population size from samples.

A typical application is counting the number of cells in an organ. The proportionator is related to the optical fractionator and physical dissector methods, which also estimate population size. Both of those methods use a sampling scheme called systematic uniform random sampling, or SURS. Unlike these two methods, the proportionator samples with probability proportional to size, or PPS. With SURS, all sampling sites are sampled with the same probability; with PPS, they are not. The reason for using PPS is to improve the efficiency of the estimation process.

Efficiency is the notion of how much is gained for a given amount of work: a more efficient method provides better results for the same effort. The proportionator provides a more precise estimate than either the optical fractionator or the physical dissector. The PPS scheme is implemented by assigning a value, called the characteristic, to each sampling site. If the characteristic is constant, i.e. the same for all sampling sites, the proportionator reduces to the optical fractionator. In actual sampling, the characteristic varies across the tissue being studied, and information about its distribution is used to refine the sampling. The greater the variance of the characteristic, the greater the efficiency of the proportionator. For the stereologist the implication is simple: if more and more counting is needed to reach the CE required for publication, switching to the proportionator can reach that CE without the extra counting.

The proportionator is a patented process that is not generally available. The only current licensee for the patent is Visiopharm.

Introduction

The proportionator is the de facto standard method for counting cells in large projects. Its increased efficiency makes more work-intensive methods, such as the optical fractionator, less attractive except in small projects.

A common misconception in the stereological literature is that design-based methodologies require that all objects of interest have the same probability of being selected. Such a design decision does ensure an unbiased result, but it is not necessary. Nonuniform sampling is common in stereological work. The point-sampled intercepts method, for example, selects cells using a point probe; the result is a volume-weighted, yet unbiased, estimate of cell size.

A sampling method known as probability proportional to size, or PPS, selects objects based on a characteristic that differs between objects. Examples include selecting trees based on their diameter, or cells based on their volume. The point-sampled intercepts (PSI) method selects cells with points. DeVries estimators select trees with lines. Sections select objects based on their height. In each case a probe selects objects with varying probability. In these examples the characteristic is a function of the objects themselves, but that does not have to be the case.
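
As an illustration of how PPS yields a weighted estimate (a sketch, with notation assumed here rather than taken from the PSI literature): if a point probe hits a cell with probability proportional to its volume v, then the mean volume of the sampled cells estimates the volume-weighted mean volume

    \bar{v}_V = \frac{\mathrm{E}[v^2]}{\mathrm{E}[v]}

which coincides with the ordinary number-weighted mean E[v] only when all cells have the same volume.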

The proportionator applies PPS to counting cells. Here PPS is employed to gain sampling efficiency, not to produce a weighted estimate such as a volume-weighted one. The optical fractionator is the older standard for estimating the number of cells in an unbiased manner. The optical fractionator, like any sampling method, carries some statistical uncertainty: even though the result is unbiased, the sampling has variance. The efficiency of the sampling can be assessed with the coefficient of error, or CE, which describes the variance of the sampling method. Often, biological sampling is done at a CE of 0.05.
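
A common convention (the precise CE estimator varies across the stereological literature) defines the CE of an estimate \hat{N} as its relative standard error,

    \mathrm{CE}(\hat{N}) = \frac{\sqrt{\mathrm{Var}(\hat{N})}}{\mathrm{E}(\hat{N})}

so a CE of 0.05 means the sampling noise is about 5% of the estimated value.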

The efficiency of a sampling method refers to the amount of work it takes to obtain a desired CE. A more efficient method requires less work to reach that CE; equivalently, for the same amount of work it yields a smaller CE.

Suppose that every sample always gave the same result. There would be no difference between samples, so the variance, and hence the CE, would be 0, and no more than one sample would be required to obtain a good result. (This might still not be efficient if each sample requires a great deal of work and there is no need for a CE this low.) If samples differ, then the variance is positive, and so is the CE.
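
A minimal sketch of this relationship in Python (purely illustrative; in practice the CE of a single stereological study is predicted from within-sample formulas rather than by repeating the whole experiment):

    import statistics

    def coefficient_of_error(estimates):
        """Relative standard error of a set of repeated estimates."""
        return statistics.stdev(estimates) / statistics.mean(estimates)

    print(coefficient_of_error([100, 100, 100, 100]))  # 0.0: no variance
    print(coefficient_of_error([90, 105, 110, 95]))    # ~0.091: positive CE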

The typical method of controlling the CE is to do more counting. The literature on the optical fractionator recommends ways of deciding where to increase the workload: more slices, or more optical dissectors. In keeping with this approach, effort has gone into automated image acquisition and counting to speed up the process. The proportionator provides a better result by avoiding additional counting altogether.

Plotless sampling

One of the earliest stereological methods that employed PPS was introduced by Walter Bitterlich in 1939 to improve the efficiency of fieldwork in the forest sciences. Bitterlich developed a sampling method that revolutionized the forest sciences. Up to that time the sampling quadrat method proposed by Pound and Clements in 1898 was still in use. Laying out sampling quadrats at each sampling site was at times difficult due to the physical obstructions of the natural world. Besides the physical issues, it was also a costly procedure: it took a considerable amount of time to lay out a rectangle and to measure the trees included in the quadrat. Bitterlich realized that PPS could be used in the field and proposed the use of a sampling angle: all of the trees subtending more than a fixed angle at a sampling point would be counted. The quadrat, or plot as it was often called, was no longer required.

The quantity being estimated by the researchers was tree volume. The original sampling method was to choose a number of sampling points. The researcher traveled to each sampling point, and a quadrat, a rectangular sampling area, was laid out there. Measurements of the trees in the quadrats, such as basal area, were used to estimate tree volume.

Bitterlich's method was to choose a number of sampling points. The researcher traveled to each sampling point just as in the quadrat method. At each sampling point the researcher used an angle gauge to see if a tree had a larger apparent angle than the gauge. If so, the tree was counted. No quadrat and no measurements! Just count and go. The result of this procedure was an estimate of tree volume.
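
A minimal sketch of the angle-count idea in Python (the function names, example numbers, and per-hectare constant are illustrative assumptions, not Bitterlich's own notation): a tree is counted when its stem subtends a larger angle at the observer than the gauge, i.e. when its diameter-to-distance ratio exceeds a threshold, and the count times a basal area factor estimates basal area per unit land area.

    import math

    def angle_count(trees, gauge_angle_rad):
        """Count the trees that subtend more than the gauge angle.

        trees: (diameter_m, distance_m) pairs as seen from the sampling
        point. A stem of diameter d at distance r subtends an angle of
        about 2*asin(d / (2*r)), so it is counted when d/r exceeds
        2*sin(gauge_angle_rad / 2).
        """
        threshold = 2.0 * math.sin(gauge_angle_rad / 2.0)
        return sum(1 for d, r in trees if d / r > threshold)

    def basal_area_factor(gauge_angle_rad):
        """Basal area (m^2 per hectare) credited to each counted tree."""
        k = 2.0 * math.sin(gauge_angle_rad / 2.0)
        return 10_000.0 * (k / 2.0) ** 2

    # Hypothetical stand: a gauge with d/r threshold 0.02 has a basal
    # area factor of 1, so the estimate is simply the count.
    trees = [(0.40, 8.0), (0.25, 15.0), (0.60, 12.0)]
    gauge = 2.0 * math.asin(0.02 / 2.0)
    print(angle_count(trees, gauge) * basal_area_factor(gauge))  # 2.0 m^2/ha

A wider gauge angle counts fewer trees but credits each counted tree with a larger basal area factor; the estimate remains unbiased either way.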

Lou Grosenbaugh realized the importance of Bitterlich's work and wrote a number of articles describing the method. Soon a host of devices, from the angle gauge to the relascope to the sampling prism, were developed. The Bitterlich method, employing PPS, and these devices profoundly increased the efficiency of fieldwork.

The proportionator reduces the workload by avoiding the expense of increased counting. The efficiency increase is attained by employing PPS. Efforts to automate the counting process attack the variance problem at the wrong level of sampling. The better solution is to reduce the workload before going to the counting step. The optimal situation is to have all samples providing identical counts. The next best situation is to reduce the difference between samples.

The proportionator adjusts the sampling scheme to select samples that are likely to provide estimates that have a smaller difference. Thus the variance of the estimator is addressed without changing the workload. That results in a gain in efficiency due to the reduction in variance for a given cost.

The main steps in sampling biological tissue are:

  1. Selection of a set of animals
  2. Selection of tissue, usually organs from the animals in step 1
  3. Sampling of the organs by means such as slabbing or cutting bars from the organs in step 2
  4. Selecting a sample of the slices produced from the material in step 3
  5. Selection of sampling sites on slices from step 4
  6. Sampling in an optical dissector within the sampling sites chosen in step 5

The typical attempt at increasing efficiency targets the counting, which occurs in step 6. The proportionator instead adjusts the sampling at step 5. This is accomplished by assigning a characteristic to each sampling site. Since every sampling site is viewed, an automated system can make a visual record of it, and the image collected at each site is used to determine a value for the site. These values form the characteristic. Recall that the characteristic may, but need not, be a function of the objects being counted. The potential sampling sites are then sampled based on the observed characteristic, as sketched below. Sites are chosen in a non-uniform, yet still unbiased, manner. Not only is the result unbiased, it is also not weighted by the characteristic. The end result is that the difference between samples is reduced, which reduces the variance and therefore the workload.
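
A minimal sketch of this site-level PPS step in Python, under illustrative assumptions (sampling with replacement, one scalar characteristic per site; the function names are hypothetical, not Visiopharm's API): sites are drawn with probability proportional to their characteristic, and each count is divided by its site's selection probability, which keeps the estimate unweighted and unbiased.

    import random

    def proportionator_estimate(characteristics, count_site, n_draws, rng=random):
        """PPS sketch: draw sites with probability proportional to their
        characteristic, then divide each observed count by its selection
        probability so the total remains unbiased (Hansen-Hurwitz form).

        characteristics: one non-negative value per potential site
        count_site(i):   performs the (expensive) count at site i
        """
        total = sum(characteristics)
        probs = [z / total for z in characteristics]
        draws = rng.choices(range(len(probs)), weights=probs, k=n_draws)
        return sum(count_site(i) / probs[i] for i in draws) / n_draws

    # Toy demonstration: when the characteristic tracks the true counts,
    # every ratio count_i / p_i is identical, so the variance is zero.
    z = [5.0, 1.0, 8.0, 2.0]      # e.g. stain intensity per site
    counts = [50, 10, 80, 20]     # cells actually present per site
    print(proportionator_estimate(z, lambda i: counts[i], n_draws=3))
    # 160.0 on every run: exactly sum(counts)

When the characteristic tracks the true counts, the corrected counts are nearly constant across draws, which is exactly the zero-variance situation described earlier; when the characteristic is constant, the selection probabilities are uniform and the scheme reduces to uniform sampling.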

Experimental evidence demonstrates that the proportionator significantly reduces the variance between samples, especially in situations where the tissue distribution is heterogeneous. This means that the situations where it is harder to reduce the variance, or improve the CE, are just the situations where the proportionator excels. Another way to look at this is that the proportionator is designed to take the CE reduction issue out of the hands of the researcher.

Suppose that the goal is a CE of 0.05. If the CE is larger than that value, then the only option available in the optical fractionator method is to increase the counting, by using either more slices or more sampling sites on the slices. The proportionator is able to adjust the sampling to decrease the CE without increasing the counting. In fact, if the proportionator reduces the CE below 0.05, the counting workload can be cut back and the CE allowed to rise to the 0.05 requirement.
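
As a rough rule of thumb (assuming the CE falls off roughly as one over the square root of the number of sites counted, which holds only approximately in practice), the workload needed to hit a target CE scales as

    n_\text{target} \approx n_\text{current} \left( \frac{\mathrm{CE}_\text{current}}{\mathrm{CE}_\text{target}} \right)^{2}

so a run that lands at a CE of 0.035 instead of 0.05 could, in principle, cut the site count roughly in half, since (0.035/0.05)^2 \approx 0.49, while still meeting the requirement.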

PPS revolutionized the forestry sciences. The application of PPS to cell counting makes larger scale research projects possible, while saving time and reducing expenses.
