Proportional reduction in loss

Proportional reduction in loss (PRL) is a general framework for developing and evaluating measures of the reliability of observation procedures that may be subject to errors of any kind. Such measures quantify how much having the observations available reduces the loss (cost) of uncertainty about the intended quantity, compared with not having those observations.
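
In generic notation (the symbols below are illustrative and are not taken from the cited papers), a PRL measure compares the expected loss without the observations with the expected loss once they are available:

```latex
\mathrm{PRL} \;=\; \frac{\operatorname{E}[L_{\text{without}}] - \operatorname{E}[L_{\text{with}}]}{\operatorname{E}[L_{\text{without}}]}
```

Its value is 1 when the observations remove all of the expected loss and 0 when they do not reduce it at all.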

Proportional reduction in error is a more restrictive framework widely used in statistics, in which the general loss function is replaced by a more direct measure of error such as the mean square error.[citation needed] Examples are the coefficient of determination and Goodman and Kruskal's lambda.[1]
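
As a concrete illustration of the proportional-reduction-in-error idea, the coefficient of determination can be written as the proportional reduction in squared-error loss when a model's predictions replace the baseline prediction of the sample mean. A minimal Python sketch (function and variable names are illustrative, and the data are simulated):

```python
import numpy as np

def coefficient_of_determination(y, y_pred):
    """R^2 as a proportional reduction in squared-error loss."""
    y = np.asarray(y, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    baseline_loss = np.sum((y - y.mean()) ** 2)  # error when every value is predicted by the mean
    model_loss = np.sum((y - y_pred) ** 2)       # error of the model's predictions
    return 1.0 - model_loss / baseline_loss

# Example: proportional reduction in error achieved by a straight-line fit
rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=2.0, size=x.size)
slope, intercept = np.polyfit(x, y, 1)
print(coefficient_of_determination(y, slope * x + intercept))
```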

The concept of proportional reduction in loss was proposed by Bruce Cooil and Roland T. Rust in their 1994 paper "Reliability and expected loss: A unifying principle".[2] Many commonly used reliability measures for quantitative data (such as continuous data in an experimental design) are PRL measures, including Cronbach's alpha and measures proposed by Ben J. Winer in 1971.[3] The framework also provides a general way of developing measures for the reliability of qualitative data. For example, it yields several possible measures that are applicable when a researcher wants to assess the consensus between judges who are asked to code a number of items into mutually exclusive qualitative categories.[4] Measures of this latter type have been proposed by several researchers, including Perreault and Leigh in 1989.[5]
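
For the qualitative-coding setting just described, the raw ingredient of such reliability measures is the observed agreement among judges. The sketch below computes simple pairwise proportion agreement between two judges; it is only an illustration of the input data, not the Cooil–Rust or Perreault–Leigh estimator itself, and the judges and codings are hypothetical:

```python
# Hypothetical codings of ten items by two judges into categories "A", "B", "C"
judge_1 = ["A", "B", "B", "C", "A", "A", "C", "B", "A", "C"]
judge_2 = ["A", "B", "C", "C", "A", "B", "C", "B", "A", "C"]

def proportion_agreement(codes_1, codes_2):
    """Fraction of items that the two judges place in the same category."""
    matches = sum(c1 == c2 for c1, c2 in zip(codes_1, codes_2))
    return matches / len(codes_1)

print(proportion_agreement(judge_1, judge_2))  # 0.8
```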

Related Research Articles

<span class="mw-page-title-main">Median</span> Middle quantile of a data set or probability distribution

In statistics and probability theory, the median is the value separating the higher half from the lower half of a data sample, a population, or a probability distribution. For a data set, it may be thought of as "the middle" value. The basic feature of the median in describing data compared to the mean is that it is not skewed by a small proportion of extremely large or small values, and therefore provides a better representation of a "typical" value. Median income, for example, may be a better way to suggest what a "typical" income is, because income distribution can be very skewed. The median is of central importance in robust statistics, as it is the most resistant statistic, having a breakdown point of 50%: so long as no more than half the data are contaminated, the median is not an arbitrarily large or small result.
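
A small Python illustration of this resistance (the income values are made up): a single extreme value pulls the mean far upward but leaves the median untouched.

```python
import statistics

incomes = [28_000, 31_000, 34_000, 36_000, 40_000, 45_000, 1_000_000]

print(statistics.mean(incomes))    # pulled far upward by the single extreme income
print(statistics.median(incomes))  # 36000, the middle value, unaffected by the outlier
```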

<span class="mw-page-title-main">Outlier</span> Observation far apart from others in statistics and data science

In statistics, an outlier is a data point that differs significantly from other observations. An outlier may be due to variability in the measurement, an indication of novel data, or the result of experimental error; the latter are sometimes excluded from the data set. An outlier can be an indication of an exciting possibility, but can also cause serious problems in statistical analyses.

Statistical bias is a systematic tendency that causes differences between results and facts. Bias can enter at many stages of data analysis, including the source of the data, the choice of estimator, and the way the data are analysed. Bias may have a serious impact on results; for example, in a survey of people's buying habits, a sample that is too small may not be representative of the buying habits of the whole population, so there may be discrepancies between the survey results and the actual results. Understanding the sources of statistical bias therefore helps in assessing whether the observed results are close to the real results.

Cronbach's alpha, also known as tau-equivalent reliability or coefficient alpha, is a reliability coefficient that provides a method of measuring internal consistency of tests and measures. Numerous studies warn against using it unconditionally, and note that reliability coefficients based on structural equation modeling (SEM) are in many cases a suitable alternative.
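
A minimal sketch of how coefficient alpha is computed from a respondents-by-items score matrix, using the standard formula alpha = k/(k-1) * (1 - sum of item variances / variance of the total score); the response data here are made up:

```python
import numpy as np

def cronbach_alpha(scores):
    """Coefficient alpha for a (respondents x items) array of scores."""
    scores = np.asarray(scores, dtype=float)
    n_items = scores.shape[1]
    item_variances = scores.var(axis=0, ddof=1)      # variance of each item
    total_variance = scores.sum(axis=1).var(ddof=1)  # variance of the summed test score
    return (n_items / (n_items - 1)) * (1.0 - item_variances.sum() / total_variance)

# Hypothetical responses of five people to four Likert-type items
responses = np.array([
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [4, 4, 5, 4],
])
print(cronbach_alpha(responses))
```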

In mathematical optimization and decision theory, a loss function or cost function is a function that maps an event or values of one or more variables onto a real number intuitively representing some "cost" associated with the event. An optimization problem seeks to minimize a loss function. An objective function is either a loss function or its opposite, in which case it is to be maximized. In hierarchical models, the loss function can include terms from several levels of the hierarchy.
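
A brief Python sketch, assuming squared-error loss as the cost (a standard choice, not one mandated by the text above): the constant prediction that minimizes total squared-error loss coincides with the sample mean.

```python
import numpy as np
from scipy.optimize import minimize_scalar

data = np.array([1.0, 2.0, 2.5, 4.0, 10.0])

def squared_error_loss(c):
    """Total cost of predicting the constant c for every observation."""
    return np.sum((data - c) ** 2)

result = minimize_scalar(squared_error_loss)
print(result.x, data.mean())  # the minimizer coincides with the sample mean
```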

<span class="mw-page-title-main">Content analysis</span> Research method for studying documents and communication artifacts

Content analysis is the study of documents and communication artifacts, which might be texts of various formats, pictures, audio or video. Social scientists use content analysis to examine patterns in communication in a replicable and systematic manner. One of the key advantages of using content analysis to analyse social phenomena is its non-invasive nature, in contrast to simulating social experiences or collecting survey answers.

In statistics and research, internal consistency is typically a measure based on the correlations between different items on the same test. It measures whether several items that are intended to measure the same general construct produce similar scores. For example, if a respondent expressed agreement with the statements "I like to ride bicycles" and "I've enjoyed riding bicycles in the past", and disagreement with the statement "I hate bicycles", this would be indicative of good internal consistency of the test.
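
A short sketch of the underlying computation: the inter-item correlation matrix for a respondents-by-items score matrix, with the negatively worded item ("I hate bicycles") recoded so that all items point in the same direction. The ratings and the 5-point scale are illustrative.

```python
import numpy as np

# Hypothetical 1-5 agreement ratings from six respondents on three items;
# the third item ("I hate bicycles") is negatively worded.
ratings = np.array([
    [5, 5, 1],
    [4, 4, 2],
    [2, 3, 4],
    [1, 2, 5],
    [4, 5, 1],
    [3, 3, 3],
], dtype=float)

ratings[:, 2] = 6 - ratings[:, 2]  # reverse-score the negatively worded item

# Correlations between items: high positive values suggest good internal consistency
print(np.round(np.corrcoef(ratings, rowvar=False), 2))
```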

<span class="mw-page-title-main">Giant Metrewave Radio Telescope</span>

The Giant Metrewave Radio Telescope (GMRT), located at Khodad near Narayangaon, Junnar, in the Pune district of India, is an array of thirty fully steerable parabolic radio telescopes of 45 metre diameter, observing at metre wavelengths. It is operated by the National Centre for Radio Astrophysics (NCRA), a part of the Tata Institute of Fundamental Research, Mumbai. It was conceived and built under the direction of the late Prof. Govind Swarup between 1984 and 1996. It is an interferometric array with baselines of up to 25 kilometres (16 mi). It was recently upgraded with new receivers, after which it is also known as the Upgraded Giant Metrewave Radio Telescope (uGMRT).

In robust statistics, robust regression seeks to overcome some limitations of traditional regression analysis. A regression analysis models the relationship between one or more independent variables and a dependent variable. Standard types of regression, such as ordinary least squares, have favourable properties if their underlying assumptions are true, but can give misleading results otherwise. Robust regression methods are designed to limit the effect that violations of assumptions by the underlying data-generating process have on regression estimates.
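
A minimal sketch of the idea, comparing an ordinary least-squares line with a fit under the Huber loss (one common robust choice, here via scipy.optimize.least_squares) on data containing one gross outlier; the data and parameter values are illustrative only.

```python
import numpy as np
from scipy.optimize import least_squares

rng = np.random.default_rng(1)
x = np.linspace(0.0, 10.0, 30)
y = 3.0 * x + 2.0 + rng.normal(scale=1.0, size=x.size)
y[5] += 80.0  # one gross outlier

def residuals(params):
    slope, intercept = params
    return slope * x + intercept - y

# Ordinary least squares: strongly influenced by the outlier
ols_slope, ols_intercept = np.polyfit(x, y, 1)

# Robust fit: the Huber loss down-weights the large residual
robust = least_squares(residuals, x0=[1.0, 0.0], loss="huber", f_scale=1.0)

print("OLS:   ", ols_slope, ols_intercept)
print("Robust:", robust.x)
```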

Robust statistics are statistics with good performance for data drawn from a wide range of probability distributions, especially for distributions that are not normal. Robust statistical methods have been developed for many common problems, such as estimating location, scale, and regression parameters. One motivation is to produce statistical methods that are not unduly affected by outliers. Another motivation is to provide methods with good performance when there are small departures from a parametric distribution. For example, robust methods work well for mixtures of two normal distributions with different standard deviations; under this model, non-robust methods like a t-test work poorly.

A rating scale is a set of categories designed to elicit information about a quantitative or qualitative attribute. In the social sciences, particularly psychology, common examples are the Likert response scale and 1-10 rating scales, in which a person selects the number that they consider to reflect the perceived quality of a product.

In statistics, confirmatory factor analysis (CFA) is a special form of factor analysis, most commonly used in social science research. It is used to test whether measures of a construct are consistent with a researcher's understanding of the nature of that construct. As such, the objective of confirmatory factor analysis is to test whether the data fit a hypothesized measurement model. This hypothesized model is based on theory and/or previous analytic research. CFA was first developed by Jöreskog (1969) and has built upon and replaced older methods of analyzing construct validity such as the MTMM Matrix as described in Campbell & Fiske (1959).

In statistics, the t-statistic is the ratio of the departure of the estimated value of a parameter from its hypothesized value to its standard error. It is used in hypothesis testing via Student's t-test: the t-statistic determines whether to support or reject the null hypothesis. It is very similar to the z-score, but the t-statistic is used when the sample size is small or the population standard deviation is unknown. For example, the t-statistic is used in estimating the population mean from a sampling distribution of sample means if the population standard deviation is unknown. It is also reported alongside the p-value when running hypothesis tests, where the p-value is the probability of obtaining results at least as extreme as those observed, assuming the null hypothesis is true.
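
A short sketch of the one-sample case, t = (sample mean - hypothesized mean) / (sample standard deviation / sqrt(n)), checked against scipy.stats.ttest_1samp; the sample values and hypothesized mean are made up.

```python
import numpy as np
from scipy import stats

sample = np.array([5.1, 4.9, 5.6, 5.2, 4.8, 5.4, 5.0, 5.3])
mu_0 = 5.0  # hypothesized population mean

n = sample.size
t_manual = (sample.mean() - mu_0) / (sample.std(ddof=1) / np.sqrt(n))

t_scipy, p_value = stats.ttest_1samp(sample, popmean=mu_0)
print(t_manual, t_scipy, p_value)  # the two t values agree
```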

Psychometrika: academic journal

Psychometrika is the official journal of the Psychometric Society, a professional body devoted to psychometrics and quantitative psychology. The journal covers quantitative methods for measurement and evaluation of human behavior, including statistical methods and other mathematical techniques. Past editors include Marion Richardson, Dorothy Adkins, Norman Cliff, and Willem J. Heiser. According to Journal Citation Reports, the journal had a 2019 impact factor of 1.959.

The item-total correlation test arises in psychometrics in contexts where a number of tests or questions are given to an individual and where the problem is to construct a useful single quantity for each individual that can be used to compare that individual with others in a given population. The test is used to see if any of the tests or questions ("items") do not have responses that vary in line with those for other tests across the population. The summary measure would be an average of some form, weighted where necessary, and the item-correlation test is used to decide whether or not responses to a given test should be included in the set being averaged. In some fields of application such a summary measure is called a scale.
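
A minimal sketch of a corrected item-total correlation: each item is correlated with the sum of the remaining items, so that the item itself does not inflate the correlation. The score matrix is hypothetical.

```python
import numpy as np

# Hypothetical (respondents x items) scores on a five-item test
scores = np.array([
    [4, 5, 4, 4, 2],
    [2, 3, 2, 3, 4],
    [5, 5, 4, 5, 3],
    [3, 3, 3, 2, 5],
    [4, 4, 5, 4, 1],
    [1, 2, 2, 1, 4],
], dtype=float)

for item in range(scores.shape[1]):
    rest_total = np.delete(scores, item, axis=1).sum(axis=1)  # total of the other items
    r = np.corrcoef(scores[:, item], rest_total)[0, 1]
    print(f"item {item + 1}: corrected item-total correlation = {r:.2f}")
```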

Willem Egbert (Wim) Saris is a Dutch sociologist and Emeritus Professor of Statistics and Methodology, especially known for his work on "Causal modelling in non-experimental research" and measurement errors.

Bruce Cooil is the Dean Samuel B. and Evelyn R. Richmond Professor of Management at Vanderbilt University's Owen Graduate School of Management. His main areas of research are statistical modelling and its application to reducing mortality and morbidity from coronary heart disease and to improving healthcare in impoverished regions such as Mozambique.

In statistical models applied to psychometrics, congeneric reliability is a single-administration test score reliability coefficient, commonly referred to as composite reliability, construct reliability, or coefficient omega. It is a structural equation model (SEM)-based reliability coefficient obtained from a unidimensional model. It is the second most commonly used reliability coefficient after tau-equivalent reliability (Cronbach's alpha), and is often recommended as its alternative.
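
Under a unidimensional congeneric model with the factor variance fixed at one, the coefficient is conventionally written in terms of the factor loadings and error variances (the notation below is the conventional one, not taken from the excerpt above):

```latex
\rho_C \;=\; \frac{\left(\sum_{i} \lambda_i\right)^{2}}{\left(\sum_{i} \lambda_i\right)^{2} + \sum_{i} \theta_i}
```

Here the lambda terms are the items' factor loadings and the theta terms are their error variances.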

Murray Aitkin is an Australian statistician who specialises in statistical models. He obtained his BSc, PhD, and DSc in mathematical statistics from the University of Sydney in 1961, 1966, and 1997, respectively.

<span class="mw-page-title-main">Homoscedasticity and heteroscedasticity</span> Statistical property

In statistics, a sequence of random variables is homoscedastic if all its random variables have the same finite variance. This is also known as homogeneity of variance. The complementary notion is called heteroscedasticity. The spellings homoskedasticity and heteroskedasticity are also frequently used.

References

  1. Upton, Graham J. G.; Cook, Ian (2008). A Dictionary of Statistics (2nd rev. ed.). Oxford: Oxford University Press. ISBN 978-0-19-954145-4. OCLC 191929569.
  2. Cooil, Bruce; Rust, Roland T. (1994-06-01). "Reliability and expected loss: A unifying principle". Psychometrika. 59 (2): 203–216. doi:10.1007/BF02295184. ISSN 1860-0980. S2CID 122165746.
  3. Winer, B. J. (1962). Statistical Principles in Experimental Design. doi:10.1037/11774-000. hdl:2027/mdp.39015002001249.
  4. Cooil, Bruce; Rust, Roland T. (1995-06-01). "General estimators for the reliability of qualitative data". Psychometrika. 60 (2): 199–220. doi:10.1007/BF02301413. ISSN 1860-0980. S2CID 121776134.
  5. Perreault, William D.; Leigh, Laurence E. (May 1989). "Reliability of Nominal Data Based on Qualitative Judgments". Journal of Marketing Research. 26 (2): 135–148. doi:10.1177/002224378902600201. ISSN 0022-2437. S2CID 144279197.
