Standard score


Comparison of various grading methods in a normal distribution, including standard deviations, cumulative percentages, percentile equivalents, z-scores, and T-scores.

In statistics, the standard score is the number of standard deviations by which the value of a raw score (i.e., an observed value or data point) is above or below the mean value of what is being observed or measured. Raw scores above the mean have positive standard scores, while those below the mean have negative standard scores.


It is calculated by subtracting the population mean from an individual raw score and then dividing the difference by the population standard deviation. This process of converting a raw score into a standard score is called standardizing or normalizing (however, "normalizing" can refer to many types of ratios; see normalization for more).

Standard scores are most commonly called z-scores; the two terms may be used interchangeably, as they are in this article. Other terms include z-values, normal scores, standardized variables and pull in High Energy Physics. [1]

Computing a z-score requires knowing the mean and standard deviation of the complete population to which a data point belongs; if one only has a sample of observations from the population, then the analogous computation with sample mean and sample standard deviation yields the t-statistic.

Calculation

If the population mean and population standard deviation are known, a raw score x is converted into a standard score by[2]

    z = (x − μ) / σ

where:

μ is the mean of the population,
σ is the standard deviation of the population.

The absolute value of z represents the distance between that raw score x and the population mean in units of the standard deviation. z is negative when the raw score is below the mean, positive when above.

Calculating z using this formula requires the population mean and the population standard deviation, not the sample mean or sample deviation. But knowing the true mean and standard deviation of a population is often unrealistic except in cases such as standardized testing, where the entire population is measured.
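
For illustration (not from the source), here is a minimal Python sketch of this calculation; the numbers are hypothetical:

```python
def z_score(x, mu, sigma):
    """Standard score of a raw value x, given the population mean mu
    and population standard deviation sigma."""
    return (x - mu) / sigma

# Example: a raw score of 1800 when the population mean is 1500 and sigma is 300
print(z_score(1800, 1500, 300))  # 1.0
```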

When the population mean and the population standard deviation are unknown, the standard score may be calculated using the sample mean and sample standard deviation as estimates of the population values. [3] [4] [5] [6]

In these cases, the z-score is

    z = (x − x̄) / S

where:

x̄ is the mean of the sample,
S is the standard deviation of the sample.

In either case, the numerator and denominator are expressed in the same units of measure, so the units cancel through division and z is left as a dimensionless quantity.
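
A minimal sketch of the sample-based version, assuming only a list of observed values is available; NumPy's std with ddof=1 gives the sample standard deviation:

```python
import numpy as np

def standardize_sample(data):
    """Approximate z-scores computed from a sample: uses the sample mean and
    the (n - 1) sample standard deviation as estimates of mu and sigma."""
    data = np.asarray(data, dtype=float)
    return (data - data.mean()) / data.std(ddof=1)

print(standardize_sample([2.1, 2.5, 3.0, 3.4, 4.0]))
```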

Applications

Z-test

The z-score is often used in the z-test in standardized testing – the analog of the Student's t-test for a population whose parameters are known, rather than estimated. As it is very unusual to know the entire population, the t-test is much more widely used.
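
As a rough illustration of the idea (not a prescribed procedure from the source), a two-sided one-sample z-test can be sketched as follows; the numbers are hypothetical and the population standard deviation is assumed known:

```python
from math import sqrt
from scipy.stats import norm

def one_sample_z_test(x_bar, mu0, sigma, n):
    """Two-sided one-sample z-test; valid only when the population
    standard deviation sigma is known."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    p_value = 2 * norm.sf(abs(z))  # two-sided p-value from the standard normal
    return z, p_value

# Hypothetical numbers: sample mean 103 from n = 25 observations,
# testing the null hypothesis mu0 = 100 with known sigma = 15
print(one_sample_z_test(103, 100, 15, 25))  # z = 1.0, p ≈ 0.317
```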

Prediction intervals

The standard score can be used in the calculation of prediction intervals. A prediction interval [L, U], with lower endpoint L and upper endpoint U, is an interval such that a future observation X will lie in it with high probability γ, i.e.

    P(L < X < U) = γ.

For the standard score Z of X this gives:[7]

    P((L − μ)/σ < Z < (U − μ)/σ) = γ.

By determining the quantile z such that

    P(−z < Z < z) = γ,

it follows that

    L = μ − zσ,  U = μ + zσ.

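A minimal sketch of this calculation, assuming a normal population with known parameters (the numbers are hypothetical):

```python
from scipy.stats import norm

def normal_prediction_interval(mu, sigma, gamma=0.95):
    """Prediction interval [L, U] for a single future observation drawn from a
    normal population with known mean mu and standard deviation sigma."""
    z = norm.ppf((1 + gamma) / 2)  # quantile with P(-z < Z < z) = gamma
    return mu - z * sigma, mu + z * sigma

print(normal_prediction_interval(1500, 300, 0.95))  # roughly (912, 2088)
```
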
Process control

In process control applications, the Z value provides an assessment of how off-target a process is operating.

Comparison of scores measured on different scales: ACT and SAT

The z-score for Student A was 1, meaning Student A was 1 standard deviation above the mean; thus, Student A performed at the 84.13th percentile on the SAT.

When scores are measured on different scales, they may be converted to z-scores to aid comparison. Diez et al.[8] give the following example comparing student scores on the (old) SAT and ACT high school tests. The table shows the mean and standard deviation for total score on the SAT and ACT. Suppose that student A scored 1800 on the SAT, and student B scored 24 on the ACT. Which student performed better relative to other test-takers?

                      SAT     ACT
Mean                  1500    21
Standard deviation    300     5
The z-score for Student B was 0.6, meaning Student B was 0.6 standard deviations above the mean; thus, Student B performed at the 72.57th percentile on the ACT.

The z-score for student A is

    z = (x − μ) / σ = (1800 − 1500) / 300 = 1

The z-score for student B is

    z = (x − μ) / σ = (24 − 21) / 5 = 0.6

Because student A has a higher z-score than student B, student A performed better compared to other test-takers than did student B.
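
A short sketch reproducing this comparison, using the means and standard deviations from the table above:

```python
def z_score(x, mu, sigma):
    return (x - mu) / sigma

z_a = z_score(1800, 1500, 300)  # SAT: mean 1500, standard deviation 300
z_b = z_score(24, 21, 5)        # ACT: mean 21, standard deviation 5
print(z_a, z_b)  # 1.0 and 0.6: student A did better relative to other test-takers
```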

Percentage of observations below a z-score

Continuing the example of ACT and SAT scores, if it can be further assumed that both ACT and SAT scores are normally distributed (which is approximately correct), then the z-scores may be used to calculate the percentage of test-takers who received lower scores than students A and B.
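
Under the normality assumption, the standard normal cumulative distribution function converts each z-score into the fraction of test-takers scoring below it; a minimal sketch using SciPy:

```python
from scipy.stats import norm

# Percentage of test-takers scoring below each student, assuming normal scores
print(norm.cdf(1.0) * 100)  # about 84.13 (student A)
print(norm.cdf(0.6) * 100)  # about 72.57 (student B)
```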

Cluster analysis and multidimensional scaling

"For some multivariate techniques such as multidimensional scaling and cluster analysis, the concept of distance between the units in the data is often of considerable interest and importance … When the variables in a multivariate data set are on different scales, it makes more sense to calculate the distances after some form of standardization." [9]

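A minimal sketch of this standardize-then-measure-distances idea, using made-up data on two very different scales:

```python
import numpy as np
from scipy.spatial.distance import pdist, squareform

# Made-up data: rows are units, columns are variables on very different scales
X = np.array([[170.0, 65000.0],
              [180.0, 48000.0],
              [165.0, 72000.0]])

# Standardize each column, then compute pairwise Euclidean distances
Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
print(squareform(pdist(Z)))
```
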
Principal components analysis

In principal components analysis, "Variables measured on different scales or on a common scale with widely differing ranges are often standardized." [10]
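
A minimal sketch of this practice using scikit-learn on synthetic data; standardizing the columns first means the PCA effectively operates on the correlation matrix rather than the covariance matrix:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 3)) * [1.0, 100.0, 0.01]  # columns on very different scales

X_std = StandardScaler().fit_transform(X)  # standardize each column first
pca = PCA(n_components=2).fit(X_std)
print(pca.explained_variance_ratio_)
```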

Relative importance of variables in multiple regression: Standardized regression coefficients

Standardization of variables prior to multiple regression analysis is sometimes used as an aid to interpretation. Afifi, May, and Clark[11] (p. 95) state the following.

"The standardized regression slope is the slope in the regression equation if X and Y are standardized… Standardization of X and Y is done by subtracting the respective means from each set of observations and dividing by the respective standard deviations… In multiple regression, where several X variables are used, the standardized regression coefficients quantify the relative contribution of each X variable."

However, Kutner et al. [12] (p 278) give the following caveat: "… one must be cautious about interpreting any regression coefficients, whether standardized or not. The reason is that when the predictor variables are correlated among themselves, … the regression coefficients are affected by the other predictor variables in the model … The magnitudes of the standardized regression coefficients are affected not only by the presence of correlations among the predictor variables but also by the spacings of the observations on each of these variables. Sometimes these spacings may be quite arbitrary. Hence, it is ordinarily not wise to interpret the magnitudes of standardized regression coefficients as reflecting the comparative importance of the predictor variables."
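
As a concrete, synthetic-data sketch of the standardization described in the quotation above (not the cited authors' own code), standardized coefficients can be obtained by standardizing X and y and fitting ordinary least squares:

```python
import numpy as np

# Synthetic data: two predictors on different scales and a response
rng = np.random.default_rng(1)
X = np.column_stack([rng.normal(0, 1, 200), rng.normal(0, 50, 200)])
y = 2.0 * X[:, 0] + 0.01 * X[:, 1] + rng.normal(0, 1, 200)

def standardize(a):
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)

Xs, ys = standardize(X), standardize(y)
# Least squares on the standardized variables; no intercept is needed
# because every variable has been centered
beta_std, *_ = np.linalg.lstsq(Xs, ys, rcond=None)
print(beta_std)
```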

Standardizing in mathematical statistics

In mathematical statistics, a random variable X is standardized by subtracting its expected value E[X] and dividing the difference by its standard deviation σ(X) = √Var(X):

    Z = (X − E[X]) / σ(X)

If the random variable under consideration is the sample mean of a random sample X₁, …, Xₙ of X,

    X̄ = (1/n)(X₁ + … + Xₙ),

then the standardized version is

    Z = (X̄ − E[X]) / (σ(X)/√n).

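A minimal numerical sketch of standardizing a sample mean, with hypothetical parameters:

```python
import numpy as np

def standardized_sample_mean(sample, mu, sigma):
    """Standardize the sample mean of a random sample drawn from a population
    with known mean mu and standard deviation sigma."""
    sample = np.asarray(sample, dtype=float)
    return (sample.mean() - mu) / (sigma / np.sqrt(len(sample)))

# Hypothetical: 50 draws from a population with mu = 10, sigma = 2
rng = np.random.default_rng(0)
print(standardized_sample_mean(rng.normal(10, 2, 50), mu=10, sigma=2))
```
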
T-score

In educational assessment, T-score is a standard score Z shifted and scaled to have a mean of 50 and a standard deviation of 10. [13] [14] [15]
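
The rescaling itself is simple; a minimal sketch:

```python
def t_score(z):
    """Convert a standard score to the educational-assessment T-score scale
    (mean 50, standard deviation 10)."""
    return 50 + 10 * z

print(t_score(1.0))   # 60.0
print(t_score(-0.5))  # 45.0
```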

In bone density measurements, the T-score is the standard score of the measurement compared to the population of healthy 30-year-old adults. [16]


References

  1. https://e-publishing.cern.ch/index.php/CYRSP/article/download/303/405/2022
  2. E. Kreyszig (1979). Advanced Engineering Mathematics (Fourth ed.). Wiley. p. 880, eq. 5. ISBN   0-471-02140-7.
  3. Spiegel, Murray R.; Stephens, Larry J (2008), Schaum's Outlines Statistics (Fourth ed.), McGraw Hill, ISBN   978-0-07-148584-5
  4. Mendenhall, William; Sincich, Terry (2007), Statistics for Engineering and the Sciences (Fifth ed.), Pearson / Prentice Hall, ISBN   978-0131877061
  5. Glantz, Stanton A.; Slinker, Bryan K.; Neilands, Torsten B. (2016), Primer of Applied Regression & Analysis of Variance (Third ed.), McGraw Hill, ISBN   978-0071824118
  6. Aho, Ken A. (2014), Foundational and Applied Statistics for Biologists (First ed.), Chapman & Hall / CRC Press, ISBN   978-1439873380
  7. E. Kreyszig (1979). Advanced Engineering Mathematics (Fourth ed.). Wiley. p. 880, eq. 6. ISBN   0-471-02140-7.
  8. Diez, David; Barr, Christopher; Çetinkaya-Rundel, Mine (2012), OpenIntro Statistics (Second ed.), openintro.org
  9. Everitt, Brian; Hothorn, Torsten J (2011), An Introduction to Applied Multivariate Analysis with R, Springer, ISBN   978-1441996497
  10. Johnson, Richard; Wichern, Dean (2007), Applied Multivariate Statistical Analysis, Pearson / Prentice Hall
  11. Afifi, Abdelmonem; May, Susanne K.; Clark, Virginia A. (2012), Practical Multivariate Analysis (Fifth ed.), Chapman & Hall/CRC, ISBN   978-1439816806
  12. Kutner, Michael; Nachtsheim, Christopher; Neter, John (2004), Applied Linear Regression Models (Fourth ed.), McGraw Hill, ISBN 978-0073014661
  13. John Salvia; James Ysseldyke; Sara Witmer (29 January 2009). Assessment: In Special and Inclusive Education. Cengage Learning. pp. 43–. ISBN   0-547-13437-1.
  14. Edward S. Neukrug; R. Charles Fawcett (1 January 2014). Essentials of Testing and Assessment: A Practical Guide for Counselors, Social Workers, and Psychologists. Cengage Learning. pp. 133–. ISBN   978-1-305-16183-2.
  15. Randy W. Kamphaus (16 August 2005). Clinical Assessment of Child and Adolescent Intelligence. Springer. pp. 123–. ISBN   978-0-387-26299-4.
  16. "Bone Mass Measurement: What the Numbers Mean". NIH Osteoporosis and Related Bone Diseases National Resource Center. National Institute of Health. Retrieved 5 August 2017.
