Internal consistency

In statistics and research, internal consistency is typically a measure based on the correlations between different items on the same test (or the same subscale on a larger test). It measures whether several items that propose to measure the same general construct produce similar scores. For example, if a respondent expressed agreement with the statements "I like to ride bicycles" and "I've enjoyed riding bicycles in the past", and disagreement with the statement "I hate bicycles", this would be indicative of good internal consistency of the test.

Cronbach's alpha

Internal consistency is usually measured with Cronbach's alpha, a statistic calculated from the pairwise correlations between items. Cronbach's alpha ranges between negative infinity and one; the coefficient will be negative whenever there is greater within-subject variability than between-subject variability.[1]
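In its common variance-based formulation, alpha is computed from the item variances and the variance of the total score: α = k/(k−1) · (1 − Σσ²ᵢ/σ²ₜ) for k items. A minimal sketch of this calculation (the function name and example data are illustrative, not from the source):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Cronbach's alpha for a respondents-by-items score matrix."""
    items = np.asarray(items, dtype=float)
    k = items.shape[1]                         # number of items
    item_vars = items.var(axis=0, ddof=1)      # variance of each item
    total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

# Three items whose scores rise and fall together across five
# respondents, mimicking the bicycle example above -> high alpha.
scores = np.array([
    [4, 5, 4],
    [2, 2, 1],
    [5, 5, 5],
    [3, 4, 3],
    [1, 2, 2],
])
print(round(cronbach_alpha(scores), 2))  # -> 0.97
```

Items that vary independently of one another would inflate Σσ²ᵢ relative to σ²ₜ and pull the coefficient down.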

A commonly accepted rule of thumb for describing internal consistency is as follows: [2]

Cronbach's alpha    Internal consistency
0.9 ≤ α             Excellent
0.8 ≤ α < 0.9       Good
0.7 ≤ α < 0.8       Acceptable
0.6 ≤ α < 0.7       Questionable
0.5 ≤ α < 0.6       Poor
α < 0.5             Unacceptable

Very high reliabilities (0.95 or higher) are not necessarily desirable, as they indicate that the items may be redundant.[3] The goal in designing a reliable instrument is for scores on similar items to be related (internally consistent), but for each item to contribute some unique information as well. Note further that Cronbach's alpha is necessarily higher for tests measuring narrower constructs and lower when more generic, broad constructs are measured. This phenomenon, along with a number of other considerations, argues against using fixed cut-off values for internal consistency measures.[4] Alpha is also a function of the number of items, so shorter scales will often have lower reliability estimates yet may still be preferable in many situations because they impose less burden on respondents.

An alternative way of thinking about internal consistency is that it is the extent to which all of the items of a test measure the same latent variable. The advantage of this perspective over the notion of a high average correlation among the items of a test – the perspective underlying Cronbach's alpha – is that the average item correlation is affected by skewness (in the distribution of item correlations) just as any other average is. Thus, whereas the modal item correlation is zero when the items of a test measure several unrelated latent variables, the average item correlation in such cases will be greater than zero. Consequently, while the ideal of measurement is for all items of a test to measure the same latent variable, alpha has been demonstrated many times to attain quite high values even when the set of items measures several unrelated latent variables.[5][6][7][8][9][10][11] The hierarchical "coefficient omega" may be a more appropriate index of the extent to which all of the items in a test measure the same latent variable.[12][13] Several different measures of internal consistency are reviewed by Revelle & Zinbarg (2009).[14][15]
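The point that alpha can be high even for a multidimensional test is easy to demonstrate numerically with the standardized form of alpha, α = k·r̄ / (1 + (k−1)·r̄), where r̄ is the mean off-diagonal item correlation. The sketch below (function name and correlation structure are illustrative assumptions, not from the source) builds a 10-item test measuring two completely unrelated latent variables, yet alpha still lands in the "acceptable" band of the table above:

```python
import numpy as np

def standardized_alpha(R: np.ndarray) -> float:
    """Standardized alpha from an item correlation matrix R."""
    k = R.shape[0]
    r_bar = (R.sum() - k) / (k * (k - 1))  # mean off-diagonal correlation
    return k * r_bar / (1 + (k - 1) * r_bar)

# Two blocks of 5 items: r = 0.5 within each block, r = 0 between
# blocks, i.e. two entirely unrelated latent variables.
block = np.full((5, 5), 0.5)
np.fill_diagonal(block, 1.0)
R = np.block([[block, np.zeros((5, 5))],
              [np.zeros((5, 5)), block]])
print(round(standardized_alpha(R), 2))  # -> 0.74 despite zero overlap
```

Here the modal item correlation is zero (25 of the 45 item pairs correlate at 0), yet the average is positive, so alpha looks respectable even though no single latent variable underlies the test.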

Related Research Articles

Psychological statistics is the application of formulas, theorems, numbers, and laws to psychology. Statistical methods for psychology include the development and application of statistical theory and methods for modeling psychological data. These methods include psychometrics, factor analysis, experimental design, and Bayesian statistics.

Psychometrics is a field of study within psychology concerned with the theory and technique of measurement. Psychometrics generally covers specialized fields within psychology and education devoted to testing, measurement, assessment, and related activities. Psychometrics is concerned with the objective measurement of latent constructs that cannot be directly observed. Examples of latent constructs include intelligence, introversion, mental disorders, and educational achievement. The levels of individuals on nonobservable latent variables are inferred through mathematical modeling based on what is observed from individuals' responses to items on tests and scales.

In statistics and psychometrics, reliability is the overall consistency of a measure. A measure is said to have a high reliability if it produces similar results under consistent conditions:

"It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are precise, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained. Various kinds of reliability coefficients, with values ranging between 0.00 and 1.00, are usually used to indicate the amount of error in the scores."

Cronbach's alpha, also known as tau-equivalent reliability or coefficient alpha, is a reliability coefficient and a measure of the internal consistency of tests and measures.

The Spearman–Brown prediction formula, also known as the Spearman–Brown prophecy formula, is a formula relating psychometric reliability to test length and used by psychometricians to predict the reliability of a test after changing the test length. The method was published independently by Spearman (1910) and Brown (1910).
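The Spearman–Brown formula is ρ* = n·ρ / (1 + (n−1)·ρ), where ρ is the current reliability and n is the factor by which the test length changes. A short sketch (function name is illustrative):

```python
def spearman_brown(rho: float, n: float) -> float:
    """Predicted reliability after changing test length by factor n."""
    return n * rho / (1 + (n - 1) * rho)

# Doubling a test whose reliability is 0.70:
print(round(spearman_brown(0.70, 2), 2))  # -> 0.82

# Halving the same test (n = 0.5):
print(round(spearman_brown(0.70, 0.5), 2))  # -> 0.54
```

This is the mechanism behind the earlier observation that shorter scales tend to have lower reliability estimates.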

Classical test theory (CTT) is a body of related psychometric theory that predicts outcomes of psychological testing such as the difficulty of items or the ability of test-takers. It is a theory of testing based on the idea that a person's observed or obtained score on a test is the sum of a true score (error-free score) and an error score. Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological tests.
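Under the CTT decomposition X = T + E, with true score and error uncorrelated, reliability is var(T) / var(X). A small simulation sketch (the chosen means, variances, and sample size are illustrative assumptions) recovers this ratio:

```python
import numpy as np

rng = np.random.default_rng(0)
true = rng.normal(50, 10, size=100_000)   # true scores, var = 100
error = rng.normal(0, 5, size=100_000)    # random error, var = 25
observed = true + error                   # X = T + E

# Reliability = var(T) / var(X) = 100 / (100 + 25) = 0.8
print(round(true.var() / observed.var(), 2))  # ~0.8
```

Because error variance simply adds to true-score variance, noisier measurement drives the ratio, and hence the reliability, toward zero.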

In psychometrics, the Kuder–Richardson formulas, first published in 1937, are a measure of internal consistency reliability for measures with dichotomous choices. They were developed by Kuder and Richardson.
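For dichotomous (0/1) items, KR-20 is (k/(k−1)) · (1 − Σpᵢqᵢ / σ²ₜ), where pᵢ is the proportion answering item i correctly, qᵢ = 1 − pᵢ, and σ²ₜ is the variance of total scores. A sketch using population variances throughout (conventions vary; the function name and data are illustrative):

```python
import numpy as np

def kr20(responses: np.ndarray) -> float:
    """Kuder-Richardson Formula 20 for a 0/1 respondents-by-items matrix."""
    responses = np.asarray(responses, dtype=float)
    k = responses.shape[1]
    p = responses.mean(axis=0)                  # proportion correct per item
    q = 1 - p
    total_var = responses.sum(axis=1).var()     # population variance of totals
    return (k / (k - 1)) * (1 - (p * q).sum() / total_var)

# Five respondents answering four right/wrong items:
responses = np.array([
    [1, 1, 1, 1],
    [1, 1, 0, 1],
    [1, 0, 0, 0],
    [0, 1, 0, 0],
    [0, 0, 0, 0],
])
print(round(kr20(responses), 2))  # -> 0.79
```

KR-20 is the special case of Cronbach's alpha for binary items, since p·q is exactly the population variance of a 0/1 item.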

Cyril J. Hoyt was an American educationalist and academic. A long-time member of the University of Minnesota Department of Educational Psychology, he was known as the author of Hoyt's coefficient of reliability.

Quantitative psychology is a field of scientific study that focuses on the mathematical modeling, research design and methodology, and statistical analysis of psychological processes. It includes tests and other devices for measuring cognitive abilities. Quantitative psychologists develop and analyze a wide variety of research methods, including those of psychometrics, a field concerned with the theory and technique of psychological measurement.

Lee Joseph Cronbach was an American educational psychologist who made contributions to psychological testing and measurement.

William Roger Revelle is a psychology professor at Northwestern University working in personality psychology. Revelle studies the biological basis of personality and motivation, psychometric theory, the structure of daily mood, and models of attention and memory.

In statistics, scale analysis is a set of methods to analyze survey data, in which responses to questions are combined to measure a latent variable. These items can be dichotomous or polytomous. Any measurement for such data is required to be reliable, valid, and homogeneous with comparable results over different studies.

In statistics, inter-rater reliability is the degree of agreement among independent observers who rate, code, or assess the same phenomenon.

Generalizability theory, or G theory, is a statistical framework for conceptualizing, investigating, and designing reliable observations. It is used to determine the reliability of measurements under specific conditions. It is particularly useful for assessing the reliability of performance assessments. It was originally introduced in Cronbach, L.J., Rajaratnam, N., & Gleser, G.C. (1963).

In statistics, confirmatory factor analysis (CFA) is a special form of factor analysis, most commonly used in social science research. It is used to test whether measures of a construct are consistent with a researcher's understanding of the nature of that construct. As such, the objective of confirmatory factor analysis is to test whether the data fit a hypothesized measurement model. This hypothesized model is based on theory and/or previous analytic research. CFA was first developed by Jöreskog (1969) and has built upon and replaced older methods of analyzing construct validity such as the MTMM Matrix as described in Campbell & Fiske (1959).

Peter Hans Schönemann was a German born psychometrician and statistical expert. He was professor emeritus in the Department of Psychological Sciences at Purdue University. His research interests included multivariate statistics, multidimensional scaling and measurement, quantitative behavior genetics, test theory and mathematical tools for social scientists. He published around 90 papers dealing mainly with the subjects of psychometrics and mathematical scaling. Schönemann's influences included Louis Guttman, Lee Cronbach, Oscar Kempthorne and Henry Kaiser.

Psychometric software is software that is used for psychometric analysis of data from tests, questionnaires, or inventories reflecting latent psychoeducational variables. While some psychometric analyses can be performed with standard statistical software like SPSS, most analyses require specialized tools.

Psychometrika is the official journal of the Psychometric Society, a professional body devoted to psychometrics and quantitative psychology. The journal covers quantitative methods for measurement and evaluation of human behavior, including statistical methods and other mathematical techniques. Past editors include Marion Richardson, Dorothy Adkins, Norman Cliff, and Willem J. Heiser. According to Journal Citation Reports, the journal had a 2019 impact factor of 1.959.

In multivariate statistics, exploratory factor analysis (EFA) is a statistical method used to uncover the underlying structure of a relatively large set of variables. EFA is a technique within factor analysis whose overarching goal is to identify the underlying relationships between measured variables. It is commonly used by researchers when developing a scale and serves to identify a set of latent constructs underlying a battery of measured variables. It should be used when the researcher has no a priori hypothesis about factors or patterns of measured variables. Measured variables are any one of several attributes of people that may be observed and measured. Examples of measured variables could be the physical height, weight, and pulse rate of a human being. Usually, researchers would have a large number of measured variables, which are assumed to be related to a smaller number of "unobserved" factors. Researchers must carefully consider the number of measured variables to include in the analysis. EFA procedures are more accurate when each factor is represented by multiple measured variables in the analysis.

In statistical models applied to psychometrics, congeneric reliability, commonly referred to as composite reliability, construct reliability, or coefficient omega, is a single-administration test score reliability coefficient. It is a structural equation model (SEM)-based reliability coefficient obtained from a unidimensional model. It is the second most commonly used reliability coefficient after tau-equivalent reliability (also known as Cronbach's alpha) and is often recommended as its alternative.

References

  1. Knapp, T. R. (1991). Coefficient alpha: Conceptualizations and anomalies. Research in Nursing & Health, 14, 457-480.
  2. George, D., & Mallery, P. (2003). SPSS for Windows step by step: A simple guide and reference. 11.0 update (4th ed.). Boston: Allyn & Bacon.
  3. Streiner, D. L. (2003) Starting at the beginning: an introduction to coefficient alpha and internal consistency, Journal of Personality Assessment, 80, 99-103
  4. Peters, G.-J. Y (2014) The alpha and the omega of scale reliability and validity: Why and how to abandon Cronbach’s alpha and the route towards more comprehensive assessment of scale quality. European Health Psychologist, 16 (2). URL: http://ehps.net/ehp/index.php/contents/article/download/ehp.v16.i2.p56/1
  5. Cortina, J. M. (1993). What is coefficient alpha? An examination of theory and applications. Journal of Applied Psychology, 78, 98-104.
  6. Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16(3), 297-334.
  7. Green, S. B., Lissitz, R. W., & Mulaik, S. A. (1977). Limitations of coefficient alpha as an index of test unidimensionality. Educational and Psychological Measurement, 37, 827-838.
  8. Revelle, W. (1979). Hierarchical cluster analysis and the internal structure of tests. Multivariate Behavioral Research, 14, 57-74.
  9. Schmitt, N. (1996). Uses and abuses of coefficient alpha. Psychological Assessment, 8, 350-353.
  10. Zinbarg, R., Yovel, I., Revelle, W. & McDonald, R. (2006). Estimating generalizability to a universe of indicators that all have an attribute in common: A comparison of estimators for ωh. Applied Psychological Measurement, 30, 121-144.
  11. Trippi, R. & Settle, R. (1976). A Nonparametric Coefficient of Internal Consistency. Multivariate Behavioral Research, 4, 419-424. URL: http://www.sigma-research.com/misc/Nonparametric%20Coefficient%20of%20Internal%20Consistency.htm
  12. McDonald, R. P. (1999). Test theory: A unified treatment. Psychology Press. ISBN   0-8058-3075-8
  13. Zinbarg, R., Revelle, W., Yovel, I. & Li, W. (2005). Cronbach's α, Revelle's β, and McDonald's ωH: Their relations with each other and two alternative conceptualizations of reliability. Psychometrika, 70, 123-133.
  14. Revelle, W., Zinbarg, R. (2009). "Coefficients Alpha, Beta, Omega, and the glb: Comments on Sijtsma". Psychometrika, 74(1), 145-154.
  15. Dunn, T. J., Baguley, T. and Brunsden, V. (2013), From alpha to omega: A practical solution to the pervasive problem of internal consistency estimation. British Journal of Psychology. doi: 10.1111/bjop.12046