Psychological statistics

Psychological statistics is the application of statistical theory and methods to psychology. Statistical methods for psychology include the development and application of statistical theory and models for psychological data. These methods include psychometrics, factor analysis, experimental design, and Bayesian statistics. This article also discusses journals and software packages for statistical applications in psychology. [1]

Psychometrics

Psychometrics deals with the measurement of psychological attributes. It involves developing and applying statistical models for mental measurement. [2] Measurement theories are divided into two major areas: (1) classical test theory and (2) item response theory. [3]

Classical test theory

Classical test theory, also called true score theory or reliability theory, is a set of statistical procedures useful for the development of psychological tests and scales. It is based on the fundamental equation X = T + E, where X is the obtained (total) score, T is the true score, and E is the error of measurement. For each participant, the theory assumes that a true score exists and that the obtained score X should be as close to it as possible. [2] [4] The closeness of X to T is expressed as the reliability of the obtained score: in classical test theory, reliability is the ratio of true-score variance to obtained-score variance, which equals the squared correlation between the true and obtained scores (a small simulation following the construction steps below illustrates this). A typical test construction procedure has the following steps:

(1) Determine the construct.
(2) Outline the behavioral domain of the construct.
(3) Write three to five times more items than the desired test length.
(4) Have the item content analyzed by experts and cull items.
(5) Obtain data on the initial version of the test.
(6) Item analysis (statistical procedure).
(7) Factor analysis (statistical procedure).
(8) After the second cull, prepare the final version.
(9) Use it for research.
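
As an illustration of the fundamental equation, the following minimal R sketch (the sample size and variances are arbitrary, made-up values) simulates true scores and errors and shows that the reliability, computed as the ratio of true-score variance to obtained-score variance, matches the squared correlation between obtained and true scores:

    # Minimal simulation of the classical test theory model X = T + E
    # (illustrative only; sample size and variances are arbitrary)
    set.seed(1)
    n    <- 1000
    true <- rnorm(n, mean = 50, sd = 10)  # true scores T
    err  <- rnorm(n, mean = 0,  sd = 5)   # errors E, independent of T
    obs  <- true + err                    # obtained scores X

    var(true) / var(obs)  # reliability: true-score variance / obtained variance
    cor(obs, true)^2      # squared correlation of X with T; approximately equal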

Reliability

Reliability is estimated in several ways:

(A) Inter-rater reliability: an estimate of agreement between independent raters, most useful for subjective responses. Cohen's kappa, Krippendorff's alpha, intra-class correlation coefficients, correlation coefficients, and Kendall's coefficient of concordance are useful statistical tools here.

(B) Test-retest reliability: an estimate of the temporal consistency of the test. The test is administered twice to the same sample with a time interval, and the correlation between the two sets of scores is used as the estimate of reliability; testing conditions are assumed to be identical.

(C) Internal consistency reliability: an estimate of the consistency of the items with each other. Split-half reliability (with the Spearman–Brown prophecy formula) and Cronbach's alpha are popular estimates. [5]

(D) Parallel form reliability: an estimate of the consistency between two different instruments of measurement, obtained as the inter-correlation between two parallel forms of a test or scale.
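
As a sketch of how some of these estimates are computed in practice, the following R snippet uses the 'psych' package mentioned later in this article; the item responses and retest scores are simulated stand-ins for real data:

    # Internal-consistency and test-retest sketches with the 'psych' package;
    # the data below are simulated placeholders for real item responses
    library(psych)                     # install.packages("psych") if needed

    set.seed(2)
    latent <- rnorm(200)
    items  <- as.data.frame(replicate(6, latent + rnorm(200)))  # 6 parallel items

    alpha(items)        # Cronbach's alpha with item-level diagnostics
    splitHalf(items)    # split-half estimates, incl. Spearman-Brown correction

    # Test-retest reliability: correlate scores from two administrations
    time1 <- rowSums(items)
    time2 <- time1 + rnorm(200, sd = 2)   # simulated retest with added noise
    cor(time1, time2)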

Validity

Validity of a scale or test is the ability of the instrument to measure what it purports to measure. [3] Construct validity, content validity, and criterion validity are the main types of validity. Construct validity is estimated by convergent and discriminant validity and by factor analysis; convergent and discriminant validity are ascertained from the correlations between similar and different constructs. Content validity is evaluated by subject matter experts. Criterion validity is the correlation between the test and a criterion variable (or variables) of the construct; regression analysis, multiple regression analysis, and logistic regression are used to estimate it. Software applications: the R 'psych' package is useful for classical test theory analysis. [6]
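
As a sketch of a criterion validity check, the following base-R snippet correlates hypothetical (simulated) test scores with a made-up criterion variable and fits the regression models mentioned above:

    # Criterion validity sketch on simulated data; 'job_perf' is a
    # hypothetical continuous criterion, 'passed' a dichotomous one
    set.seed(3)
    test_score <- rnorm(100, mean = 50, sd = 10)
    job_perf   <- 0.5 * test_score + rnorm(100, sd = 8)
    passed     <- rbinom(100, 1, plogis((test_score - 50) / 10))

    cor(test_score, job_perf)           # criterion validity coefficient
    summary(lm(job_perf ~ test_score))  # criterion validity via regression
    summary(glm(passed ~ test_score, family = binomial))  # logistic variant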

Modern test theory

Modern test theory is based on the latent trait model. Every item estimates the ability of the test taker; the ability parameter is called theta (θ) and the difficulty parameter is called b. The two important assumptions are local independence and unidimensionality. Item response theory has three models for dichotomous items: the one-parameter, two-parameter, and three-parameter logistic models. In addition, polytomous IRT models are also useful. [7]

The R package ‘ltm’ is useful for IRT analysis.
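
For instance, a minimal sketch of the three dichotomous models, fitted to the LSAT data bundled with ‘ltm’ (five binary items) using the model calls documented in that package:

    # IRT sketch with the 'ltm' package and its bundled LSAT data
    library(ltm)                 # install.packages("ltm") if needed

    fit1pl <- rasch(LSAT)        # one-parameter logistic (Rasch) model
    fit2pl <- ltm(LSAT ~ z1)     # two-parameter logistic model
    fit3pl <- tpm(LSAT)          # three-parameter logistic model

    coef(fit2pl)                 # item difficulty and discrimination estimates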

Factor analysis

Factor analysis is at the core of psychological statistics. It has two branches: (1) exploratory factor analysis and (2) confirmatory factor analysis.

Exploratory factor analysis (EFA)

Exploratory factor analysis begins without a theory, or with only a very tentative one. It is a dimension-reduction technique useful in psychometrics, multivariate data analysis, and data analytics. Typically, a k × k correlation or covariance matrix of variables is reduced to a k × r factor pattern matrix, where r < k. Principal component analysis and common factor analysis are the two ways of extracting factors. Principal axis factoring, maximum-likelihood factor analysis, alpha factor analysis, and image factor analysis are the most useful EFA methods. EFA employs various factor rotation methods, which can be classified as orthogonal (resulting in uncorrelated factors) or oblique (resulting in correlated factors).

The ‘psych’ package in R is useful for EFA.
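
A minimal EFA sketch with ‘psych’, using the Harman74.cor correlation matrix of 24 ability tests that ships with base R; the extraction method, rotation, and number of factors below are illustrative choices:

    # EFA sketch: principal axis factoring with an oblique rotation
    library(psych)

    efa <- fa(r = Harman74.cor$cov,   # 24 x 24 correlation matrix (base R)
              nfactors = 4,           # illustrative choice of r factors
              n.obs    = 145,         # sample size behind the matrix
              fm       = "pa",        # principal axis factoring
              rotate   = "oblimin")   # oblique rotation -> correlated factors
    print(efa$loadings, cutoff = 0.3) # k x r factor pattern matrix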

Confirmatory factor analysis (CFA)

Confirmatory factor analysis (CFA) is a factor analytic technique that begins with a theory and tests that theory by carrying out factor analysis. CFA is also called latent structure analysis: it treats factors as latent variables causing the actually observed variables. The basic equation of CFA is

X = Λξ + δ

where X is the vector of observed variables, Λ is the matrix of structural coefficients (loadings), ξ is the vector of latent variables (factors), and δ is the vector of errors. The parameters are usually estimated by maximum likelihood, although other estimation methods are also available. Because the chi-square test is very sensitive, various fit measures are used alongside it. [8] [9] The R packages ‘sem’ and ‘lavaan’ are useful for CFA.
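
A minimal CFA sketch in ‘lavaan’, using the HolzingerSwineford1939 data set that ships with the package; the three-factor model below is the package's own textbook example:

    # CFA sketch with 'lavaan'; '=~' defines factors as latent variables
    library(lavaan)              # install.packages("lavaan") if needed

    model <- '
      visual  =~ x1 + x2 + x3
      textual =~ x4 + x5 + x6
      speed   =~ x7 + x8 + x9
    '
    fit <- cfa(model, data = HolzingerSwineford1939)
    summary(fit, fit.measures = TRUE)   # chi-square plus CFI, TLI, RMSEA, etc.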

Experimental design

Experimental methods have been popular in psychology for more than 100 years, and experimental psychology is a sub-discipline of psychology. Statistical methods for designing and analyzing experimental psychological data include the t-test, ANOVA, ANCOVA, MANOVA, MANCOVA, the binomial test, and the chi-square test.
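
For example, a two-group experiment could be analyzed in base R as follows (the scores are simulated stand-ins for real experimental data):

    # Sketch: t-test and one-way ANOVA for a simulated two-group experiment
    set.seed(4)
    score <- c(rnorm(30, mean = 100, sd = 15),   # control group
               rnorm(30, mean = 108, sd = 15))   # treatment group
    group <- factor(rep(c("control", "treatment"), each = 30))

    t.test(score ~ group)          # independent-samples t-test
    summary(aov(score ~ group))    # one-way ANOVA (equivalent with two groups)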

Multivariate behavioral research

Multivariate behavioral research is becoming very popular in psychology. These methods include multiple regression and prediction; moderated and mediated regression analysis; logistic regression; canonical correlation; cluster analysis; multi-level modeling; survival-failure analysis; structural equation modeling; and hierarchical linear modeling. [10] [11] [9] [12] [13]
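
As a brief sketch of two of these methods in base R, multiple regression and logistic regression fitted to simulated data:

    # Multiple regression and logistic regression on simulated data
    set.seed(5)
    x1 <- rnorm(200); x2 <- rnorm(200)
    y  <- 1 + 0.6 * x1 - 0.4 * x2 + rnorm(200)   # continuous outcome
    yb <- rbinom(200, 1, plogis(0.8 * x1))       # binary outcome

    summary(lm(y ~ x1 + x2))                  # multiple regression
    summary(glm(yb ~ x1, family = binomial))  # logistic regression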

Journals for statistical applications for psychology

There are many specialized journals that publish advances in statistical analysis for psychology.

Software packages for psychological research

Various software packages are available for statistical methods in psychological research. They can be classified as commercial (e.g., JMP, SPSS, and SAS) and open-source (e.g., R). Among the open-source offerings, R is the most popular; there are many online references for it, and specialized books on R for psychologists are also being written. [14] The "psych" package of R is very useful for psychologists; "lavaan", "sem", "ltm", and "ggplot2" are among the other popular packages. PSPP and KNIME are other free packages. Among the commercial packages, JMP and SPSS are the most commonly reported in books.

Notes

  1. Wilcox, R. (2012). Modern Statistics for the Social and Behavioral Sciences: A Practical Introduction. Boca Raton, FL: CRC Press. ISBN 9781439834565.
  2. Lord, F. M., & Novick, M. R. (1968). Statistical Theories of Mental Test Scores. Reading, MA: Addison-Wesley.
  3. Nunnally, J., & Bernstein, I. (1994). Psychometric Theory. New York: McGraw-Hill.
  4. Raykov, T., & Marcoulides, G. A. (2010). Introduction to Psychometric Theory. New York: Routledge.
  5. Cronbach, L. J. (1951). Coefficient alpha and the internal structure of tests. Psychometrika, 16, 297–334. doi:10.1007/bf02310555
  6. Kline, T. J. B. (2005). Psychological Testing: A Practical Approach to Design and Evaluation. Thousand Oaks, CA: Sage Publications.
  7. Hambleton, R. K., & Swaminathan, H. (1985). Item Response Theory: Principles and Applications. Boston: Kluwer.
  8. Bollen, K. A. (1989). Structural Equations with Latent Variables. New York: John Wiley & Sons.
  9. Loehlin, J. E. (1992). Latent Variable Models: An Introduction to Factor, Path, and Structural Analysis (2nd ed.). Hillsdale, NJ: Lawrence Erlbaum.
  10. Hayes, A. F. (2013). Introduction to Mediation, Moderation, and Conditional Process Analysis. New York: The Guilford Press.
  11. Agresti, A. (1990). Categorical Data Analysis. New York: Wiley.
  12. Menard, S. (2001). Applied Logistic Regression Analysis (2nd ed.). Thousand Oaks, CA: Sage Publications.
  13. Tabachnick, B. G., & Fidell, L. S. (2007). Using Multivariate Statistics (5th ed.). Boston: Allyn and Bacon.
  14. Belhekar, V. M. (2016). Statistics for Psychology Using R. New Delhi: SAGE. ISBN 9789385985003.
