Measuring the Mind

Author: Denny Borsboom
Publisher: Cambridge University Press
Publication date: 2005
Media type: Hardcover
Pages: 183
ISBN: 978-0-521-84463-5
OCLC: 254153121
Dewey Decimal: 150.15195 22
LC Class: BF39 .B693 2005

Measuring the Mind: Conceptual Issues in Contemporary Psychometrics [1] is a book by Dutch academic Denny Borsboom, who was Assistant Professor of Psychological Methods at the University of Amsterdam at the time of publication. [2] The book discusses the extent to which psychology can measure mental attributes such as intelligence and examines the philosophical issues that arise from such attempts.

Contents

The book examines three major models within psychometrics: classical test theory (true scores), latent variable theory (item response theory), and representational measurement theory. Each model is examined from three perspectives, or "stances".

The book also examines the relationship between the three models and finally ends with a discussion on the concept of validity.

Book Structure

The book has six chapters, including an introduction. One chapter is devoted to each of the three models.

1. Introduction

Borsboom discusses the importance of psychological testing and hence the importance of measurement models in psychometrics. He describes such models as "local philosophies of science" and goes on to discuss several major "global" philosophies of science: logical positivism, instrumentalism, and social constructivism, all of which he describes as "anti-realist" in contrast with realist views of science. [3]

2. True scores

This chapter discusses classical test theory's central concept of the true score. It covers the history and fundamental axioms of classical test theory and goes on to discuss the philosophical implications of true scores. Borsboom describes the strengths and limitations of true scores in this way:

Classical test theory was either one of the best ideas in twentieth-century psychology, or one of the worst mistakes. The theory is mathematically elegant and conceptually simple, and in terms of its acceptance by psychologists, it is a psychometric success story. However, as is typical of popular statistical procedures, classical test theory is prone to misinterpretation. One reason for this is the terminology used: if a competition for the misnomer of the century existed, the term 'true score' would be a serious contestant. The infelicitous use of the adjective 'true' invites the mistaken idea that the true score on a test must somehow be identical to the 'real', 'valid', or 'construct' score. This chapter has hopefully proved the inadequacy of this view beyond reasonable doubt. [4]
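The true-score decomposition at the heart of classical test theory can be sketched numerically. The following is a minimal simulation, with illustrative numbers that are assumptions rather than anything from the book: an observed score is a fixed true score plus random error, so the mean over many hypothetical repeated administrations approaches the true score.

```python
import random

# Classical test theory decomposition: X = T + E
# (observed score = true score + random error).
# The numbers below are illustrative assumptions.

random.seed(0)

TRUE_SCORE = 100.0   # hypothetical fixed true score for one person
ERROR_SD = 15.0      # hypothetical standard deviation of measurement error

def administer_test():
    """One test administration: true score plus random error."""
    return TRUE_SCORE + random.gauss(0.0, ERROR_SD)

# The expected value of the observed score is the true score, so the
# mean over many repeated administrations approaches it.
observed = [administer_test() for _ in range(100_000)]
mean_observed = sum(observed) / len(observed)
print(round(mean_observed, 1))
```

Note that nothing in this sketch makes the true score a "valid" or "construct" score; it is simply the expectation of the observed score, which is the distinction the chapter presses.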

3. Latent Variables

This chapter discusses the theory behind latent variables in psychometrics, particularly with regard to item response theory. Borsboom examines issues of causality: the extent to which latent variables can be regarded as "causes" of between-subject differences, and whether they can also be treated as causal factors within a subject.

4. Scales

This chapter discusses measurement scales as the central concept of representational measurement theory. The chapter looks at the history behind psychological measurement scales and also at attempts to formalise measurement properties, such as additive conjoint measurement. Borsboom also discusses what he terms "the problem of error": the inability of such theories to handle the error that is intrinsic to psychological measurement.

If the ability to construct a homomorphic representation were to be a necessary condition for measurement, this entails that we should be able to gather data that fit the measurement model perfectly. This is because, strictly speaking, models like the conjoint model are refuted by a single violation of the axioms...Since we can safely assume that we will not succeed in error-free data – certainly not in psychology – we must choose two conclusions: either measurement is impossible, or it is not necessary to create a perfect homomorphic representation. If we accept the former, we may just as well stop the discussion right now. If we accept the latter, then we have to invent a way to deal with error. [5]
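The point that a single violation refutes the model can be illustrated with a small sketch. Here a perfectly additive two-factor table satisfies a row-order (independence) condition, while the same table with one perturbed cell, standing in for a single measurement error, fails it. The table and the perturbation are illustrative assumptions, not data from the book.

```python
import copy

# A perfectly additive 4x4 table: cell value = row effect + column effect.
additive = [[r + c for c in range(4)] for r in range(4)]

def order_violations(table):
    """Count cells where the strict row order breaks within a column."""
    violations = 0
    for c in range(len(table[0])):
        for r in range(len(table) - 1):
            if table[r][c] >= table[r + 1][c]:
                violations += 1
    return violations

# One perturbed cell -- a single "measurement error" -- is enough to
# violate the ordering condition and, strictly speaking, refute the model.
noisy = copy.deepcopy(additive)
noisy[2][1] -= 1.5

print(order_violations(additive), order_violations(noisy))
```

The additive table produces zero violations; the perturbed one produces one, which under a strict reading of the axioms is already fatal. This is Borsboom's dilemma in miniature: either no psychological data ever constitute measurement, or the representational requirement must be relaxed to accommodate error.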

Reviews

"The six chapters of the book reflect an impressive interplay between philosophy of science, measurement, and mathematics. Consequently, readers who enjoy probing the why behind how we think about true scores, latent variables, scales, relations between models, and ultimately validity will, I think, relish the contents of the book."
"Overall, this is a well-written and well-argued book and theoretically minded psychometricians will find it of interest. While reading the book, I often found myself arguing with the author and, at the end I came away with more questions than answers. For me, these are the hallmarks of a good book."
"Psychometrics is an important sub-discipline. It not only sustains a significant psycho-technology, it also leads social science on its Pythagorean quest. It is therefore strange that, unlike behaviourism or psychoanalysis, it has eluded critical, conceptual scrutiny. Perhaps its foundations seemed secure. This book scuttles that illusion and deftly exposes its soft underbelly."

Related Research Articles

Psychological statistics

Psychological statistics is the application of formulas, theorems, numbers and laws to psychology. Statistical methods for psychology include the development and application of statistical theory and methods for modeling psychological data. These methods include psychometrics, factor analysis, experimental design, and multivariate behavioral research.

Psychometrics

Psychometrics is a field of study within psychology concerned with the theory and technique of measurement. Psychometrics generally refers to specialized fields within psychology and education devoted to testing, measurement, assessment, and related activities. Psychometrics is concerned with the objective measurement of latent constructs that cannot be directly observed. Examples of latent constructs include intelligence, introversion, mental disorders, and educational achievement. The levels of individuals on nonobservable latent variables are inferred through mathematical modeling based on what is observed from individuals' responses to items on tests and scales.

In statistics and psychometrics, reliability is the overall consistency of a measure. A measure is said to have high reliability if it produces similar results under consistent conditions:

"It is the characteristic of a set of test scores that relates to the amount of random error from the measurement process that might be embedded in the scores. Scores that are highly reliable are precise, reproducible, and consistent from one testing occasion to another. That is, if the testing process were repeated with a group of test takers, essentially the same results would be obtained. Various kinds of reliability coefficients, with values ranging between 0.00 and 1.00, are usually used to indicate the amount of error in the scores."

Classical test theory (CTT) is a body of related psychometric theory that predicts outcomes of psychological testing such as the difficulty of items or the ability of test-takers. It is a theory of testing based on the idea that a person's observed or obtained score on a test is the sum of a true score and an error score. Generally speaking, the aim of classical test theory is to understand and improve the reliability of psychological tests.

In psychometrics, item response theory (IRT) is a paradigm for the design, analysis, and scoring of tests, questionnaires, and similar instruments measuring abilities, attitudes, or other variables. It is a theory of testing based on the relationship between individuals' performances on a test item and the test takers' levels of performance on an overall measure of the ability that item was designed to measure. Several different statistical models are used to represent both item and test taker characteristics. Unlike simpler alternatives for creating scales and evaluating questionnaire responses, it does not assume that each item is equally difficult. This distinguishes IRT from, for instance, Likert scaling, in which "All items are assumed to be replications of each other or in other words items are considered to be parallel instruments" (p. 197). By contrast, item response theory treats the difficulty of each item as information to be incorporated in scaling items.
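The item-level treatment of difficulty described above can be sketched with a one-parameter logistic (Rasch-type) item response function, in which the probability of a correct response depends on the gap between a person's ability and the item's difficulty. The specific ability and difficulty values are illustrative assumptions.

```python
import math

def p_correct(ability: float, difficulty: float) -> float:
    """Probability of a correct response under a 1PL (Rasch-type) model."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

# Unlike Likert-style scaling, items need not be equally difficult:
easy_item, hard_item = -1.0, 2.0
ability = 0.5

print(round(p_correct(ability, easy_item), 2))   # easy item: high probability
print(round(p_correct(ability, hard_item), 2))   # hard item: lower probability
```

When ability equals difficulty the probability is exactly 0.5, which is how each item's difficulty becomes information to be incorporated in scaling rather than an assumption of parallel items.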

Operationalization

In research design, especially in psychology, social sciences, life sciences and physics, operationalization or operationalisation is a process of defining the measurement of a phenomenon that is not directly measurable, though its existence is inferred by other phenomena. Operationalization thus defines a fuzzy concept so as to make it clearly distinguishable, measurable, and understandable by empirical observation. In a broader sense, it defines the extension of a concept—describing what is and is not an instance of that concept. For example, in medicine, the phenomenon of health might be operationalized by one or more indicators like body mass index or tobacco smoking. As another example, in visual processing the presence of a certain object in the environment could be inferred by measuring specific features of the light it reflects. In these examples, the phenomena are difficult to directly observe and measure because they are general/abstract or they are latent. Operationalization helps infer the existence, and some elements of the extension, of the phenomena of interest by means of some observable and measurable effects they have.

Construct validity is the accumulation of evidence to support the interpretation of what a measure reflects. Modern validity theory defines construct validity as the overarching concern of validity research, subsuming all other types of validity evidence such as content validity and criterion validity.

Level of measurement or scale of measure is a classification that describes the nature of information within the values assigned to variables. Psychologist Stanley Smith Stevens developed the best-known classification with four levels, or scales, of measurement: nominal, ordinal, interval, and ratio. This framework of distinguishing levels of measurement originated in psychology and is widely criticized by scholars in other disciplines. Other classifications include those by Mosteller and Tukey, and by Chrisman.

Structural equation modeling

Structural equation modeling (SEM) is a label for a diverse set of methods used by scientists in both experimental and observational research across the sciences, business, and other fields. It is used most in the social and behavioral sciences. A definition of SEM is difficult without reference to highly technical language, but a good starting place is the name itself.

Mathematical psychology

Mathematical psychology is an approach to psychological research that is based on mathematical modeling of perceptual, thought, cognitive and motor processes, and on the establishment of law-like rules that relate quantifiable stimulus characteristics with quantifiable behavior. The mathematical approach is used with the goal of deriving hypotheses that are more exact and thus yield stricter empirical validations. There are five major research areas in mathematical psychology: learning and memory, perception and psychophysics, choice and decision-making, language and thinking, and measurement and scaling.

The Rasch model, named after Georg Rasch, is a psychometric model for analyzing categorical data, such as answers to questions on a reading assessment or questionnaire responses, as a function of the trade-off between (a) the respondent's abilities, attitudes, or personality traits and (b) the item difficulty. For example, it may be used to estimate a student's reading ability or the extremity of a person's attitude to capital punishment from responses on a questionnaire. In addition to psychometrics and educational research, the Rasch model and its extensions are used in other areas, including the health professions, agriculture, and market research, because of their general applicability.

In statistics, latent variables are variables that are not directly observed but are rather inferred from other variables that are observed. Mathematical models that aim to explain observed variables in terms of latent variables are called latent variable models. Latent variable models are used in many disciplines, including psychology, demography, economics, engineering, medicine, physics, machine learning/artificial intelligence, bioinformatics, chemometrics, natural language processing, econometrics, management and the social sciences.

Lee Joseph Cronbach was an American educational psychologist who made contributions to psychological testing and measurement. At the University of Illinois, Urbana, Cronbach produced many of his works: the "Alpha" paper, as well as an essay titled The Two Disciplines of Scientific Psychology, in the American Psychologist magazine in 1957, where he discussed his thoughts on the increasing divergence between the fields of experimental psychology and correlational psychology.

Peter Hans Schönemann was a German-born psychometrician and statistician. He was professor emeritus in the Department of Psychological Sciences at Purdue University. His research interests included multivariate statistics, multidimensional scaling and measurement, quantitative behavior genetics, test theory and mathematical tools for social scientists. He published around 90 papers dealing mainly with the subjects of psychometrics and mathematical scaling. Schönemann's influences included Louis Guttman, Lee Cronbach, Oscar Kempthorne and Henry Kaiser.

Psychometric software is software that is used for psychometric analysis of data from tests, questionnaires, or inventories reflecting latent psychoeducational variables. While some psychometric analyses can be performed with standard statistical software like SPSS, most analyses require specialized tools.

The relationship between nations and IQ is a controversial area of study concerning differences between nations in average intelligence test scores, their possible causes, and their correlation with measures of social well-being and economic prosperity.

Gideon J. Mellenbergh

Gideon Jan (Don) Mellenbergh was a Dutch psychologist who was Professor of Psychological Methods at the University of Amsterdam, known for his contributions to psychometrics and social research methodology.

The Mokken scale is a psychometric method of data reduction. A Mokken scale is a unidimensional scale that consists of hierarchically-ordered items that measure the same underlying, latent concept. This method is named after the political scientist Rob Mokken who suggested it in 1971.

Bruno D. Zumbo is an applied mathematician working at the intersection of the mathematical sciences and the behavioral, social and health sciences. He is currently Professor and Distinguished University Scholar, the Tier 1 Canada Research Chair in Psychometrics and Measurement, and the Paragon UBC Professor of Psychometrics & Measurement at University of British Columbia.

Mark Daniel Reckase is an educational psychologist and expert on quantitative methods and measurement who is known for his work on computerized adaptive testing, multidimensional item response theory, and standard setting in educational and psychological tests. Reckase is University Distinguished Professor Emeritus in the College of Education at Michigan State University.

References

  1. Borsboom, Denny (2005), Measuring the Mind: Conceptual Issues in Contemporary Psychometrics, Cambridge, UK: Cambridge University Press, ISBN 978-0-521-84463-5, retrieved 10 August 2010
  2. Borsboom (2005), p. i
  3. Borsboom (2005), pp. 7–8
  4. Borsboom (2005), pp. 44–45
  5. Borsboom (2005), p. 106