Thurstone scale

In psychology and sociology, the Thurstone scale was the first formal technique to measure an attitude. It was developed by Louis Leon Thurstone in 1928, originally as a means of measuring attitudes towards religion. Today it is used to measure attitudes towards a wide variety of issues. The technique uses a number of statements about a particular issue, and each statement is given a numerical value indicating how favorable or unfavorable it is judged to be. These numerical values are prepared ahead of time by the researcher and not shown to the test subjects. The subjects then check each of the statements with which they agree, and a mean score of those statements' values is computed, indicating their attitude.
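The scoring step described above can be sketched in a few lines of Python. The statement identifiers, the scale values, and the function name `attitude_score` are hypothetical and chosen only for illustration; a real instrument would use values derived from a panel of judges.

```python
from statistics import mean

# Hypothetical scale values prepared in advance by the researcher.
# Higher values are assumed here to mean "more favorable"; respondents
# never see these numbers.
SCALE_VALUES = {
    "s1": 1.5,
    "s2": 4.0,
    "s3": 6.5,
    "s4": 9.0,
    "s5": 10.5,
}


def attitude_score(endorsed, scale_values=SCALE_VALUES):
    """Return the mean scale value of the statements a respondent checked."""
    if not endorsed:
        raise ValueError("respondent endorsed no statements")
    return mean(scale_values[s] for s in endorsed)


# A respondent who agrees with statements s3, s4 and s5.
score = attitude_score(["s3", "s4", "s5"])
print(round(score, 2))
```

A respondent who checks mostly high-valued statements therefore receives a high (favorable) score, and one who checks mostly low-valued statements receives a low score.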

Thurstone scale

Thurstone's method of pair comparisons can be considered a prototype of a normal distribution-based method for scaling dominance matrices. Even though the theory behind this method is quite complex (Thurstone, 1927a), the algorithm itself is straightforward. For the basic Case V, the frequency dominance matrix is converted into proportions, and the proportions are converted into standard scores (z values). The scale is then obtained as a left-adjusted column marginal average of this standard score matrix (Thurstone, 1927b). The underlying rationale for the method and the basis for the measurement of the "psychological scale separation between any two stimuli" derive from Thurstone's law of comparative judgment (Thurstone, 1928).
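As a minimal sketch of the Case V steps just described, the following Python function converts a frequency dominance matrix into proportions, then into standard scores, and averages each column. Several details are assumptions made for illustration: the function name `case_v_scale`, the convention that `freq[i][j]` counts the judges who preferred the column stimulus `j` over the row stimulus `i`, the treatment of the diagonal as z = 0, and the reading of "left-adjusted" as shifting the scale so that its lowest value is zero.

```python
from statistics import NormalDist


def case_v_scale(freq, n_judges):
    """Scale stimuli from a frequency dominance matrix (Case V sketch).

    freq[i][j] is assumed to be the number of judges who preferred
    stimulus j (column) over stimulus i (row).
    """
    k = len(freq)
    inv_cdf = NormalDist().inv_cdf  # standard normal quantile function
    z = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                continue  # a stimulus is not compared with itself; z stays 0
            p = freq[i][j] / n_judges  # frequency -> proportion
            z[i][j] = inv_cdf(p)  # raises StatisticsError if p is 0 or 1
    # Column marginal averages of the standard score matrix.
    col_means = [sum(z[i][j] for i in range(k)) / k for j in range(k)]
    # Shift so the lowest stimulus sits at zero.
    lowest = min(col_means)
    return [m - lowest for m in col_means]


# Illustrative data: 100 judges, three stimuli A, B, C.
freq = [
    [0, 70, 90],
    [30, 0, 80],
    [10, 20, 0],
]
scale = case_v_scale(freq, n_judges=100)
print([round(s, 3) for s in scale])
```

With these made-up counts the stimuli come out in the order A, B, C, with A anchored at zero. Note that the quantile function fails for proportions of exactly 0 or 1.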

The principal difficulty with this algorithm is its indeterminacy with respect to proportions of 1.00 and 0.00, which yield z values of plus and minus infinity, respectively. The inability of the pair comparisons algorithm to handle these cases imposes considerable limits on the applicability of the method.

The most frequent recourse when proportions of 1.00 or 0.00 are encountered is to omit them. For example, Guilford (1954, p. 163) recommended not using proportions more extreme than .977 or .023, and Edwards (1957, pp. 41–42) suggested that "if the number of judges is large, say 200 or more, then we might use pij values of .99 and .01, but with less than 200 judges, it is probably better to disregard all comparative judgments for which pij is greater than .98 or less than .02." Since the omission of such extreme values leaves empty cells in the Z matrix, the averaging procedure for arriving at the scale values cannot be applied, and an elaborate procedure for the estimation of unknown parameters is usually employed (Edwards, 1957, pp. 42–46). An alternative solution to this problem was suggested by Krus and Kennedy (1977).
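The effect of omitting extreme proportions can be illustrated with a short Python sketch. It applies Guilford's bounds of .023 and .977 as a simple cutoff and leaves the discarded cells empty (`None`); the function name `z_matrix_with_gaps` and the example proportions are hypothetical. The sketch only shows where the gaps arise and does not attempt the estimation procedures cited above.

```python
from statistics import NormalDist

# Bounds taken from the Guilford recommendation quoted above.
GUILFORD_BOUNDS = (0.023, 0.977)


def z_matrix_with_gaps(props, bounds=GUILFORD_BOUNDS):
    """Convert proportions to z values, leaving None where p is too extreme."""
    low, high = bounds
    inv_cdf = NormalDist().inv_cdf
    return [
        [inv_cdf(p) if low <= p <= high else None for p in row]
        for row in props
    ]


# Illustrative proportion matrix containing a 1.00 / 0.00 pair.
props = [
    [0.5, 0.7, 1.0],
    [0.3, 0.5, 0.9],
    [0.0, 0.1, 0.5],
]
z = z_matrix_with_gaps(props)
for row in z:
    print(row)
```

The two `None` cells are the empty cells referred to above: a plain column average is no longer defined for the first and third columns.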

With later developments in psychometric theory, it has become possible to employ direct methods of scaling such as application of the Rasch model or unfolding models such as the Hyperbolic Cosine Model (HCM) (Andrich & Luo, 1993). The Rasch model has a close conceptual relationship to Thurstone's law of comparative judgment (Andrich, 1978), the principal difference being that it directly incorporates a person parameter. Also, the Rasch model takes the form of a logistic function rather than a cumulative normal function.
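The two functional forms mentioned here can be compared directly. The sketch below assumes the usual dichotomous Rasch form, with a person parameter `theta` and an item parameter `b`, and a Case V style comparison in which the difference between two stimuli has unit standard deviation; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def rasch_probability(theta, b):
    """Logistic form: probability of a positive response for person theta, item b."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def case_v_probability(s_i, s_j):
    """Cumulative normal form: probability that stimulus i dominates stimulus j.

    Assumes the discriminal difference has unit standard deviation.
    """
    return NormalDist().cdf(s_i - s_j)


# Both curves pass through 0.5 when the two locations coincide and
# rise monotonically with the separation between them.
for d in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(d, round(rasch_probability(d, 0.0), 3), round(case_v_probability(d, 0.0), 3))
```

The Rasch function takes a person location as one of its arguments, whereas the comparative-judgment function involves only the two stimuli being compared, which reflects the principal difference noted above.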
