# Thurstonian model


A Thurstonian model is a stochastic transitivity model with latent variables, used to describe the mapping of some continuous scale onto discrete, possibly ordered categories of response. In the model, each of these response categories corresponds to a latent variable whose value is drawn from a normal distribution, independently of the other response variables and with constant variance. Developments over the last two decades, however, have led to Thurstonian models that allow unequal variance and nonzero covariance terms.

Thurstonian models have been used as an alternative to generalized linear models in the analysis of sensory discrimination tasks. [1] They have also been used to model long-term memory in ranking tasks of ordered alternatives, such as the order of the amendments to the US Constitution. [2] Their main advantage over other models of ranking tasks is that they account for non-independence of alternatives. [3]

Ennis [4] provides a comprehensive account of the derivation of Thurstonian models for a wide variety of behavioral tasks, including preferential choice, ratings, triads, tetrads, dual pair, same-different and degree-of-difference, ranks, first-last choice, and applicability scoring. Chapter 7 of that book gives a closed-form expression, derived in 1988, for a Euclidean-Gaussian similarity model, addressing the well-known problem that many Thurstonian models are computationally complex, often involving multiple integration. Chapter 10 presents a simple form for ranking tasks that involves only the product of univariate normal distribution functions and includes rank-induced dependency parameters; a theorem is proven showing that this particular form of the dependency parameters is the only one for which the simplification is possible. Chapter 6 links discrimination, identification, and preferential choice through a common multivariate model in the form of weighted sums of central F distribution functions, allowing a general variance-covariance matrix for the items.

## Definition

Consider a set of m options to be ranked by n independent judges. Judge i's ranking can be represented by the ordering vector ri = (ri1, ri2, ..., rim).

Rankings are assumed to be derived from real-valued latent variables zij, representing the evaluation of option j by judge i. Rankings ri are derived deterministically from zi such that zi(ri1) < zi(ri2) < ... < zi(rim).

The zi are assumed to be derived from an underlying ground truth value μ for each option. In the most general case, they are multivariate normal:

${\displaystyle \mathbf {z} _{i}\ \sim \ {\mathcal {N}}({\boldsymbol {\mu }},\mathbf {\Sigma } _{i})}$

One common simplification is to assume an isotropic Gaussian distribution, with a single standard deviation parameter for each judge:

${\displaystyle \mathbf {z} _{i}\ \sim \ {\mathcal {N}}({\boldsymbol {\mu }},\sigma _{i}^{2}\mathbf {I} ).}$
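The isotropic case can be illustrated with a short generative sketch: draw each judge's latent utilities from a normal distribution around the ground-truth means, then derive the ranking by sorting. The means, noise scale, and number of judges below are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([0.0, 0.5, 1.0, 2.0])  # hypothetical ground-truth values for m = 4 options
sigma = 0.8                          # hypothetical judge noise scale (isotropic case)

def sample_ranking(mu, sigma, rng):
    """Draw latent utilities z ~ N(mu, sigma^2 I) and return the ordering
    vector: option indices sorted from lowest to highest latent value."""
    z = rng.normal(mu, sigma)
    return np.argsort(z)

# Simulate 1000 independent judges; each row is one ordering vector.
rankings = np.array([sample_ranking(mu, sigma, rng) for _ in range(1000)])
```

With well-separated means relative to sigma, the modal ranking recovers the ordering of mu; closer means produce more rank reversals, which is the behavior the model exploits when estimating sensory distances.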

## Inference

The Gibbs-sampler based approach to estimating model parameters is due to Yao and Bockenholt (1999). [3]

• Step 1: Given β, Σ, and ri, sample zi.

The zij must be sampled from a truncated multivariate normal distribution to preserve their rank ordering. Hajivassiliou's Truncated Multivariate Normal Gibbs sampler can be used to sample efficiently. [5] [6]

• Step 2: Given Σ, zi, sample β.

β is sampled from a normal distribution:

${\displaystyle \beta \ \sim \ {\mathcal {N}}(\beta ^{*},\Sigma ^{*}).}$

where β* and Σ* are the conditional posterior mean and covariance computed from the current values of zi and Σ.

• Step 3: Given β, zi, sample Σ.

Σ−1 is sampled from a Wishart posterior, combining a Wishart prior with the data likelihood from the residuals εi = zi − β.
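The three steps can be sketched in simplified form. The code below is an illustrative single Gibbs sweep under assumptions made for brevity, not the exact Yao and Bockenholt algorithm: the covariance is restricted to σ²I, β gets a flat prior, and the Wishart step reduces to its one-dimensional analogue, a gamma draw for the precision 1/σ². Step 1 uses the standard single-site update, in which each zij is drawn from a univariate normal truncated to the interval between its rank neighbours, which preserves the ordering.

```python
import numpy as np
from scipy import stats

def truncated_normal(mean, sd, lo, hi, rng):
    """Sample N(mean, sd^2) truncated to (lo, hi); infinite bounds allowed."""
    a, b = (lo - mean) / sd, (hi - mean) / sd
    return stats.truncnorm.rvs(a, b, loc=mean, scale=sd, random_state=rng)

def gibbs_sweep(z, rankings, beta, sigma, rng):
    """One sweep of a simplified Thurstonian ranking sampler (sigma^2 I case)."""
    n, m = z.shape
    # Step 1: resample each z_ij from a normal truncated between its rank
    # neighbours, so the judge's observed ordering is preserved.
    for i in range(n):
        order = rankings[i]  # option indices, lowest latent value first
        for pos, j in enumerate(order):
            lo = z[i, order[pos - 1]] if pos > 0 else -np.inf
            hi = z[i, order[pos + 1]] if pos < m - 1 else np.inf
            z[i, j] = truncated_normal(beta[j], sigma, lo, hi, rng)
    # Step 2: resample beta from its normal conditional (flat prior assumed).
    beta = rng.normal(z.mean(axis=0), sigma / np.sqrt(n))
    # Step 3: resample the precision 1/sigma^2 from its gamma conditional,
    # the 1-D analogue of the Wishart step (Gamma(1, 1) prior assumed).
    resid = z - beta
    prec = rng.gamma(1.0 + resid.size / 2, 1.0 / (1.0 + 0.5 * (resid ** 2).sum()))
    return z, beta, 1.0 / np.sqrt(prec)

# Minimal usage: 5 judges ranking 4 options, a few sweeps.
rng = np.random.default_rng(1)
n, m = 5, 4
rankings = np.array([rng.permutation(m) for _ in range(n)])
z = np.zeros((n, m))
for i in range(n):
    z[i, rankings[i]] = np.arange(m, dtype=float)  # any order-consistent start
beta, sigma = np.zeros(m), 1.0
for _ in range(10):
    z, beta, sigma = gibbs_sweep(z, rankings, beta, sigma, rng)
```

The full algorithm replaces the scalar-variance updates with a multivariate normal draw for β and a Wishart draw for Σ−1, and uses Hajivassiliou's truncated multivariate normal sampler in step 1.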

## History

Thurstonian models were introduced by Louis Leon Thurstone to describe the law of comparative judgment. [7] Prior to 1999, Thurstonian models were rarely used for modeling tasks involving more than four options because of the high-dimensional integration required to estimate model parameters. In 1999, Yao and Bockenholt introduced their Gibbs-sampler-based approach to estimating model parameters. [3] This limitation, however, applied only to ranking tasks; Thurstonian models with a much broader range of applications were developed before 1999. For instance, a multivariate Thurstonian model for preferential choice with a general variance-covariance structure, based on papers published in 1993 and 1994, is discussed in Chapter 6 of Ennis (2016). Even earlier, a closed form for a Thurstonian multivariate model of similarity with arbitrary covariance matrices was published in 1988, as discussed in Chapter 7 of Ennis (2016). This model has numerous applications and is not limited to any particular number of items or individuals.

## Applications to sensory discrimination

Thurstonian models have been applied to a range of sensory discrimination tasks, including auditory, taste, and olfactory discrimination, to estimate sensory distance between stimuli that range along some sensory continuum. [8] [9] [10]

The Thurstonian approach motivated Frijters's (1979) explanation of Gridgeman's paradox, also known as the paradox of discriminatory nondiscriminators: [1] [9] [11] [12] people perform better in a three-alternative forced choice task when told in advance which dimension of the stimulus to attend to. (For example, people are better at identifying which of three drinks is different from the other two when told in advance that the difference will be in degree of sweetness.) This result is accounted for by differing cognitive strategies: when the relevant dimension is known in advance, people can estimate values along that particular dimension; when it is not, they must rely on a more general, multi-dimensional measure of sensory distance.

The above paragraph contains a common misunderstanding of the Thurstonian resolution of Gridgeman's paradox. Although it is true that different decision rules (cognitive strategies) are used in making a choice among three alternatives, the mere fact of knowing an attribute in advance does not explain the paradox, nor are subjects required to rely on a more general, multidimensional measure of sensory difference. In the triangular method, for instance, the subject is instructed to choose the most different of three items, two of which are putatively identical. The items may differ on a unidimensional scale, and the subject may be made aware of the nature of the scale in advance; Gridgeman's paradox will still be observed. This occurs because of the sampling process combined with a distance-based decision rule, as opposed to the magnitude-based decision rule assumed in modeling the 3-alternative forced choice task.


## References

1. Lundahl, David (1997). "Thurstonian Models — an Answer to Gridgeman's Paradox?". CAMO Software Statistical Methods.
2. Lee, Michael; Steyvers, Mark; de Young, Mindy; Miller, Brent (2011). "A Model-Based Approach to Measuring Expertise in Ranking Tasks" (PDF). CogSci 2011 Proceedings. ISBN 978-0-9768318-7-7.
3. Yao, G.; Bockenholt, U. (1999). "Bayesian estimation of Thurstonian ranking models based on the Gibbs sampler". British Journal of Mathematical and Statistical Psychology. 52: 19–92. doi:10.1348/000711099158973.
4. Ennis, Daniel (2016). Thurstonian Models — Categorical Decision Making in the Presence of Noise. Richmond: The Institute for Perception. ISBN   978-0-9906446-0-6.
5. Hajivassiliou, V.A. (1993). "Simulation estimation methods for limited dependent variable models". In Maddala, G.S.; Rao, C.R.; Vinod, H.D. (eds.). Econometrics. Handbook of statistics. 11. Amsterdam: Elsevier. ISBN   0444895779.
6. V.A., Hajivassiliou; D., McFadden; P., Ruud (1996). "Simulation of multivariate normal rectangle probabilities and their derivatives. Theoretical and computational results". Journal of Econometrics. 72 (1–2): 85–134. doi:10.1016/0304-4076(94)01716-6.
7. Thurstone, Louis Leon (1927). "A Law of Comparative Judgment". Psychological Review. 34 (4): 273–286. doi:10.1037/h0070288. Reprinted: Thurstone, L. L. (1994). "A law of comparative judgment". Psychological Review. 101 (2): 266–270. doi:10.1037/0033-295X.101.2.266.
8. Durlach, N.I.; Braida, L.D. (1969). "Intensity Perception. I. Preliminary Theory of Intensity Resolution". Journal of the Acoustical Society of America . 46 (2): 372–383. Bibcode:1969ASAJ...46..372D. doi:10.1121/1.1911699. PMID   5804107.
9. Dessirier, Jean-Marc; O’Mahony, Michael (9 October 1998). "Comparison of d′ values for the 2-AFC (paired comparison) and 3-AFC discrimination methods: Thurstonian models, sequential sensitivity analysis and power". Food Quality and Preference. 10 (1): 51–58. doi:10.1016/S0950-3293(98)00037-8.
10. Frijters, J.E.R. (1980). "Three-stimulus procedures in olfactory psychophysics: an experimental comparison of Thurstone-Ura and three-alternative forced choice models of signal detection theory". Perception & Psychophysics. 28 (5): 390–7. PMID 7208248.
11. Gridgeman, N.T. (1970). "A Reexamination of the Two-Stage Triangle Test for the Perception of Sensory Differences". Journal of Food Science. 35 (1): 87–91. doi:10.1111/j.1365-2621.1970.tb12376.x.
12. Frijters, J.E.R. (1979). "The paradox of discriminatory nondiscriminators resolved". Chemical Senses & Flavor. 4 (4): 355–8. doi:10.1093/chemse/4.4.355.