Jonckheere's trend test

In statistics, the Jonckheere trend test [1] (sometimes called the Jonckheere–Terpstra [2] test) is a test for an ordered alternative hypothesis within an independent samples (between-participants) design. It is similar to the Kruskal–Wallis test in that the null hypothesis is that several independent samples are from the same population. However, with the Kruskal–Wallis test there is no a priori ordering of the populations from which the samples are drawn. When there is an a priori ordering, the Jonckheere test has more statistical power than the Kruskal–Wallis test. The test was developed by Aimable Robert Jonckheere, who was a psychologist and statistician at University College London.


The null and alternative hypotheses can be conveniently expressed in terms of population medians for k populations (where k > 2). Letting θi be the population median for the ith population, the null hypothesis is:

θ1 = θ2 = … = θk

The alternative hypothesis is that the population medians have an a priori ordering, for example:

θ1 ≤ θ2 ≤ … ≤ θk

with at least one strict inequality.


The test can be seen as a special case of Maurice Kendall’s more general method of rank correlation [3] and makes use of Kendall's S statistic. This can be computed in one of two ways:

The ‘direct counting’ method

  1. Arrange the samples in the predicted order
  2. For each score in turn, count how many scores in the samples to the right are larger than the score in question. This is P.
  3. For each score in turn, count how many scores in the samples to the right are smaller than the score in question. This is Q.
  4. S = P − Q
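The steps above can be sketched in a few lines of Python. The three small ordered groups below are made-up illustrative data, not the article's dataset:

```python
def jonckheere_s(samples):
    """Direct counting of Kendall's S for samples arranged in the
    predicted a priori order (samples: list of lists of scores)."""
    p = q = 0
    for i, group in enumerate(samples):
        # All scores in the samples to the right of the current one.
        later = [x for g in samples[i + 1:] for x in g]
        for score in group:
            p += sum(1 for x in later if x > score)  # larger scores count toward P
            q += sum(1 for x in later if x < score)  # smaller scores count toward Q
    return p, q, p - q  # S = P - Q

# Illustrative data only: three groups in the predicted order.
p, q, s = jonckheere_s([[2, 5], [4, 7], [6, 9]])
```

Scores equal to the one in question are counted in neither P nor Q, which is how within-data ties are handled by the counting rule.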

The ‘nautical’ method

  1. Cast the data into an ordered contingency table, with the levels of the independent variable increasing from left to right, and values of the dependent variable increasing from top to bottom.
  2. For each entry in the table, count all other entries that lie to the ‘South East’ of the particular entry. This is P.
  3. For each entry in the table, count all other entries that lie to the ‘South West’ of the particular entry. This is Q.
  4. S = P − Q
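A minimal sketch of the same count from an ordered contingency table. The table encoding used here (a dict mapping (value, group index) to a cell count) is an assumption of this illustration:

```python
def nautical_s(table):
    """table maps (dv_value, group_index) -> count, with group indices
    increasing left to right and DV values increasing top to bottom."""
    p = q = 0
    for (v, g), count in table.items():
        for (v2, g2), count2 in table.items():
            if v2 > v and g2 > g:    # entries to the 'South East'
                p += count * count2
            if v2 > v and g2 < g:    # entries to the 'South West'
                q += count * count2
    return p, q, p - q  # S = P - Q

# Same illustrative scores as the direct-counting sketch, one per cell.
example = {(2, 0): 1, (5, 0): 1, (4, 1): 1, (7, 1): 1, (6, 2): 1, (9, 2): 1}
p, q, s = nautical_s(example)
```

Because each cell count multiplies the counts to its south-east and south-west, the method gives the same P, Q and S as direct counting.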

Note that there will always be ties in the independent variable (individuals are ‘tied’ in the sense that they are in the same group), but there may or may not be ties in the dependent variable. If there are no ties – or the ties occur within a particular sample (which does not affect the value of the test statistic) – exact tables of S are available; for example, Jonckheere [1] provided selected tables for values of k from 3 to 6 and equal sample sizes (m) from 2 to 5. Leach presented critical values of S for k = 3 with sample sizes ranging from 2,2,1 to 5,5,5. [4]

Normal approximation to S

The standard normal distribution can be used to approximate the distribution of S under the null hypothesis for cases in which exact tables are not available. The mean of the distribution of S will always be zero and, assuming that there are no tied scores between the values in two (or more) different samples, the variance is given by

var(S) = [n(n − 1)(2n + 5) − Σ ti(ti − 1)(2ti + 5)] / 18

where n is the total number of scores and ti is the number of scores in the ith sample. The approximation to the standard normal distribution can be improved by the use of a continuity correction: Sc = |S| − 1. Thus 1 is subtracted from a positive S value and 1 is added to a negative S value. The z-score equivalent is then given by

z = Sc / √var(S)
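A minimal sketch of this approximation; it assumes Kendall's no-ties variance for S, var(S) = [n(n − 1)(2n + 5) − Σ ti(ti − 1)(2ti + 5)] / 18:

```python
import math

def jonckheere_z(s, sample_sizes):
    """Continuity-corrected z for S, assuming no scores are tied
    between different samples (sample_sizes: list of ti)."""
    n = sum(sample_sizes)
    var_s = (n * (n - 1) * (2 * n + 5)
             - sum(t * (t - 1) * (2 * t + 5) for t in sample_sizes)) / 18
    s_c = abs(s) - 1  # continuity correction: shrink |S| toward zero by 1
    return s_c / math.sqrt(var_s)
```

Note that this sketch applies only when no scores are tied across samples; the tied case uses a different variance and no continuity correction.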


If scores are tied between the values in two (or more) different samples, there is no exact table for the S distribution and the approximation to the normal distribution has to be used. In this case no continuity correction is applied to the value of S, and the variance is given by

var(S) = [n(n − 1)(2n + 5) − Σ ti(ti − 1)(2ti + 5) − Σ ui(ui − 1)(2ui + 5)] / 18
  + [Σ ti(ti − 1)(ti − 2)][Σ ui(ui − 1)(ui − 2)] / [9n(n − 1)(n − 2)]
  + [Σ ti(ti − 1)][Σ ui(ui − 1)] / [2n(n − 1)]

where ti is a row marginal total and ui a column marginal total in the contingency table. The z-score equivalent is then given by

z = S / √var(S)
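The tie-corrected variance can be sketched likewise; the three-term formula below is Kendall's variance for S with ties in both rankings, which is the assumption this sketch rests on:

```python
import math

def var_s_ties(t, u):
    """Variance of S with ties; t = row marginal totals, u = column
    marginal totals of the ordered contingency table."""
    n = sum(t)
    term1 = (n * (n - 1) * (2 * n + 5)
             - sum(x * (x - 1) * (2 * x + 5) for x in t)
             - sum(x * (x - 1) * (2 * x + 5) for x in u)) / 18
    term2 = (sum(x * (x - 1) * (x - 2) for x in t)
             * sum(x * (x - 1) * (x - 2) for x in u)
             / (9 * n * (n - 1) * (n - 2)))
    term3 = (sum(x * (x - 1) for x in t)
             * sum(x * (x - 1) for x in u)
             / (2 * n * (n - 1)))
    return term1 + term2 + term3

def jonckheere_z_ties(s, t, u):
    # No continuity correction is applied in the tied case.
    return s / math.sqrt(var_s_ties(t, u))
```

When every value in one ranking is untied (every marginal total equal to 1), the last two terms vanish and the first term reduces to the no-ties variance.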

A numerical example

In a partial replication of a study by Loftus and Palmer, participants were assigned at random to one of three groups and then shown a film of two cars crashing into each other. [5] After viewing the film, the participants in one group were asked the following question: “About how fast were the cars going when they contacted each other?” Participants in a second group were asked, “About how fast were the cars going when they bumped into each other?” Participants in the third group were asked, “About how fast were the cars going when they smashed into each other?” Loftus and Palmer predicted that the action verb used (contacted, bumped, smashed) would influence the speed estimates in miles per hour (mph) such that action verbs implying greater energy would lead to higher estimated speeds. The following results were obtained (simulated data):


The ‘direct counting’ method

P = 8 + 7 + 7 + 7 + 4 + 4 + 3 + 3 = 43
Q = 0 + 0 + 1 + 1 + 0 + 0 + 0 + 1 = 3

The ‘nautical’ method

[Ordered contingency table: rows are the observed speed estimates in mph, increasing from top to bottom, with row totals ti; columns are the groups Contacted, Bumped and Smashed, with column totals ui = 4, 4, 4 and n = 12.]
P = (1 × 8) + (1 × 7) + (1 × 7) + (1 × 7) + (1 × 4) + (1 × 4) + (1 × 3) + (1 × 3) = 43
Q = (1 × 2) + (1 × 1) = 3

Using exact tables

When the ties between samples are few (as in this example), Leach suggested that ignoring the ties and using exact tables would provide a reasonably accurate result. [4] Jonckheere suggested breaking the ties against the alternative hypothesis and then using exact tables. [1] In the current example, where tied scores only appear in adjacent groups, the value of S is unchanged if the ties are broken against the alternative hypothesis. This may be verified by substituting 11 mph in place of 12 mph in the Bumped sample, and 19 mph in place of 20 mph in the Smashed sample, and re-computing the test statistic. From tables with k = 3 and m = 4, the critical S value for α = 0.05 is 36, and thus the result would be declared statistically significant at this level.

Computing a standard normal approximation

As S = P − Q = 43 − 3 = 40 and n = 12, with two tied values in the dependent variable (12 mph and 20 mph) giving two row marginal totals ti of 2 and eight of 1, and column marginal totals ui = 4, 4, 4,

the variance of S is then

var(S) = (3828 − 36 − 468) / 18 + 0 + (4 × 36) / (2 × 12 × 11) = 3324/18 + 144/264 = 185.21

and z is given by

z = 40 / √185.21 = 2.94
For α = 0.05 (one-sided) the critical z value is 1.645, so again the result would be declared significant at this level. A similar test for trend within the context of repeated measures (within-participants) designs and based on Spearman's rank correlation coefficient was developed by Page. [6]
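The arithmetic of this approximation can be checked in a few lines of Python. The sketch assumes Kendall's tie-corrected variance for S, with the marginal totals read off the ordered contingency table (two tied mph values, three groups of four):

```python
import math

s = 43 - 3                                   # S = P - Q = 40
t = [1, 2, 1, 1, 1, 2, 1, 1, 1, 1]           # row marginals: ties at 12 and 20 mph
u = [4, 4, 4]                                # column marginals: group sizes
n = sum(t)                                   # total number of scores, 12

term1 = (n * (n - 1) * (2 * n + 5)
         - sum(x * (x - 1) * (2 * x + 5) for x in t)
         - sum(x * (x - 1) * (2 * x + 5) for x in u)) / 18
term2 = (sum(x * (x - 1) * (x - 2) for x in t)
         * sum(x * (x - 1) * (x - 2) for x in u)
         / (9 * n * (n - 1) * (n - 2)))
term3 = (sum(x * (x - 1) for x in t)
         * sum(x * (x - 1) for x in u)
         / (2 * n * (n - 1)))
var_s = term1 + term2 + term3
z = s / math.sqrt(var_s)                     # no continuity correction with ties
print(round(var_s, 2), round(z, 2))          # prints 185.21 2.94
```

The resulting z exceeds the one-sided critical value of 1.645, matching the conclusion reached above.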



  1. Jonckheere, A. R. (1954). "A distribution-free k-sample test against ordered alternatives". Biometrika. 41: 133–145. doi:10.2307/2333011.
  2. Terpstra, T. J. (1952). "The asymptotic normality and consistency of Kendall's test against trend, when ties are present in one ranking" (PDF). Indagationes Mathematicae. 14: 327–333.
  3. Kendall, M. G. (1962). Rank correlation methods (3rd ed.). London: Charles Griffin.
  4. Leach, C. (1979). Introduction to Statistics: A non-parametric approach for the social sciences. Chichester: John Wiley.
  5. Loftus, E. F.; Palmer, J. C. (1974). "Reconstruction of automobile destruction: An example of the interaction between language and memory". Journal of Verbal Learning and Verbal Behavior. 13: 585–589. doi:10.1016/S0022-5371(74)80011-3.
  6. Page, E. B. (1963). "Ordered hypotheses for multiple treatments: A significance test for linear ranks". Journal of the American Statistical Association. 58 (301): 216–230. doi:10.2307/2282965.
