Probability mass function

Cumulative distribution function

| Notation | |
| --- | --- |
| Parameters | p, q and m (the par of the hole) |
| Support | natural numbers starting from 1 |
| PMF | given separately for odd and even m (see text) |
| Mean | |
| MGF | given separately for odd and even m (see text) |
In probability theory and statistics, the Hardy distribution is a discrete probability distribution that describes the hole score of a golf player. It is based on Hardy's (1945) basic assumption that there are three types of shots:
good, bad and ordinary,
where the probability of a good hit equals p, the probability of a bad hit equals q, and the probability of an ordinary hit equals 1 − p − q. Hardy further assigned
a value of 2 to a good stroke, a value of 0 to a bad stroke and a value of 1 to a regular or ordinary stroke.
Once the sum of the values is greater than or equal to the par of the hole, the number of strokes taken is the score achieved on that hole. A birdie on a par three could then have come about in three ways: 2 + 2, 2 + 1 and 1 + 2, with probabilities p², p(1 − p − q) and (1 − p − q)p, respectively.
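Hardy's three-shot mechanism is straightforward to simulate. The sketch below assumes only what is stated above (stroke values 2, 0 and 1 with probabilities p, q and 1 − p − q, with play stopping once the running total reaches the par); the function name simulate_hole_score and the example values p = q = 0.07 are illustrative, not taken from the sources cited here.

```python
import random

def simulate_hole_score(par, p, q, rng=random):
    """Simulate one hole under Hardy's three-shot model.

    Each stroke is worth 2 (good, probability p), 0 (bad, probability q)
    or 1 (ordinary, probability 1 - p - q); the hole score is the number
    of strokes needed for the running total to reach the par value.
    """
    total = 0
    strokes = 0
    while total < par:
        u = rng.random()
        if u < p:
            total += 2        # good stroke
        elif u < p + q:
            pass              # bad stroke, worth 0
        else:
            total += 1        # ordinary stroke
        strokes += 1
    return strokes

# Example: estimate the birdie probability on a par three with p = q = 0.07.
random.seed(1)
scores = [simulate_hole_score(3, 0.07, 0.07) for _ in range(100_000)]
print(sum(s == 2 for s in scores) / len(scores))  # close to p**2 + 2*p*(1 - p - q), about 0.125
```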
A discrete random variable X is said to have a Hardy distribution with parameters p, q and m if it has a probability mass function given by:
if m is odd
and
if m is even
with
and
where
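The closed-form expressions for odd and even m are not reproduced here, but the same probabilities can be obtained numerically straight from the definition of the model: track the distribution of the running total over the strokes already played, and finish the hole on the stroke that lifts the total to m or above. The following is a minimal sketch under that formulation; hardy_pmf is an illustrative name, and the code is not van der Ven's derivation.

```python
def hardy_pmf(n, par, p, q):
    """P(hole score = n) under Hardy's model, computed from its definition.

    dist[s] is the probability that the running total of stroke values
    equals s after the strokes played so far, while still below the par.
    The hole is finished on stroke n when the total was par - 1 and the
    stroke is worth 1 or 2, or the total was par - 2 and the stroke is a 2.
    """
    r = 1.0 - p - q                       # probability of an ordinary stroke
    dist = {0: 1.0}                       # before the first stroke
    for _ in range(n - 1):                # distribution of the total after n - 1 strokes
        new = {}
        for s, prob in dist.items():
            for value, pr in ((2, p), (1, r), (0, q)):
                if s + value < par:       # hole not yet finished
                    new[s + value] = new.get(s + value, 0.0) + prob * pr
        dist = new
    return dist.get(par - 1, 0.0) * (p + r) + dist.get(par - 2, 0.0) * p
```

For a par three with p = q = 0.07, hardy_pmf(2, 3, 0.07, 0.07) reproduces the birdie probability p² + 2p(1 − p − q) from the example above, and summing hardy_pmf(n, 3, 0.07, 0.07) over n = 2, 3, … gives 1.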
The moment generating function is given by:
if m is odd
and
if m is even
with
and
Each raw moment and each central moment can be easily determined with the moment generating function, but the formulas involved are too large to present here.
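As a sketch of how such moments can be obtained without the large closed-form expressions, the mean and variance can be summed numerically from the PMF, which is equivalent to differentiating the moment generating function at t = 0. hardy_mean_var is an illustrative helper that reuses hardy_pmf from the sketch above.

```python
def hardy_mean_var(par, p, q, tol=1e-12, max_n=1000):
    """Mean and variance of the hole score, summed numerically from the PMF.

    Equivalent to differentiating the moment generating function at t = 0,
    but avoids the large closed-form expressions.  Uses hardy_pmf() from
    the sketch above.
    """
    m1 = m2 = 0.0
    for n in range(1, max_n):
        prob = hardy_pmf(n, par, p, q)
        m1 += n * prob
        m2 += n * n * prob
        if n > par and prob < tol:
            break
    return m1, m2 - m1 ** 2
```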
For a par three:
For a par four:
Note the resemblance with the expression for a par three. For a par five:
Note the resemblance with the expressions for a par three and a par four.
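As a rough numerical check on these expressions, the means for pars three, four and five can be evaluated with the illustrative helpers defined above (the parameter values are again arbitrary):

```python
# Means for pars 3, 4 and 5 with the illustrative values p = q = 0.07.
for par in (3, 4, 5):
    mean, var = hardy_mean_var(par, 0.07, 0.07)
    print(par, round(mean, 4), round(var, 4))

# When p = q the expected stroke value is exactly 1, so by Wald's identity
# the mean score exceeds the par only by the expected overshoot of the
# running total past the par value.
```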
When trying to construct a probability distribution that describes the frequency distribution of the number of strokes on a golf hole, the simplest setup is to assume that there are only two types of strokes: a good stroke, with a certain probability, and a bad stroke, with the complementary probability, where a good stroke gets the value 1 and a bad stroke gets the value 0.
Once the sum of the shot values equals the par of the hole, that is the number of strokes needed for the hole. It is clear that with this setup a birdie is not possible: the smallest number of strokes one can get is the par of the hole. Hardy (1945) probably realized that too and then came up with the idea of assuming not just two types of strokes, good and bad, but three types:
good (with probability p), bad (with probability q) and ordinary (with probability 1 − p − q).
In fact, Hardy called a good shot a supershot and a bad shot a subshot. [1] Minton later called Hardy's supershot an excellent shot and Hardy's subshot a bad shot. [2] In this article, Minton's excellent shot is called a good shot. Hardy came up with the idea of three types of shots in 1945, but the actual derivation of the probability distribution of the hole score was not given until 2012, by van der Ven. [3]
Hardy assumed that the probability of a good stroke was equal to the probability of a bad stroke, that is, p = q. This was confirmed by Kang:
Hardy's model is very simple in that all strokes are independent from each other and the probability of producing a good shot is equal to the probability of producing a bad shot. [4]
In retrospect, Hardy might well have been right, as the data in Table 2 in van der Ven (2013) show. This table shows the estimated p- and q-values for holes 1–18 for rounds 1 and 2 of the 2012 British Open Championship. The mean values were equal to 0.0633 and 0.0697, respectively. Cohen (2002) later introduced the idea that p and q should be different. Kang says about this:
Cohen takes another step forward and includes the possibility that the probability of good shots and bad shots can differ. [4]
For the Hardy distribution, the values of p and q may be different.
The Hardy distribution gives the probability distribution of a single player's hole score. It takes several observations to perform a goodness-of-fit test (see Goodness of fit test) to check whether the Hardy distribution applies or not. This can be done with a single individual by having the individual play the same hole multiple times. Goodness-of-fit tests assume pure replications (see Replication (statistics)): there should be no change in the player's golfing ability during repeated play of the hole, for example through an ongoing learning process (see Learning). Such effects cannot really be ruled out.

One way around this problem is to use multiple players who can be assumed to have approximately the same golf proficiency, such as the participants in professional golf tournaments (see PGA Tour). Before using a goodness-of-fit test, it should first be checked that the participants indeed have approximately the same golf proficiency. This can be done separately for each hole by using, for example, the Pearson correlation coefficient between the hole score on the first day and the second day of a tournament. If there are no systematic differences (see Classical test theory) between players, the correlation (see Correlation) between the score achieved on Day 1 on a hole and the score achieved on Day 2 on that hole will not differ significantly (see Statistical significance) from zero. This can easily be tested statistically.

In a study by van der Ven, [5] the results of a goodness-of-fit test of the Hardy distribution were reported using the hole-by-hole scores from the 2012 Open Championship played at the St Andrews Golf Club. The distribution was tested separately for each hole. Pearson's chi-squared test was used to determine whether the observed sample frequencies of the hole scores differed significantly from the expected frequencies according to the Hardy distribution. The fit between observed and expected frequencies was generally very satisfactory.
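As a sketch of how such a test could be set up (not the actual analysis reported by van der Ven), the observed score frequencies on a single hole can be compared with the frequencies expected under a fitted Hardy distribution using Pearson's chi-squared test, here via scipy.stats.chisquare. The counts and the fitted values p = q = 0.07 below are invented for illustration, hardy_pmf is the illustrative helper defined earlier, and the estimation of p and q from the data is not shown.

```python
import numpy as np
from scipy.stats import chisquare

par = 4
p_hat, q_hat = 0.07, 0.07       # assumed to have been estimated beforehand

# Hypothetical observed counts for hole scores 2, 3, 4, 5 and "6 or more"
# on one par-four hole (156 players); purely illustrative, not real data.
observed = np.array([1, 22, 120, 10, 3])
n_players = observed.sum()

# Expected probabilities under the Hardy distribution; the upper tail
# (scores of 6 or more) is lumped into the last category.
probs = [hardy_pmf(k, par, p_hat, q_hat) for k in (2, 3, 4, 5)]
probs.append(1.0 - sum(probs))
expected = n_players * np.array(probs)

# Two parameters were fitted, so the degrees of freedom are reduced by 2.
# In practice, categories with very small expected counts would be pooled.
stat, p_value = chisquare(observed, f_exp=expected, ddof=2)
print(stat, p_value)
```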
Notes