Probability mass function

Cumulative distribution function

| Notation | |
| --- | --- |
| Parameters | p, q and m (the par of the hole) |
| Support | natural numbers starting from 1 |
| PMF | given separately for odd and even m (see text) |
| Mean | |
| MGF | given separately for odd and even m (see text) |
In probability theory and statistics, the Hardy distribution is a discrete probability distribution that describes the hole score of a golf player. It is based on Hardy's (1945) basic assumption that there are three types of shots:
good, bad and ordinary,
where the probability of a good hit equals p, the probability of a bad hit equals q, and the probability of an ordinary hit equals 1 − p − q. Hardy further assigned
a value of 2 to a good stroke, a value of 0 to a bad stroke and a value of 1 to a regular or ordinary stroke.
Once the sum of the values is greater than or equal to the par of the hole, the number of strokes taken is the score achieved on that hole. A birdie on a par three could then have come about in three ways: 2 + 2, 2 + 1 and 1 + 2, with probabilities p², p(1 − p − q) and (1 − p − q)p, respectively.
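Hardy's three-shot mechanism is straightforward to simulate. The sketch below assumes only what is stated above (stroke values 2, 0 and 1 with probabilities p, q and 1 − p − q, with play stopping once the running total reaches the par); the function name simulate_hole_score and the example values p = q = 0.07 are illustrative, not taken from the sources cited here.

```python
import random

def simulate_hole_score(par, p, q, rng=random):
    """Simulate one hole under Hardy's three-shot model.

    Each stroke is worth 2 (good, probability p), 0 (bad, probability q)
    or 1 (ordinary, probability 1 - p - q); the hole score is the number
    of strokes needed for the running total to reach the par value.
    """
    total = 0
    strokes = 0
    while total < par:
        u = rng.random()
        if u < p:
            total += 2        # good stroke
        elif u < p + q:
            pass              # bad stroke, worth 0
        else:
            total += 1        # ordinary stroke
        strokes += 1
    return strokes

# Example: estimate the birdie probability on a par three with p = q = 0.07.
random.seed(1)
scores = [simulate_hole_score(3, 0.07, 0.07) for _ in range(100_000)]
print(sum(s == 2 for s in scores) / len(scores))  # close to p**2 + 2*p*(1 - p - q), about 0.125
```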
A discrete random variable X is said to have a Hardy distribution with parameters p, q and m if it has a probability mass function given by:
if m is odd
and
if m is even
with
and
where
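The closed-form expressions for odd and even m are not reproduced here, but the same probabilities can be obtained numerically straight from the definition of the model: track the distribution of the running total over the strokes already played, and finish the hole on the stroke that lifts the total to m or above. The following is a minimal sketch under that formulation; hardy_pmf is an illustrative name, and the code is not van der Ven's derivation.

```python
def hardy_pmf(n, par, p, q):
    """P(hole score = n) under Hardy's model, computed from its definition.

    dist[s] is the probability that the running total of stroke values
    equals s after the strokes played so far, while still below the par.
    The hole is finished on stroke n when the total was par - 1 and the
    stroke is worth 1 or 2, or the total was par - 2 and the stroke is a 2.
    """
    r = 1.0 - p - q                       # probability of an ordinary stroke
    dist = {0: 1.0}                       # before the first stroke
    for _ in range(n - 1):                # distribution of the total after n - 1 strokes
        new = {}
        for s, prob in dist.items():
            for value, pr in ((2, p), (1, r), (0, q)):
                if s + value < par:       # hole not yet finished
                    new[s + value] = new.get(s + value, 0.0) + prob * pr
        dist = new
    return dist.get(par - 1, 0.0) * (p + r) + dist.get(par - 2, 0.0) * p
```

For a par three with p = q = 0.07, hardy_pmf(2, 3, 0.07, 0.07) reproduces the birdie probability p² + 2p(1 − p − q) from the example above, and summing hardy_pmf(n, 3, 0.07, 0.07) over n = 2, 3, … gives 1.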
The moment generating function is given by:
if m is odd
and
if m is even
with
and
Each raw moment and each central moment can be easily determined with the moment generating function, but the formulas involved are too large to present here.
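As a sketch of how such moments can be obtained without the large closed-form expressions, the mean and variance can be summed numerically from the PMF, which is equivalent to differentiating the moment generating function at t = 0. hardy_mean_var is an illustrative helper that reuses hardy_pmf from the sketch above.

```python
def hardy_mean_var(par, p, q, tol=1e-12, max_n=1000):
    """Mean and variance of the hole score, summed numerically from the PMF.

    Equivalent to differentiating the moment generating function at t = 0,
    but avoids the large closed-form expressions.  Uses hardy_pmf() from
    the sketch above.
    """
    m1 = m2 = 0.0
    for n in range(1, max_n):
        prob = hardy_pmf(n, par, p, q)
        m1 += n * prob
        m2 += n * n * prob
        if n > par and prob < tol:
            break
    return m1, m2 - m1 ** 2
```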
For a par three:
For a par four:
Note the resemblance with the expression for a par three. For a par five:
Note the resemblance with the expressions for a par three and a par four.
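As a rough numerical check on these expressions, the means for pars three, four and five can be evaluated with the illustrative helpers defined above (the parameter values are again arbitrary):

```python
# Means for pars 3, 4 and 5 with the illustrative values p = q = 0.07.
for par in (3, 4, 5):
    mean, var = hardy_mean_var(par, 0.07, 0.07)
    print(par, round(mean, 4), round(var, 4))

# When p = q the expected stroke value is exactly 1, so by Wald's identity
# the mean score exceeds the par only by the expected overshoot of the
# running total past the par value.
```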
When trying to construct a probability distribution that describes the frequency distribution of the number of strokes on a golf hole, the simplest setup is to assume that there are only two types of strokes: a good stroke, with a certain probability, and a bad stroke, with the complementary probability, where a good stroke gets the value 1 and a bad stroke gets the value 0.
Once the sum of the shot values equals the par of the hole, that is the number of strokes needed for the hole. It is clear that with this setup a birdie is not possible: the smallest number of strokes one can get is the par of the hole. Hardy (1945) probably realized that too and then came up with the idea of assuming not just two types of strokes, good and bad, but three types:
good (with probability p), bad (with probability q) and ordinary (with probability 1 − p − q).
In fact, Hardy called a good shot a supershot and a bad shot a subshot. [1] Minton later called Hardy's supershot an excellent shot and Hardy's subshot a bad shot. [2] In this article, Minton's excellent shot is called a good shot. Hardy came up with the idea of three types of shots in 1945, but the actual derivation of the probability distribution of the hole score was not given until 2012, by van der Ven. [3]
Hardy assumed that the probability of a good stroke was equal to the probability of a bad stroke, that is, p = q. This was confirmed by Kang:
Hardy's model is very simple in that all strokes are independent from each other and the probability of producing a good shot is equal to the probability of producing a bad shot. [4]
In retrospect, Hardy might well have been right, as the data in Table 2 in van der Ven (2013) show. This table shows the estimated p- and q-values for holes 1–18 for rounds 1 and 2 of the 2012 British Open Championship. The mean values were equal to 0.0633 and 0.0697, respectively. Cohen (2002) later introduced the idea that p and q should be different. Kang says about this:
Cohen takes another step forward and includes the possibility that the probability of good shots and bad shots can differ. [4]
For the Hardy distribution, the values of p and q may be different.
The Hardy distribution gives the probability distribution of a single player's hole score. It takes several observations to perform a goodness-of-fit test (see Goodness of fit test) to check whether the Hardy distribution applies or not. This can be done with a single individual by having the individual play the same hole multiple times. Goodness-of-fit tests assume pure replications (see Replication (statistics)): there should be no change in the player's golfing ability during repeated play of the hole, for example through an ongoing learning process (see Learning). Such effects cannot really be ruled out.

One way around this problem is to use multiple players who can be assumed to have approximately the same golf proficiency, such as the participants in professional golf tournaments (see PGA Tour). Before using a goodness-of-fit test, it should first be checked that the participants indeed have approximately the same golf proficiency. This can be done separately for each hole by using, for example, the Pearson correlation coefficient between the hole score on the first day and the second day of a tournament. If there are no systematic differences (see Classical test theory) between players, the correlation (see Correlation) between the score achieved on Day 1 on a hole and the score achieved on Day 2 on that hole will not differ significantly (see Statistical significance) from zero. This can easily be tested statistically.

In a study by van der Ven, [5] the results of a goodness-of-fit test of the Hardy distribution were reported using the hole-by-hole scores from the 2012 Open Championship played at the St Andrews Golf Club. The distribution was tested separately for each hole. Pearson's chi-squared test was used to determine whether the observed sample frequencies of the hole scores differed significantly from the expected frequencies according to the Hardy distribution. The fit between observed and expected frequencies was generally very satisfactory.
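As a sketch of how such a test could be set up (not the actual analysis reported by van der Ven), the observed score frequencies on a single hole can be compared with the frequencies expected under a fitted Hardy distribution using Pearson's chi-squared test, here via scipy.stats.chisquare. The counts and the fitted values p = q = 0.07 below are invented for illustration, hardy_pmf is the illustrative helper defined earlier, and the estimation of p and q from the data is not shown.

```python
import numpy as np
from scipy.stats import chisquare

par = 4
p_hat, q_hat = 0.07, 0.07       # assumed to have been estimated beforehand

# Hypothetical observed counts for hole scores 2, 3, 4, 5 and "6 or more"
# on one par-four hole (156 players); purely illustrative, not real data.
observed = np.array([1, 22, 120, 10, 3])
n_players = observed.sum()

# Expected probabilities under the Hardy distribution; the upper tail
# (scores of 6 or more) is lumped into the last category.
probs = [hardy_pmf(k, par, p_hat, q_hat) for k in (2, 3, 4, 5)]
probs.append(1.0 - sum(probs))
expected = n_players * np.array(probs)

# Two parameters were fitted, so the degrees of freedom are reduced by 2.
# In practice, categories with very small expected counts would be pooled.
stat, p_value = chisquare(observed, f_exp=expected, ddof=2)
print(stat, p_value)
```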
Notes