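Parameters | $\alpha > 0$ scale, $\beta > 0$ shape
---|---
Support | $x \in \{0, 1, 2, \ldots\}$
PMF | $\exp\left[-\left(\frac{x}{\alpha}\right)^{\beta}\right] - \exp\left[-\left(\frac{x+1}{\alpha}\right)^{\beta}\right]$
CDF | $1 - \exp\left[-\left(\frac{x+1}{\alpha}\right)^{\beta}\right]$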
In probability theory and statistics, the discrete Weibull distribution is the discrete analog of the continuous Weibull distribution. First introduced by Toshio Nakagawa and Shunji Osaki, it is used predominantly in reliability engineering, where it models failure data measured in discrete units such as cycles or shocks. More generally, it applies wherever the time to an event is counted in whole intervals rather than measured continuously. The discrete Weibull distribution is infinitely divisible only for shape parameter $0 < \beta \leq 1$. [1]
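One way to see the "discrete analog" relationship is that flooring a continuous Weibull variable with scale $\alpha$ and shape $\beta$ yields a variable with CDF $1 - \exp[-((x+1)/\alpha)^{\beta}]$ on $\{0, 1, 2, \ldots\}$. The following is a minimal sketch of that idea, not a reference implementation; the function names `sample_discrete_weibull` and `dw_cdf_alpha` are hypothetical, and the parameter values are arbitrary.

```python
import math
import random


def sample_discrete_weibull(alpha, beta, rng=random):
    """Draw one value by flooring a continuous Weibull(alpha, beta) draw.

    Assumption: alpha is the scale and beta the shape, matching the
    argument order of random.weibullvariate in the standard library.
    """
    w = rng.weibullvariate(alpha, beta)
    return math.floor(w)


def dw_cdf_alpha(x, alpha, beta):
    """P(X <= x) = 1 - exp(-((x + 1) / alpha) ** beta) for x = 0, 1, 2, ..."""
    return 1.0 - math.exp(-(((x + 1) / alpha) ** beta))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the comparison is repeatable
    alpha, beta = 3.0, 1.5  # illustrative values only
    draws = [sample_discrete_weibull(alpha, beta, rng) for _ in range(20000)]
    for x in range(4):
        empirical = sum(d <= x for d in draws) / len(draws)
        print(x, round(empirical, 3), round(dw_cdf_alpha(x, alpha, beta), 3))
```

Run as a script, this prints the empirical and theoretical CDF side by side for the first few support points; the two columns should agree up to sampling noise.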
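In the original paper by Nakagawa and Osaki, the authors used the parametrization $q = e^{-\alpha^{-\beta}}$, where $\alpha$ is the scale and $\beta$ the shape, which makes the cumulative distribution function
$$F(x) = 1 - q^{(x+1)^{\beta}}$$
with $q \in (0, 1)$, and the probability mass function
$$P(X = x) = q^{x^{\beta}} - q^{(x+1)^{\beta}}.$$
Setting $\beta = 1$ gives $P(X = x) = q^{x}(1 - q)$, which makes the relationship with the geometric distribution apparent. [2]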
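The probability mass function and cumulative distribution function in the Nakagawa and Osaki form can be sketched in a few lines. This is an illustrative sketch only; the function names `dw_pmf`, `dw_cdf` and `q_from_alpha` are hypothetical, and the conversion assumes $q = \exp(-\alpha^{-\beta})$ with $\alpha$ the scale and $\beta$ the shape.

```python
import math


def dw_pmf(x, q, beta):
    """P(X = x) = q**(x**beta) - q**((x + 1)**beta) for x = 0, 1, 2, ..."""
    return q ** (x ** beta) - q ** ((x + 1) ** beta)


def dw_cdf(x, q, beta):
    """P(X <= x) = 1 - q**((x + 1)**beta)."""
    return 1.0 - q ** ((x + 1) ** beta)


def q_from_alpha(alpha, beta):
    """Convert scale alpha to q, assuming q = exp(-(alpha ** -beta))."""
    return math.exp(-(alpha ** (-beta)))


if __name__ == "__main__":
    q = 0.7
    # With beta = 1 the PMF coincides with the geometric form q**x * (1 - q).
    for x in range(5):
        print(x, dw_pmf(x, q, 1.0), q ** x * (1.0 - q))
```

Summing `dw_pmf` from 0 to `x` reproduces `dw_cdf(x, ...)`, because the sum telescopes.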
An alternative parametrization — related to the Pareto distribution — has been used to estimate parameters in infectious disease modelling. [3] This parametrization introduces a parameter , meaning that the term can be replaced with . Therefore, the probability mass function can be expressed as
and the cumulative mass function can be expressed as
The continuous Weibull distribution has a close relationship with the Gumbel distribution which is easy to see when log-transforming the variable. A similar transformation can be made on the discrete Weibull.
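Define $Y = \log(X + 1)$, where (unconventionally) $Y \in \{\log 1, \log 2, \ldots\}$, and define the parameters $\mu = \log \alpha$ and $\sigma = 1/\beta$. Replacing $x$ in the cumulative distribution function gives
$$P(X \le x) = 1 - \exp\left[-\left(\frac{x+1}{\alpha}\right)^{\beta}\right] = 1 - \exp\left[-\exp\left(\frac{\log(x+1) - \mu}{\sigma}\right)\right].$$
This is a location-scale parametrization in $y = \log(x+1)$:
$$P(Y \le y) = 1 - \exp\left[-\exp\left(\frac{y - \mu}{\sigma}\right)\right],$$
which is convenient for estimation. It opens up the possibility of regression using frameworks developed for Weibull regression and extreme value theory. [4]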
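The equivalence of the two forms can be checked numerically. The sketch below assumes the substitution $y = \log(x+1)$, $\mu = \log\alpha$ and $\sigma = 1/\beta$; the function names are hypothetical and the parameter values are arbitrary.

```python
import math


def cdf_scale_shape(x, alpha, beta):
    """CDF in the scale/shape form: 1 - exp(-((x + 1) / alpha) ** beta)."""
    return 1.0 - math.exp(-(((x + 1) / alpha) ** beta))


def cdf_location_scale(x, mu, sigma):
    """CDF in the location-scale form, evaluated at y = log(x + 1)."""
    y = math.log(x + 1)
    return 1.0 - math.exp(-math.exp((y - mu) / sigma))


if __name__ == "__main__":
    alpha, beta = 4.0, 0.8  # illustrative values only
    mu, sigma = math.log(alpha), 1.0 / beta
    for x in range(6):
        print(x, cdf_scale_shape(x, alpha, beta), cdf_location_scale(x, mu, sigma))
```

The two columns agree up to floating-point rounding, since $((x+1)/\alpha)^{\beta} = \exp(\beta(\log(x+1) - \log\alpha))$.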
The discrete Weibull distribution can be compared with other common discrete distributions such as the Poisson, geometric, and negative binomial distributions, each of which has unique characteristics and applications.
The Poisson distribution is often used to model the number of occurrences of a rare event during a fixed period of time. It is characterized by a single parameter, λ, which is both the mean and the variance of the distribution, so it cannot represent data whose variance differs from its mean. The discrete Weibull distribution is more flexible: its two parameters, q and β, influence the scale and shape of the distribution, and it can handle both over- and under-dispersion in count data. Whereas the Poisson distribution assumes that events occur independently at a constant rate, the discrete Weibull can adapt to other patterns of event occurrence.
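The dispersion claim can be illustrated by computing the variance-to-mean ratio, which is exactly 1 for a Poisson distribution. The sketch below assumes the $(q, \beta)$ form with survival function $P(X \ge x) = q^{x^{\beta}}$ and uses the identities $E[X] = \sum_{x \ge 1} P(X \ge x)$ and $E[X^2] = \sum_{x \ge 1} (2x - 1) P(X \ge x)$. The function names, truncation tolerance and parameter values are choices made for this example.

```python
def dw_mean_var(q, beta, tol=1e-15, max_terms=1_000_000):
    """Approximate the mean and variance by summing survival probabilities.

    The sums are truncated once P(X >= x) drops below tol, so the result
    is a numerical approximation rather than a closed form.
    """
    mean = 0.0
    second_moment = 0.0
    for x in range(1, max_terms):
        survival = q ** (x ** beta)  # P(X >= x)
        if survival < tol:
            break
        mean += survival
        second_moment += (2 * x - 1) * survival
    return mean, second_moment - mean ** 2


def dispersion_index(q, beta):
    """Variance divided by mean; equal to 1 for a Poisson distribution."""
    mean, var = dw_mean_var(q, beta)
    return var / mean


if __name__ == "__main__":
    for beta in (0.5, 1.0, 3.0):
        print(beta, dispersion_index(0.5, beta))
```

With $q = 0.5$, the ratio is above 1 for $\beta = 0.5$ and below 1 for $\beta = 3$, so the same family covers both regimes.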
The geometric distribution models the number of Bernoulli trials needed to obtain the first success and is characterized by a single parameter, p, the probability of success on an individual trial. With β = 1 the discrete Weibull reduces to a geometric distribution (counting failures before the first success, with success probability 1 − q). For other values of β, the discrete Weibull can model a broader range of data patterns, including those in which the probability of success changes from trial to trial.
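The changing success probability can be made concrete through the hazard rate $h(x) = P(X = x \mid X \ge x)$. In the $(q, \beta)$ form this works out to $h(x) = 1 - q^{(x+1)^{\beta} - x^{\beta}}$, which is constant for $\beta = 1$ (the geometric case), increasing for $\beta > 1$ and decreasing for $\beta < 1$. The following is a small sketch with a hypothetical function name and arbitrary parameter values.

```python
def dw_hazard(x, q, beta):
    """Hazard h(x) = P(X = x | X >= x) = 1 - q**((x + 1)**beta - x**beta)."""
    return 1.0 - q ** ((x + 1) ** beta - x ** beta)


if __name__ == "__main__":
    q = 0.8  # illustrative value only
    for beta in (0.5, 1.0, 2.0):
        hazards = [round(dw_hazard(x, q, beta), 4) for x in range(6)]
        print(beta, hazards)
```

For $\beta = 1$ every entry equals $1 - q$, which is the memoryless behaviour of the geometric distribution.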
The negative binomial distribution is used to model the number of Bernoulli trials needed before a particular number of successes is achieved. It is characterized by the probability of success and the number of successes. The discrete Weibull distribution, with its flexibility in modeling different data patterns, can be a better fit for data that does not conform to the specific scenario modeled by the negative binomial distribution.
Overall, the discrete Weibull distribution is preferred over these alternatives when the data exhibit over- or under-dispersion, or when they do not fit the specific scenarios for which the Poisson, geometric, or negative binomial distributions are best suited. Its adaptability in shape and scale makes it a versatile tool for the statistical modeling of discrete data. [5]
The discrete Weibull distribution has been applied in several areas of statistical analysis. One study uses it to model count data on fertility plans, showing how the distribution captures relationships influenced by factors such as education and family background. Unlike the Poisson distribution, it handles both overdispersed and underdispersed data, which makes it useful in social science research. This application extends the distribution's use beyond its traditional role in reliability engineering. [6]
"On Bivariate Discrete Weibull Distribution" extends the distribution to bivariate data. The paper applies maximum likelihood estimation and Bayesian inference to bivariate discrete data and presents practical analyses of football match scores and nasal drainage severity, illustrating the distribution's applicability across varied fields. [7]
"The Exponentiated Discrete Weibull Distribution" introduces a generalization termed the exponentiated discrete Weibull (EDW) distribution. The generalization increases the model's flexibility, allowing hazard rate functions that are increasing, decreasing, bathtub-shaped, or inverted bathtub-shaped. The EDW distribution can model data that are overdispersed or underdispersed relative to a Poisson distribution, which makes it applicable in fields such as reliability engineering and failure time studies. [8]