# Accelerated failure time model

In the statistical area of survival analysis, an accelerated failure time model (AFT model) is a parametric model that provides an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that the effect of a covariate is to multiply the hazard by some constant, an AFT model assumes that the effect of a covariate is to accelerate or decelerate the life course of a disease by some constant. This is especially appealing in a technical context where the 'disease' is a result of some mechanical process with a known sequence of intermediary stages.

## Model specification

In full generality, the accelerated failure time model can be specified as [1]

${\displaystyle \lambda (t|\theta )=\theta \lambda _{0}(\theta t)}$

where ${\displaystyle \theta }$ denotes the joint effect of covariates, typically ${\displaystyle \theta =\exp(-[\beta _{1}X_{1}+\cdots +\beta _{p}X_{p}])}$. (Specifying the regression coefficients with a negative sign implies that high values of the covariates increase the survival time, but this is merely a sign convention; without a negative sign, they increase the hazard.)

This is satisfied if the probability density function of the event is taken to be ${\displaystyle f(t|\theta )=\theta f_{0}(\theta t)}$; it then follows for the survival function that ${\displaystyle S(t|\theta )=S_{0}(\theta t)}$. From this it follows that the moderated life time ${\displaystyle T}$ is distributed such that ${\displaystyle T\theta }$ and the unmoderated life time ${\displaystyle T_{0}}$ have the same distribution. Consequently, ${\displaystyle \log(T)}$ can be written as
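As a minimal numerical sketch of the specification above (Python with NumPy; the function names and the unit-rate exponential baseline are illustrative assumptions, not part of the model's definition), the covariate effect ${\displaystyle \theta }$ rescales time in both the hazard and the survival function:

```python
import numpy as np

def acceleration_factor(x, beta):
    """Joint covariate effect theta = exp(-(beta . x)); theta > 1 accelerates failure."""
    return np.exp(-np.dot(beta, x))

# Illustrative baseline: unit-rate exponential, so lambda0(t) = 1 and S0(t) = exp(-t).
def baseline_hazard(t):
    return np.ones_like(np.asarray(t, dtype=float))

def baseline_survival(t):
    return np.exp(-np.asarray(t, dtype=float))

def aft_hazard(t, theta):
    # lambda(t | theta) = theta * lambda0(theta * t)
    return theta * baseline_hazard(theta * t)

def aft_survival(t, theta):
    # S(t | theta) = S0(theta * t): the whole time axis is rescaled by theta
    return baseline_survival(theta * t)

theta = acceleration_factor(x=np.array([1.0, 0.5]), beta=np.array([0.4, -0.2]))
t = np.linspace(0.1, 5.0, 50)
print(aft_survival(t, theta)[:3])
```

With this exponential baseline, ${\displaystyle S(t|\theta )=e^{-\theta t}}$, so the acceleration is visible directly in the exponent.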

${\displaystyle \log(T)=-\log(\theta )+\log(T\theta ):=-\log(\theta )+\epsilon }$

where the last term is distributed as ${\displaystyle \log(T_{0})}$, i.e., independently of ${\displaystyle \theta }$. This reduces the accelerated failure time model to regression analysis (typically a linear model) where ${\displaystyle -\log(\theta )}$ represents the fixed effects, and ${\displaystyle \epsilon }$ represents the noise. Different distributions of ${\displaystyle \epsilon }$ imply different distributions of ${\displaystyle T_{0}}$, i.e., different baseline distributions of the survival time. Typically, in survival-analytic contexts, many of the observations are censored: we only know that ${\displaystyle T_{i}>t_{i}}$, not ${\displaystyle T_{i}=t_{i}}$. The former case corresponds to a right-censored observation (the subject is known only to have survived beyond ${\displaystyle t_{i}}$), while the latter corresponds to an event/death observed during the follow-up. These right-censored observations can pose technical challenges for estimating the model if the distribution of ${\displaystyle T_{0}}$ is unusual.
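This regression reading can be illustrated with a small simulation (Python/NumPy; all names and parameter values are illustrative, and censoring is deliberately left out to keep the sketch to ordinary least squares). Since ${\displaystyle \theta =\exp(-[\beta _{1}X_{1}+\cdots +\beta _{p}X_{p}])}$, we have ${\displaystyle \log(T)=\beta ^{\top }X+\log(T_{0})}$, so for uncensored data OLS on ${\displaystyle \log(T)}$ recovers the coefficients, with the intercept absorbing ${\displaystyle E[\log T_{0}]}$:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
X = rng.normal(size=(n, 2))
beta_true = np.array([0.5, -0.3])

# theta = exp(-X beta); T theta and T0 share a distribution, so T = T0 / theta.
theta = np.exp(-X @ beta_true)
T0 = rng.exponential(scale=1.0, size=n)   # illustrative baseline lifetime
T = T0 / theta

# log T = X beta + log T0: a linear model with noise log T0.
# An intercept absorbs E[log T0] (minus the Euler-Mascheroni constant here).
A = np.column_stack([np.ones(n), X])
coef, *_ = np.linalg.lstsq(A, np.log(T), rcond=None)
print(coef[1:])  # approximately [0.5, -0.3]
```

With censoring, this simple least-squares fit no longer applies; likelihood-based fitting with the survival function is then the standard route.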

The interpretation of ${\displaystyle \theta }$ in accelerated failure time models is straightforward: ${\displaystyle \theta =2}$ means that everything in the relevant life history of an individual happens twice as fast. For example, if the model concerns the development of a tumor, it means that all of the pre-stages progress twice as fast as for the unexposed individual, implying that the expected time until a clinical disease is half the baseline time. However, this does not mean that the hazard function ${\displaystyle \lambda (t|\theta )}$ is always twice as high; that would be the proportional hazards model.
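A quick numerical check makes the distinction concrete (Python/NumPy; the log-logistic baseline and the parameter values are arbitrary choices for illustration): with ${\displaystyle \theta =2}$, the ratio ${\displaystyle \theta \lambda _{0}(\theta t)/\lambda _{0}(t)}$ varies with ${\displaystyle t}$, so the hazard is not simply doubled:

```python
import numpy as np

def loglogistic_hazard(t, shape=2.0, scale=1.0):
    """Baseline log-logistic hazard; non-monotone for shape > 1."""
    z = (t / scale) ** shape
    return (shape / t) * z / (1.0 + z)

theta = 2.0
t = np.array([0.25, 1.0, 4.0])

# AFT hazard relative to baseline: theta * lambda0(theta * t) / lambda0(t)
ratio = theta * loglogistic_hazard(theta * t) / loglogistic_hazard(t)
print(ratio)  # the ratio shrinks as t grows, so hazards are not proportional
```

Here the ratio starts near ${\displaystyle \theta ^{k}}$ for small ${\displaystyle t}$ and decays toward 1, which is exactly the non-proportional-hazards behaviour the paragraph describes.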

## Statistical issues

Unlike proportional hazards models, in which Cox's semi-parametric proportional hazards model is more widely used than parametric models, AFT models are predominantly fully parametric, i.e., a probability distribution is specified for ${\displaystyle \log(T_{0})}$. (Buckley and James [2] proposed a semi-parametric AFT model, but its use is relatively uncommon in applied research; in a 1992 paper, Wei [3] pointed out that the Buckley–James model has no theoretical justification and lacks robustness, and reviewed alternatives.) This can be a problem if a degree of realistic detail is required for modelling the distribution of a baseline lifetime. Hence, technical developments in this direction would be highly desirable.

Unlike proportional hazards models, the regression parameter estimates from AFT models are robust to omitted covariates. They are also less affected by the choice of probability distribution. [4] [5]

The results of AFT models are easily interpreted. [6] For example, the results of a clinical trial with mortality as the endpoint could be interpreted as a certain percentage increase in future life expectancy on the new treatment compared to the control. A patient could thus be informed that they would be expected to live, say, 15% longer on the new treatment. Hazard ratios can prove harder to explain in lay terms.

### Distributions used in AFT models

The log-logistic distribution provides the most commonly used AFT model. Unlike the Weibull distribution, it can exhibit a non-monotonic hazard function which increases at early times and decreases at later times. It is somewhat similar in shape to the log-normal distribution but it has heavier tails. The log-logistic cumulative distribution function has a simple closed form, which becomes important computationally when fitting data with censoring. For the censored observations one needs the survival function, which is the complement of the cumulative distribution function, i.e. one needs to be able to evaluate ${\displaystyle S(t|\theta )=1-F(t|\theta )}$.
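As a sketch of why the closed form matters (Python/NumPy; the function names and the `shape`/`scale` parameterisation are illustrative assumptions), a censored log-likelihood uses the density for observed events and the survival function ${\displaystyle S(t|\theta )=1-F(t|\theta )}$ for censored observations, and the log-logistic supplies both in closed form:

```python
import numpy as np

def loglogistic_cdf(t, shape, scale):
    """Closed-form log-logistic CDF: F(t) = 1 / (1 + (t/scale)^(-shape))."""
    return 1.0 / (1.0 + (t / scale) ** (-shape))

def loglogistic_pdf(t, shape, scale):
    """Closed-form log-logistic density."""
    z = (t / scale) ** shape
    return (shape / t) * z / (1.0 + z) ** 2

def censored_loglik(times, events, shape, scale):
    """Log-likelihood for right-censored data:
    log f(t) where events == 1, log S(t) = log(1 - F(t)) where censored."""
    f = loglogistic_pdf(times, shape, scale)
    S = 1.0 - loglogistic_cdf(times, shape, scale)
    return np.sum(np.where(events == 1, np.log(f), np.log(S)))

times = np.array([0.5, 1.2, 2.0, 3.1])
events = np.array([1, 1, 0, 0])   # last two observations are right-censored
print(censored_loglik(times, events, shape=2.0, scale=1.5))
```

Because both ${\displaystyle F}$ and ${\displaystyle f}$ are elementary expressions, no numerical integration is needed when maximising this likelihood, which is the computational advantage the paragraph refers to.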

The Weibull distribution (including the exponential distribution as a special case) can be parameterised as either a proportional hazards model or an AFT model, and is the only family of distributions to have this property. The results of fitting a Weibull model can therefore be interpreted in either framework. However, the biological applicability of this model may be limited by the fact that the hazard function is monotonic, i.e. either decreasing or increasing.
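This dual parameterisation can be checked numerically (Python/NumPy; the parameter values are arbitrary illustrations): for a Weibull baseline with shape ${\displaystyle k}$, the AFT transformation ${\displaystyle \theta \lambda _{0}(\theta t)}$ coincides at every ${\displaystyle t}$ with a proportional-hazards multiplication by ${\displaystyle \theta ^{k}}$:

```python
import numpy as np

def weibull_hazard(t, shape=1.5, scale=1.0):
    """Weibull hazard: (shape/scale) * (t/scale)^(shape - 1); monotone in t."""
    return (shape / scale) * (t / scale) ** (shape - 1.0)

theta, shape = 2.0, 1.5
t = np.linspace(0.1, 5.0, 50)

aft = theta * weibull_hazard(theta * t, shape)    # AFT form: time rescaled by theta
ph = theta ** shape * weibull_hazard(t, shape)    # PH form: hazard scaled by theta^k

# Identical for every t: the same Weibull model read two ways.
assert np.allclose(aft, ph)
```

The algebra behind the check is one line: ${\displaystyle \theta \lambda _{0}(\theta t)=\theta \cdot {\frac {k}{\lambda }}\left({\frac {\theta t}{\lambda }}\right)^{k-1}=\theta ^{k}\lambda _{0}(t)}$, and this collapse only happens for the Weibull family.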

Other distributions suitable for AFT models include the log-normal, gamma and inverse Gaussian distributions, although they are less popular than the log-logistic, partly as their cumulative distribution functions do not have a closed form. Finally, the generalized gamma distribution is a three-parameter distribution that includes the Weibull, log-normal and gamma distributions as special cases.

## References

1. Kalbfleisch & Prentice (2002). The Statistical Analysis of Failure Time Data (2nd ed.). Hoboken, NJ: Wiley Series in Probability and Statistics.
2. Buckley, Jonathan; James, Ian (1979), "Linear regression with censored data", Biometrika, 66 (3): 429–436, doi:10.1093/biomet/66.3.429, JSTOR 2335161
3. Wei, L. J. (1992). "The accelerated failure time model: A useful alternative to the Cox regression model in survival analysis". Statistics in Medicine. 11 (14–15): 1871–1879. doi:10.1002/sim.4780111409. PMID 1480879.
4. Lambert, Philippe; Collett, Dave; Kimber, Alan; Johnson, Rachel (2004), "Parametric accelerated failure time models with random effects and an application to kidney transplant survival", Statistics in Medicine, 23 (20): 3177–3192, doi:10.1002/sim.1876, PMID 15449337
5. Keiding, N.; Andersen, P. K.; Klein, J. P. (1997). "The Role of Frailty Models and Accelerated Failure Time Models in Describing Heterogeneity Due to Omitted Covariates". Statistics in Medicine. 16 (1–3): 215–224. doi:10.1002/(SICI)1097-0258(19970130)16:2<215::AID-SIM481>3.0.CO;2-J. PMID 9004393.
6. Kay, Richard; Kinnersley, Nelson (2002), "On the use of the accelerated failure time model as an alternative to the proportional hazards model in the treatment of time to event data: A case study in influenza", Drug Information Journal, 36 (3): 571–579, doi:10.1177/009286150203600312
• Bradburn, MJ; Clark, TG; Love, SB; Altman, DG (2003), "Survival Analysis Part II: Multivariate data analysis - an introduction to concepts and methods", British Journal of Cancer, 89 (3): 431–436, doi:10.1038/sj.bjc.6601119, PMID 12888808
• Hougaard, Philip (1999), "Fundamentals of Survival Data", Biometrics, 55 (1): 13–22, doi:10.1111/j.0006-341X.1999.00013.x, PMID 11318147
• Collett, D. (2003), Modelling Survival Data in Medical Research (2nd ed.), CRC Press, ISBN 978-1-58488-325-8
• Cox, David Roxbee; Oakes, D. (1984), Analysis of Survival Data, CRC Press, ISBN 978-0-412-24490-2
• Marubini, Ettore; Valsecchi, Maria Grazia (1995), Analysing Survival Data from Clinical Trials and Observational Studies, Wiley, ISBN 978-0-470-09341-2
• Martinussen, Torben; Scheike, Thomas (2006), Dynamic Regression Models for Survival Data, Springer, ISBN 0-387-20274-9
• Bagdonavicius, Vilijandas; Nikulin, Mikhail (2002), Accelerated Life Models. Modeling and Statistical Analysis, Chapman & Hall/CRC, ISBN 1-58488-186-0