Bayesian survival analysis

Survival analysis is usually carried out with parametric, semi-parametric, or non-parametric models to estimate survival rates in clinical research. More recently, Bayesian models [1] have also been used to estimate survival rates, because they can handle design and analysis issues that arise in clinical research. [2]

Survival analysis is a branch of statistics for analyzing the expected duration of time until one or more events happen, such as death in biological organisms and failure in mechanical systems. This topic is called reliability theory or reliability analysis in engineering, duration analysis or duration modelling in economics, and event history analysis in sociology. Survival analysis attempts to answer questions such as: what is the proportion of a population which will survive past a certain time? Of those that survive, at what rate will they die or fail? Can multiple causes of death or failure be taken into account? How do particular circumstances or characteristics increase or decrease the probability of survival?

In statistics, a parametric model or parametric family or finite-dimensional model is a particular class of statistical models. Specifically, a parametric model is a family of probability distributions that has a finite number of parameters.

In statistics, a semiparametric model is a statistical model that has parametric and nonparametric components.

Related Research Articles

Statistical inference is the process of using data analysis to deduce properties of an underlying probability distribution. Inferential statistical analysis infers properties of a population, for example by testing hypotheses and deriving estimates. It is assumed that the observed data set is sampled from a larger population.

Bayesian inference is a method of statistical inference in which Bayes' theorem is used to update the probability for a hypothesis as more evidence or information becomes available. Bayesian inference is an important technique in statistics, and especially in mathematical statistics. Bayesian updating is particularly important in the dynamic analysis of a sequence of data. Bayesian inference has found application in a wide range of activities, including science, engineering, philosophy, medicine, sport, and law. In the philosophy of decision theory, Bayesian inference is closely related to subjective probability, often called "Bayesian probability".
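As a hedged illustration of Bayesian updating in a survival setting, the sketch below assumes a constant hazard rate (an exponential survival model) with a conjugate Gamma prior. Under that assumption the posterior is again a Gamma distribution whose shape grows by the number of observed events and whose rate grows by the total follow-up time, with right-censored subjects contributing follow-up time only. The function name, prior values, and data are hypothetical and are not taken from the cited references.

```python
def update_gamma_exponential(prior_shape, prior_rate, times, observed):
    """Conjugate update for a constant hazard rate.

    times    -- follow-up time of each subject
    observed -- True if the event was seen, False if right-censored
    Returns the posterior Gamma (shape, rate) parameters.
    """
    events = sum(1 for flag in observed if flag)  # number of observed events
    exposure = sum(times)                         # total time at risk
    return prior_shape + events, prior_rate + exposure


# Hypothetical follow-up data: three events, two censored subjects.
times = [2.0, 3.5, 1.0, 4.0, 2.5]
observed = [True, False, True, True, False]

shape, rate = update_gamma_exponential(1.0, 1.0, times, observed)
posterior_mean_hazard = shape / rate  # mean of a Gamma(shape, rate) distribution
print(f"posterior: Gamma({shape}, {rate}), mean hazard {posterior_mean_hazard:.3f}")
```

Calling the function again with the posterior as the new prior and a further batch of data gives the same result as a single update on the combined data, which is the sequential-updating property described above.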

Meta-analysis: statistical method that summarizes data from multiple sources

A meta-analysis is a statistical analysis that combines the results of multiple scientific studies. Meta-analysis can be performed when there are multiple scientific studies addressing the same question, with each individual study reporting measurements that are expected to have some degree of error. The aim then is to use approaches from statistics to derive a pooled estimate closest to the unknown common truth based on how this error is perceived. Existing methods for meta-analysis yield a weighted average from the results of the individual studies, and what differs is the manner in which these weights are allocated and also the manner in which the uncertainty is computed around the point estimate thus generated. In addition to providing an estimate of the unknown common truth, meta-analysis has the capacity to contrast results from different studies and identify patterns among study results, sources of disagreement among those results, or other interesting relationships that may come to light in the context of multiple studies.
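One common way to allocate the weights is inverse-variance weighting under a fixed-effect assumption (all studies estimate the same underlying quantity). The sketch below illustrates that one scheme only; other methods weight differently. The function name and the study values are hypothetical.

```python
import math


def fixed_effect_pool(estimates, std_errors):
    """Inverse-variance weighted average of study estimates.

    Returns the pooled estimate and its standard error, assuming
    independent studies that share one true effect.
    """
    weights = [1.0 / se ** 2 for se in std_errors]  # precise studies count more
    total = sum(weights)
    pooled = sum(w * e for w, e in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1.0 / total)
    return pooled, pooled_se


# Hypothetical log hazard ratios and standard errors from three studies.
pooled, pooled_se = fixed_effect_pool([-0.30, -0.10, -0.25], [0.10, 0.20, 0.15])
print(f"pooled estimate {pooled:.3f} (SE {pooled_se:.3f})")
```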

Nonparametric statistics is the branch of statistics that is not based solely on parametrized families of probability distributions. Nonparametric statistics is based on either being distribution-free or having a specified distribution but with the distribution's parameters unspecified. Nonparametric statistics includes both descriptive statistics and statistical inference.

In survival analysis, the hazard ratio (HR) is the ratio of the hazard rates corresponding to the conditions described by two levels of an explanatory variable. For example, in a drug study, the treated population may die at twice the rate per unit time of the control population. The hazard ratio would then be 2, indicating a higher hazard of death in the treated group. In another study, men receiving the same treatment may suffer a certain complication ten times more frequently per unit time than women, giving a hazard ratio of 10.
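A minimal sketch of the arithmetic, assuming each group has a constant hazard so that its rate can be estimated as events divided by person-time at risk. The function names and counts are hypothetical; in practice hazard ratios are usually estimated from a regression model rather than from raw rates.

```python
def hazard_rate(events, person_time):
    """Events per unit of person-time, under a constant-hazard assumption."""
    if person_time <= 0:
        raise ValueError("person_time must be positive")
    return events / person_time


def hazard_ratio(events_a, time_a, events_b, time_b):
    """Ratio of the hazard rate in group A to that in group B.

    Group B must have at least one event, otherwise the ratio is undefined.
    """
    return hazard_rate(events_a, time_a) / hazard_rate(events_b, time_b)


# Hypothetical drug study: the treated group has twice the event rate.
treated_vs_control = hazard_ratio(20, 100.0, 10, 100.0)
print(f"hazard ratio: {treated_vs_control:.2f}")
```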

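General linear model: statistical linear model

The general linear model or multivariate regression model is a statistical linear model. It may be written as Y = XB + U, where Y is a matrix of multivariate measurements, X is a matrix of observations on the independent variables (often a design matrix), B is a matrix of parameters to be estimated, and U is a matrix of errors.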

Optimal design

In the design of experiments, optimal designs are a class of experimental designs that are optimal with respect to some statistical criterion. The creation of this field of statistics has been credited to Danish statistician Kirstine Smith.

Robust regression: specialized form of regression analysis, in statistics

In robust statistics, robust regression is a form of regression analysis designed to overcome some limitations of traditional parametric and non-parametric methods. Regression analysis seeks to find the relationship between one or more independent variables and a dependent variable. Certain widely used methods of regression, such as ordinary least squares, have favourable properties if their underlying assumptions are true, but can give misleading results if those assumptions are not true; thus ordinary least squares is said to be not robust to violations of its assumptions. Robust regression methods are designed to be not overly affected by violations of assumptions by the underlying data-generating process.

In statistics, resampling is any of a variety of methods for doing one of the following:

  1. Estimating the precision of sample statistics by using subsets of available data (jackknifing) or drawing randomly with replacement from a set of data points (bootstrapping)
  2. Exchanging labels on data points when performing significance tests
  3. Validating models by using random subsets
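The first item in the list above can be sketched as follows: a bootstrap estimate of the standard error of a statistic, obtained by drawing with replacement from the data and recomputing the statistic each time. The function name, default resample count, and data are hypothetical choices for illustration.

```python
import random
import statistics


def bootstrap_se(data, stat=statistics.mean, n_resamples=2000, seed=0):
    """Bootstrap standard error of `stat` over `data`."""
    rng = random.Random(seed)  # seeded so the result is reproducible
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        # Draw n points with replacement from the original sample.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(stat(resample))
    # The spread of the replicates estimates the precision of the statistic.
    return statistics.stdev(replicates)


# Hypothetical survival times in months.
months = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9]
print(f"bootstrap SE of the mean: {bootstrap_se(months):.3f}")
```

Passing `stat=statistics.median` gives the corresponding estimate for the median, for which no simple closed-form standard error exists.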

Predictive analytics encompasses a variety of statistical techniques from data mining, predictive modelling, and machine learning, that analyze current and historical facts to make predictions about future or otherwise unknown events.

Proportional hazards models are a class of survival models in statistics. Survival models relate the time that passes before some event occurs to one or more covariates that may be associated with that quantity of time. In a proportional hazards model, the unique effect of a unit increase in a covariate is multiplicative with respect to the hazard rate. For example, taking a drug may halve one's hazard rate for a stroke occurring, or changing the material from which a manufactured component is constructed may double its hazard rate for failure. Other types of survival models, such as accelerated failure time models, do not exhibit proportional hazards. The accelerated failure time model describes a situation where the biological or mechanical life history of an event is accelerated or decelerated.

In the statistical area of survival analysis, an accelerated failure time model is a parametric model that provides an alternative to the commonly used proportional hazards models. Whereas a proportional hazards model assumes that the effect of a covariate is to multiply the hazard by some constant, an AFT model assumes that the effect of a covariate is to accelerate or decelerate the life course of a disease by some constant. This is especially appealing in a technical context where the 'disease' is a result of some mechanical process with a known sequence of intermediary stages.
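The contrast between the two assumptions can be sketched with a Weibull baseline survival function S0(t) = exp(-(t/scale)^shape), chosen here purely for illustration. A proportional hazards effect raises the baseline survival to the power of the hazard ratio, while an AFT effect rescales the time axis. The function names and parameter values are hypothetical.

```python
import math


def weibull_survival(t, shape, scale):
    """Baseline survival S0(t) for a Weibull distribution."""
    return math.exp(-((t / scale) ** shape))


def ph_survival(t, hazard_ratio, shape, scale):
    """Proportional hazards: hazard multiplied by a constant, so S(t) = S0(t) ** HR."""
    return weibull_survival(t, shape, scale) ** hazard_ratio


def aft_survival(t, acceleration, shape, scale):
    """Accelerated failure time: time runs `acceleration` times faster, so S(t) = S0(c * t)."""
    return weibull_survival(acceleration * t, shape, scale)


shape, scale, t = 1.5, 10.0, 4.0
print(f"baseline        S(4) = {weibull_survival(t, shape, scale):.4f}")
print(f"PH,  HR = 2     S(4) = {ph_survival(t, 2.0, shape, scale):.4f}")
print(f"AFT, c  = 2     S(4) = {aft_survival(t, 2.0, shape, scale):.4f}")
```

With a Weibull baseline the two models describe the same curves under a change of parameter: an acceleration factor c corresponds to a hazard ratio of c ** shape. For most other baselines the two assumptions give different survival curves.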

In statistics, Bayesian vector autoregression (BVAR) uses Bayesian methods to estimate a vector autoregression (VAR). In that respect, the difference with standard VAR models lies in the fact that the model parameters are treated as random variables, and prior probabilities are assigned to them.

Ensemble learning: use of multiple algorithms in statistics and machine learning

In statistics and machine learning, ensemble methods use multiple learning algorithms to obtain better predictive performance than could be obtained from any of the constituent learning algorithms alone. Unlike a statistical ensemble in statistical mechanics, which is usually infinite, a machine learning ensemble consists of only a concrete finite set of alternative models, but typically allows for much more flexible structure to exist among those alternatives.

The survival function is a function that gives the probability that a patient, device, or other object of interest will survive beyond any specified time.
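When no parametric form is assumed, the survival function is often estimated with the Kaplan-Meier product-limit estimator, which multiplies together the fraction of at-risk subjects surviving each observed event time. The sketch below is a minimal standard-library version for right-censored data; the function name and data are hypothetical, and no confidence intervals are computed.

```python
def kaplan_meier(times, observed):
    """Kaplan-Meier estimate of the survival function.

    times    -- follow-up time of each subject
    observed -- True if the event was seen, False if right-censored
    Returns a list of (event_time, estimated_survival) pairs.
    """
    data = sorted(zip(times, observed))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all subjects who share this follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # events and censored subjects both leave the risk set
    return curve


# Hypothetical data: events at times 1, 3 and 4; censoring at times 2 and 5.
for t, s in kaplan_meier([1, 2, 3, 4, 5], [True, False, True, True, False]):
    print(f"S({t}) = {s:.3f}")
```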

Log-Cauchy distribution

In probability theory, a log-Cauchy distribution is a probability distribution of a random variable whose logarithm is distributed in accordance with a Cauchy distribution. If X is a random variable with a Cauchy distribution, then Y = exp(X) has a log-Cauchy distribution; likewise, if Y has a log-Cauchy distribution, then X = log(Y) has a Cauchy distribution.
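The relationship Y = exp(X) can be checked numerically through quantile functions: a quantile of the log-Cauchy distribution is the exponential of the matching Cauchy quantile. The sketch below assumes the usual location and scale parameterization of the Cauchy distribution; the function names are hypothetical.

```python
import math


def cauchy_quantile(p, location=0.0, scale=1.0):
    """Inverse CDF of a Cauchy distribution, for 0 < p < 1."""
    return location + scale * math.tan(math.pi * (p - 0.5))


def log_cauchy_quantile(p, location=0.0, scale=1.0):
    """Inverse CDF of the log-Cauchy distribution: exp of the Cauchy quantile."""
    return math.exp(cauchy_quantile(p, location, scale))


def log_cauchy_cdf(y, location=0.0, scale=1.0):
    """CDF of the log-Cauchy distribution for y > 0."""
    return 0.5 + math.atan((math.log(y) - location) / scale) / math.pi


# The median of the log-Cauchy distribution is exp(location).
print(log_cauchy_quantile(0.5))
# Round trip: the CDF evaluated at the 25% quantile returns roughly 0.25.
print(log_cauchy_cdf(log_cauchy_quantile(0.25)))
```

Quantiles are used here instead of random sampling because the tails are heavy enough that exponentiating a raw Cauchy draw can overflow floating-point arithmetic.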

An adaptive clinical trial is a clinical trial that evaluates a medical device or treatment by observing participant outcomes on a prescribed schedule, and modifying parameters of the trial protocol in accord with those observations. The adaptation process generally continues throughout the trial, as prescribed in the trial protocol. Modifications may include dosage, sample size, drug undergoing trial, patient selection criteria and "cocktail" mix. In some cases, trials have become an ongoing process that regularly adds and drops therapies and patient groups as more information is gained. Importantly, the trial protocol is set before the trial begins; the protocol pre-specifies the adaptation schedule and processes.

Dipak Kumar Dey is an Indian-American statistician best known for his work on Bayesian methodologies. He is currently the Board of Trustees Distinguished Professor in the Department of Statistics at the University of Connecticut. Dey has an international reputation as a statistician as well as a data scientist. Since earning a Ph.D. in statistics from Purdue University in 1980, Dey has contributed to the development of modern statistics, especially in Bayesian analysis, decision science and model selection. Dey has published more than 10 books and edited volumes, and over 260 research articles in peer-refereed national and international journals. In addition, the statistical methodologies that he has developed have found wide application in interdisciplinary and applied fields such as biometry and bioinformatics, genetics, econometrics, environmental science, and social science. Dey has supervised 40 Ph.D. students and presented more than 200 professional talks in colloquia, seminars and conferences all over the world. During his career, Dey has been a visiting professor or scholar at institutions and research centers around the world, including Macquarie University, Pontificia Universidad Católica de Chile, the University of São Paulo, the University of British Columbia, and the Statistical and Applied Mathematical Sciences Institute. Dey is an elected fellow of the American Association for the Advancement of Science, the American Statistical Association, the Institute of Mathematical Statistics, the International Society for Bayesian Analysis and the International Statistical Institute.

References

  1. Ibrahim, Joseph G.; Chen, Ming-Hui; Sinha, Debajyoti (2001). Bayesian Survival Analysis. Springer Series in Statistics. Springer. doi:10.1007/978-1-4757-3447-8. ISBN 978-1-4419-2933-4.
  2. Sinha, Debajyoti; Dey, Dipak K. (September 1997). "Semiparametric Bayesian Analysis of Survival Data". Journal of the American Statistical Association. 92 (439): 1195–1212. doi:10.1080/01621459.1997.10474077.