Hypertabastic survival models were introduced in 2007 by Mohammad Tabatabai, Zoran Bursac, David Williams, and Karan Singh. The distribution is used to analyze time-to-event data in the biomedical and public health fields, where such analysis is normally called survival analysis. In engineering, time-to-event analysis is referred to as reliability theory, and in business and economics it is called duration analysis; other fields may use different names for the same analysis. These survival models are applicable in many fields, including biomedicine, behavioral science, social science, statistics, medicine, bioinformatics, medical informatics, data science (especially machine learning), computational biology, business economics, engineering, and commercial applications. They consider not only the time to the event, but also whether or not the event occurred. Time-to-event models can be applied to a wide variety of questions: time from cancer diagnosis until death, comparison of individualized treatment with standard care in cancer research, time until an individual defaults on a loan, time until relapse in drug and smoking cessation, time until a property sells after being put on the market, time until an individual upgrades to a new phone, time until job relocation, number of cycles until a bone develops microscopic fractures under different stress levels, time from marriage until divorce, time until infection due to a catheter, and time from bridge completion until first repair. [1] [2] [3] [4] [5]
The Hypertabastic cumulative distribution function, or simply the hypertabastic distribution function, is defined as the probability that the random variable T will take a value less than or equal to t. For t > 0, the hypertabastic distribution function is

F(t) = 1 − sech(α(1 − t^β coth(t^β))/β),

and F(t) = 0 for t ≤ 0, where sech represents the hyperbolic secant function and α, β are parameters.
The parameters α and β are both positive, with sech and coth denoting the hyperbolic secant and hyperbolic cotangent functions respectively. The Hypertabastic probability density function is

f(t) = α t^(β−1) [t^β csch²(t^β) − coth(t^β)] tanh(W(t)) sech(W(t)),

where csch and tanh are the hyperbolic cosecant and hyperbolic tangent respectively and

W(t) = α(1 − t^β coth(t^β))/β.
The Hypertabastic survival function is defined as

S(t) = P(T > t) = sech(α(1 − t^β coth(t^β))/β),

where S(t) is the probability that the waiting time T exceeds t.
For t₀ > 0, the Restricted Expected (mean) Survival Time of the random variable T is denoted by RMST(t₀), and is defined as

RMST(t₀) = ∫_0^{t₀} S(u) du.
For the continuous random variable T representing time to event, the Hypertabastic hazard function h(t) = f(t)/S(t), which represents the instantaneous failure rate at time t given survival up to time t, is defined as

h(t) = α t^(β−1) [t^β csch²(t^β) − coth(t^β)] tanh(α(1 − t^β coth(t^β))/β).
The Hypertabastic hazard function has the flexibility to model a variety of hazard shapes (Spirko, L. (2017). Variable Selection and Supervised Dimension Reduction for Large-Scale Genomic Data with Censored Survival Outcomes. PhD thesis, Temple University.). These different hazard shapes can capture mechanisms for which the hazard functions do not agree with conventional models. The following is a list of possible shapes for the Hypertabastic hazard function:

For 0 < β ≤ 0.25, the Hypertabastic hazard function is monotonically decreasing, indicating a higher likelihood of failure at early times.
For 0.25 < β < 1, the Hypertabastic hazard curve first increases with time until it reaches its maximum failure rate and thereafter the failure rate decreases with time (unimodal).
For β = 1, the Hypertabastic hazard function initially increases with time, then approaches the horizontal asymptote h(t) = α.
For 1 < β < 2, the Hypertabastic hazard function first increases with time with an upward concavity until it reaches its inflection point and subsequently continues to increase with a downward concavity.
For β = 2, the Hypertabastic hazard function initially increases with an upward concavity until it reaches its point of inflection, thereafter approaching a linear asymptote with slope α.
For β > 2, the Hypertabastic hazard function increases with an upward concavity.
The Hypertabastic cumulative hazard function is

H(t) = −ln S(t) = ln cosh(α(1 − t^β coth(t^β))/β).
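The definitions above can be checked numerically. The following is an illustrative sketch (the function names and the trapezoid-rule integration for the restricted mean are our own; this is not a packaged implementation):

```python
import math

def hypt_W(t, alpha, beta):
    # Common argument W(t) = alpha*(1 - t^beta * coth(t^beta)) / beta
    u = t ** beta
    return alpha * (1.0 - u / math.tanh(u)) / beta

def hypt_sf(t, alpha, beta):
    # Survival function S(t) = sech(W(t))
    return 1.0 / math.cosh(hypt_W(t, alpha, beta))

def hypt_cdf(t, alpha, beta):
    # Distribution function F(t) = 1 - sech(W(t))
    return 1.0 - hypt_sf(t, alpha, beta)

def hypt_hazard(t, alpha, beta):
    # h(t) = alpha*t^(beta-1)*[t^beta*csch^2(t^beta) - coth(t^beta)]*tanh(W(t))
    u = t ** beta
    dW = alpha * (t ** (2.0 * beta - 1.0) / math.sinh(u) ** 2
                  - t ** (beta - 1.0) / math.tanh(u))
    return dW * math.tanh(hypt_W(t, alpha, beta))

def hypt_pdf(t, alpha, beta):
    # f(t) = h(t) * S(t)
    return hypt_hazard(t, alpha, beta) * hypt_sf(t, alpha, beta)

def hypt_cum_hazard(t, alpha, beta):
    # H(t) = -ln S(t) = ln cosh(W(t))
    return math.log(math.cosh(hypt_W(t, alpha, beta)))

def hypt_rmst(t0, alpha, beta, n=4000):
    # Restricted mean survival time: integral of S(u) over (0, t0], trapezoid rule
    step = t0 / n
    total = 0.5 * (1.0 + hypt_sf(t0, alpha, beta))  # S(0+) = 1
    total += sum(hypt_sf(i * step, alpha, beta) for i in range(1, n))
    return total * step
```

As a sanity check, the density matches the numerical derivative of the distribution function, and the hazard is decreasing on a grid for small β and increasing for large β, consistent with the shapes listed above.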
The hazard function of the Hypertabastic proportional hazards model has the form

h(t | x) = h₀(t) g(x | θ),

where x = (x₁, …, x_p) is a p-dimensional vector of explanatory variables, θ is a vector of unknown parameters, and h₀(t) is the baseline Hypertabastic hazard function. The combined effect of the explanatory variables is the non-negative function g(x | θ) = e^(−θᵀx), with g(0 | θ) = 1. The Hypertabastic survival function for the proportional hazards model is defined as:

S(t | x) = [S₀(t)]^{g(x | θ)},

where S₀(t) is the baseline Hypertabastic survival function, and the Hypertabastic probability density function for the proportional hazards model is given by

f(t | x) = g(x | θ) f₀(t) [S₀(t)]^{g(x | θ) − 1}.
Depending on the type of censoring, the maximum likelihood technique along with an appropriate log-likelihood function may be used to estimate the model parameters. If the sample consists of right censored data and the model is the Hypertabastic proportional hazards model, then the proportional hazards log-likelihood function is

ln L(α, β, θ) = Σ_{i=1}^{n} [ δ_i ln h(t_i | x_i) + ln S(t_i | x_i) ],

where δ_i = 1 if the event was observed for individual i and δ_i = 0 if the observation was right censored.
When the covariates act multiplicatively on the time scale, the model is called an accelerated failure time model. The Hypertabastic survival function for the accelerated failure time model is given by

S(t | x) = S₀(t g(x | θ)),

where S₀ is the baseline Hypertabastic survival function and g(x | θ) = e^(−θᵀx). The Hypertabastic accelerated failure time model has a hazard function of the form

h(t | x) = g(x | θ) h₀(t g(x | θ)).

The Hypertabastic probability density function for the accelerated failure time model is

f(t | x) = g(x | θ) f₀(t g(x | θ)).
For right censored data, the log-likelihood function for the Hypertabastic accelerated failure time model is given by

ln L(α, β, θ) = Σ_{i=1}^{n} [ δ_i ln f(t_i | x_i) + (1 − δ_i) ln S(t_i | x_i) ],

where δ_i = 1 if the event was observed for individual i and δ_i = 0 if the observation was right censored.
A modified chi-squared type test, known as the Nikulin-Rao-Robson statistic, is used to test the goodness of fit of Hypertabastic accelerated failure time models and to compare them with models having unimodal hazard rate functions. Simulation studies have shown that the Hypertabastic distribution can be used as an alternative to the log-logistic and log-normal distributions because of the flexible shapes of its hazard function. The Hypertabastic distribution is also a competitor for statistical modeling when compared with the Birnbaum-Saunders and inverse Gaussian distributions. [2] [6]
Consider a sample of survival times t₁, t₂, …, t_n of n individuals with associated p-dimensional covariate vectors x₁, …, x_n and an unknown parameter vector θ. Let f(t | x, θ), F(t | x, θ), S(t | x, θ) and h(t | x, θ) stand for the corresponding probability density function, cumulative distribution function, survival function and hazard function respectively. In the absence of censoring (censoring normally occurs when the failure time of some individuals cannot be observed), the likelihood function is

L(θ) = ∏_{i=1}^{n} f(t_i | x_i, θ)

and the log-likelihood is

ln L(θ) = Σ_{i=1}^{n} ln f(t_i | x_i, θ).
For right censored data, the likelihood function is

L(θ) = ∏_{i=1}^{n} [f(t_i | x_i, θ)]^{δ_i} [S(t_i | x_i, θ)]^{1−δ_i},

or equivalently,

L(θ) = ∏_{i=1}^{n} [h(t_i | x_i, θ)]^{δ_i} S(t_i | x_i, θ),

and the log-likelihood function is

ln L(θ) = Σ_{i=1}^{n} [ δ_i ln f(t_i | x_i, θ) + (1 − δ_i) ln S(t_i | x_i, θ) ],

or equivalently,

ln L(θ) = Σ_{i=1}^{n} [ δ_i ln h(t_i | x_i, θ) + ln S(t_i | x_i, θ) ],

where δ_i = 1 if the event was observed for individual i and δ_i = 0 if the observation was right censored.
In the presence of left censored data, the likelihood function is

L(θ) = ∏_{i=1}^{n} [f(t_i | x_i, θ)]^{δ_i} [F(t_i | x_i, θ)]^{1−δ_i},

and the corresponding log-likelihood function is

ln L(θ) = Σ_{i=1}^{n} [ δ_i ln f(t_i | x_i, θ) + (1 − δ_i) ln F(t_i | x_i, θ) ],

where δ_i = 1 if the event time was observed exactly and δ_i = 0 if the observation was left censored.
In the presence of interval censored data, the likelihood function is

L(θ) = ∏_{i=1}^{n} [f(t_i | x_i, θ)]^{δ_i} [F(r_i | x_i, θ) − F(l_i | x_i, θ)]^{1−δ_i},

and the log-likelihood function is

ln L(θ) = Σ_{i=1}^{n} [ δ_i ln f(t_i | x_i, θ) + (1 − δ_i) ln(F(r_i | x_i, θ) − F(l_i | x_i, θ)) ],

where l_i < t_i ≤ r_i for all interval censored observations and δ_i = 1 if the event time was observed exactly, δ_i = 0 otherwise.
If the intended sample consists of all types of censored data (right censored, left censored and interval censored), then its likelihood function takes the following form

L(θ) = ∏_{i∈E} f(t_i | x_i, θ) ∏_{i∈R} S(t_i | x_i, θ) ∏_{i∈LC} F(t_i | x_i, θ) ∏_{i∈IC} [F(r_i | x_i, θ) − F(l_i | x_i, θ)],

and its corresponding log-likelihood function is given by

ln L(θ) = Σ_{i∈E} ln f(t_i | x_i, θ) + Σ_{i∈R} ln S(t_i | x_i, θ) + Σ_{i∈LC} ln F(t_i | x_i, θ) + Σ_{i∈IC} ln(F(r_i | x_i, θ) − F(l_i | x_i, θ)),

where E, R, LC and IC denote the index sets of the exactly observed, right censored, left censored and interval censored observations respectively.
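The combined log-likelihood can be sketched generically for any parametric family by passing ln f, ln F and ln S as callables. The observation encoding below is our own illustration, and the unit-rate exponential distribution is used in the example only because its log f, log F and log S have simple closed forms:

```python
import math

def loglik_mixed(observations, log_f, log_F, log_S):
    """Log-likelihood combining exact, right-, left-, and interval-censored data.

    observations: iterable of ("event", t), ("right", t), ("left", t),
    or ("interval", (l, r)) with l < r.
    """
    ll = 0.0
    for kind, value in observations:
        if kind == "event":        # exactly observed: contributes ln f(t)
            ll += log_f(value)
        elif kind == "right":      # event after t: contributes ln S(t)
            ll += log_S(value)
        elif kind == "left":       # event before t: contributes ln F(t)
            ll += log_F(value)
        elif kind == "interval":   # event in (l, r]: contributes ln[F(r) - F(l)]
            l, r = value
            ll += math.log(math.exp(log_F(r)) - math.exp(log_F(l)))
        else:
            raise ValueError("unknown censoring kind: %r" % kind)
    return ll

# Example with a unit-rate exponential distribution
log_f = lambda t: -t
log_S = lambda t: -t
log_F = lambda t: math.log1p(-math.exp(-t))
obs = [("event", 1.0), ("right", 2.0), ("left", 0.5), ("interval", (1.0, 3.0))]
ll = loglik_mixed(obs, log_f, log_F, log_S)
```

The same function applies to the hypertabastic family once its ln f, ln F and ln S are supplied.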
The Hypertabastic accelerated failure time model was used to analyze a total of 27,532 patients to assess the impact of histology on the survival of patients with cutaneous or mucosal melanoma. Understanding patients' histological subtypes and their failure rates would enable clinicians and healthcare providers to individualize treatment, resulting in a lower risk of complications and higher survivability of patients. [7]
The oil quantities at 49 locations in the same area of an oil field were examined to identify their underlying distribution. Using a generalized chi-squared test, the distribution of oil field quantities was represented by the Hypertabastic distribution and compared with the lognormal (LN), log-logistic (LL), Birnbaum-Saunders (BS) and inverse Gaussian (IG) distributions. [8]
Remission times from a clinical trial of acute leukemia in children were used to analyze the remission duration of acute leukemia for two groups of patients, controlling for the log of white blood cell counts; the analysis used the Hypertabastic accelerated failure time model. [8]
A randomized clinical trial compared two chemotherapy regimens for 447 individuals with malignant glioma. A total of 293 patients died within the five-year study period, and the median survival time was about 11 months. The overall model fit, in comparison with other parametric distributions, was assessed using the generalized chi-square test statistic and the proportional hazards model. [8]
The Hypertabastic proportional hazards model was used to analyze numerous breast cancer datasets, including a study of the survival of breast cancer patients that explored the role of a metastasis variable in combination with clinical and gene expression variables. [9] [10]
One hundred five Nigerian patients who were diagnosed with hypertension between January 2013 and July 2018 were included in the study, with death as the event of interest. Six parametric models (exponential, Weibull, lognormal, log-logistic, Gompertz and Hypertabastic) were fitted to the data; the parametric models were considered because they are all lifetime distributions. The standard error (S.E.), Akaike information criterion (AIC), and Bayesian information criterion (BIC) were used to compare the models and determine the best fit. [11]
Stress fractures in older individuals are of growing importance due to the increasing number of elderly people. Fatigue tests on 23 female bone samples from three individuals were analyzed, and Hypertabastic survival and hazard functions of the normalized stress level and age were developed using previously published bone fatigue stress data. The event of interest was the number of cycles until a bone sample developed a microscopic fracture. Furthermore, Hypertabastic proportional hazards models were used to investigate tensile fatigue and cycle-to-fatigue behavior for cortical bone data. [12]
Hypertabastic survival models have been used in the analysis of unemployment data and compared with the Cox regression model. [13]
Using National Cancer Institute data from 1975 to 2016, the impact of histological subtypes on the survival probability of 134,150 kidney carcinoma patients was examined. The study variables were race/ethnicity, age, sex, tumor grade, type of surgery, geographical location of the patient and stage of disease. The Hypertabastic proportional hazards model was used to analyze the survival times of patients diagnosed with kidney carcinoma, to explore the effect of histological subtypes on their survival probability, and to assess the relationship between the histological subtypes, tumor stage, tumor grade, and type of surgery. [14]
Sample code in SAS:
proc nlp data=sasuser.KidneyCarcinoma tech=quanew cov=2 vardef=n pcov phes maxiter=250;
   /* Hypertabastic Proportional Hazards Model with Log Time */
   title1 'Kidney Carcinoma';
   max logf;
   /* Model Parameter Initial Values for Explanatory Variables */
   parms a=0.01, b=0.1,
      c=0.01,    /* Age */                               /* continuous */
      d=-.01,    /* Male */                              /* reference: Female */
      r1=.071,   /* Hispanic */
      r2=.044,   /* Asian */
      r3=.134,   /* Black */                             /* reference: White */
      h1=.205,   /* Adeno Carcinoma w/ Mixed Subtypes */
      h2=.505,   /* Papillary Adeno Carcinoma NOS */
      h3=.537,   /* Clear Cell Adeno Carcinoma */
      h4=.316,   /* Renal Cell Adeno Carcinoma */
      h5=1.15,   /* Chromophobe Renal Cell Carcinoma */
      h6=-.21,   /* Sarcomatoid Renal Cell Carcinoma */
      h7=.378,   /* Granular Cell Carcinoma */           /* reference: Other */
      g1=.03,    /* East */
      g2=.088,   /* Northern Plains */
      g3=.06,    /* Pacific Coast */                     /* reference: Southwest */
      s1=1.2,    /* Localized */
      s2=-1.3,   /* Distant */                           /* reference: Regional */
      gr1=1.169, /* Well Differentiated */
      gr2=.99,   /* Moderately Differentiated */
      gr3=.413,  /* Poorly Differentiated */             /* reference: Undifferentiated */
      su1=-.945, /* No Surgery */
      su2=.84,   /* Cryocergery */
      su3=.56,   /* Thermal Ablation */
      su4=.574,  /* Cryosurgery */
      su5=1.173, /* Partial Nephrectomy or Partial Ureterectomy */
      su6=.25,   /* Complete Nephrectomy */
      su7=.073,  /* Radical Nephrectomy */
      su8=-.096, /* Any Nephrectomy */
      su9=.028;  /* Nephrectomy, Ureterectomy */         /* reference: Other */
   /* Log-Likelihood Function */
   /* Combined covariate effect g(x) = exp(-theta'x) */
   in6 = exp(-(c*Age + d*Gender
      + r1*Race1 + r2*Race2 + r3*Race3
      + h1*Hist1 + h2*Hist2 + h3*Hist3 + h4*Hist4 + h5*Hist5 + h6*Hist6 + h7*Hist7
      + g1*Geo1 + g2*Geo2 + g3*Geo3
      + s1*Stage1 + s2*Stage2
      + gr1*Grade1 + gr2*Grade2 + gr3*Grade3
      + su1*Surgery1 + su2*Surgery2 + su3*Surgery3 + su4*Surgery4 + su5*Surgery5
      + su6*Surgery6 + su7*Surgery7 + su8*Surgery8 + su9*Surgery9));
   /* log S(t|x) + Status * log h(t|x) */
   s = log(1/cosh(a*(1 - (time**b)/tanh(time**b))/b))*in6
      + Status*log(((a*time**(-1+2*b)/sinh(time**b)**2
      - a*time**(-1+b)/tanh(time**b))*tanh(a*(1 - time**b/tanh(time**b))/b))*in6);
   logf = s;
run;
Although survival analysis tools and techniques have been widely used in medical and biomedical applications over the last few decades, their applications to engineering problems have been more sporadic and limited. The probabilistic assessment of the service life of a wide variety of engineering systems, from small mechanical components to large bridge structures, [15] can substantially benefit from well-established survival analysis techniques. Modeling of time-to-event phenomena in engineering applications can be performed under the influence of numerical and categorical covariates using observational or test data. The "survival" of an engineering component or system is synonymous with the more commonly used term "reliability". The term "hazard rate" or "conditional failure rate" (defined as the probability of failure per unit time, given survival up to that time) is an important measure of the change in the rate of failure over time. In this context, failure is defined as reaching the target event in the time-to-event process, which could be reaching a particular serviceability condition state, localized/partial structural failure, or global/catastrophic failure. The authors of [16] applied the Hypertabastic parametric accelerated failure time survival model to develop probabilistic models of bridge deck service life for Wisconsin. Bridge decks are typically concrete slabs on which traffic rides, as seen in the Marquette Interchange bridge. The authors used the National Bridge Inventory (NBI) dataset to obtain the data needed for their study. NBI records include discrete numerical ratings for bridge decks (and other bridge components) as well as other basic information such as Average Daily Traffic (ADT) and deck surface area (obtained by multiplying the recorded bridge length by the bridge deck width). The numerical ratings range from 0 to 9, with 9 corresponding to brand new condition and 0 to complete failure.
A deck condition rating of 5 was selected as the effective end of service life of bridge deck. The numerical covariates used were the ADT and deck surface area, while the categorical covariate was the superstructure material (structural steel or concrete).
The Hypertabastic proportional hazards and accelerated failure time models are useful techniques for analyzing bridge-related structures because of the flexibility of their hazard curves, which can be monotonically increasing or decreasing with upward or downward concavity, or can take the shape of a single-mound (unimodal) curve. [16] [1] [17] This flexibility in modeling various hazard shapes makes the model suitable for a wide variety of engineering problems. [16]
Tabatabai et al. extended the Hypertabastic bridge deck models developed for Wisconsin bridges to bridges in six northern US states (Nabizadeh, A. (2015). Reliability of Bridge Superstructures in Wisconsin. Master's thesis, UWM Digital Commons.) and then to all 50 US states. [18] The study of bridge decks in all 50 states indicated important differences in the reliability of bridge decks in different states and regions. Stevens et al. [19] discuss the importance of survival analyses in identifying key bridge performance indicators and the use of Hypertabastic survival models for bridges. The authors of [20] and Nabizadeh et al. [21] further extended the use of Hypertabastic survival models to bridge superstructures. The covariates used were ADT, maximum bridge span length and superstructure type. The survival function can be used to determine the expected life as the area under the entire survival curve:

EL = ∫_0^∞ S(t) dt.
It is important to note that both the survival function and the expected life change as time passes. The conditional survival function is a function of time t and survival time t₁ and is defined as [22]

S(t | t₁) = P(T > t₁ + t | T > t₁) = S(t₁ + t) / S(t₁).
Nabizadeh et al. [22] used the Hypertabastic survival functions developed for Wisconsin to analyze conditional survival functions and conditional expected service lives.
The conditional expected life continues to increase as the survival time increases. Nabizadeh et al. term this additional expected life the "survival dividend".
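The expected life and the conditional survival function can be sketched numerically for the hypertabastic survival function. The parameter values, function names and truncation point below are arbitrary illustrations, not values from the bridge studies:

```python
import math

def surv(t, alpha=1.0, beta=1.0):
    # Hypertabastic survival function S(t) = sech(alpha*(1 - t^beta*coth(t^beta))/beta)
    u = t ** beta
    return 1.0 / math.cosh(alpha * (1.0 - u / math.tanh(u)) / beta)

def expected_life(S, upper=50.0, n=50000):
    # Area under the survival curve, truncated at `upper` (trapezoid rule);
    # `upper` must be large enough that S(upper) is negligible.
    step = upper / n
    total = 0.5 * (1.0 + S(upper))  # S(0+) = 1
    total += sum(S(i * step) for i in range(1, n))
    return total * step

def conditional_surv(S, t, t1):
    # P(T > t1 + t | T > t1) = S(t1 + t) / S(t1)
    return S(t1 + t) / S(t1)
```

The "survival dividend" corresponds to the fact that the conditional expected life, t₁ plus the area under the conditional survival curve, grows as the survival time t₁ increases.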
An important mode of failure in bridge engineering is metal fatigue, which can result from repeated applications of stress cycles to various details and connections in the structure. As the number of cycles increases, the probability of fatigue failure increases. An important factor in fatigue life is the stress range Sr (the maximum minus the minimum stress in a cycle). The probabilistic engineering fatigue problem can be treated as a "time"-to-event survival analysis problem if the number of cycles is treated as a fictitious time variable. [23]
This would facilitate the application of well-established survival analysis techniques to engineering fatigue problems ([23] and Tabatabai et al. [24]). With the number of cycles n treated as the time variable and N denoting the random number of cycles to failure, the survival function S(n), probability density function f(n), hazard rate h(n), and cumulative probability of failure F(n) can then be defined as

S(n) = P(N > n),  F(n) = 1 − S(n),  f(n) = dF(n)/dn,  h(n) = f(n)/S(n).
The Hypertabastic accelerated failure time model was used to analyze the probabilistic fatigue life for various detail categories in steel bridges. [23]
The hyperbolastic functions, also known as hyperbolastic growth models, are mathematical functions used in medical statistical modeling. These models were originally developed to capture the growth dynamics of multicellular tumor spheres, and were introduced in 2005 by Mohammad Tabatabai, David Williams, and Zoran Bursac. The precision of hyperbolastic functions in modeling real-world problems is due in part to the flexibility of their point of inflection. These functions can be used in a wide variety of modeling problems such as tumor growth, stem cell proliferation, pharmacokinetics, cancer growth, sigmoid activation functions in neural networks, and epidemiological disease progression or regression.