Relative survival of a disease, in survival analysis, is calculated by dividing the overall survival after diagnosis by the survival observed in a similar population not diagnosed with that disease. A similar population is composed of individuals who match those diagnosed with the disease on at least age and gender.
When describing the survival experience of a group of people or patients, overall survival is typically used; it estimates the proportion of people or patients alive at a certain point in time. The problem with measuring overall survival by the Kaplan-Meier or actuarial methods is that the estimates combine two kinds of death: deaths from the disease of interest and deaths from all other causes, which include old age, other cancers, trauma and any other possible cause of death. In general, survival analysis is interested in deaths from a disease rather than deaths from all causes. A "cause-specific survival analysis" is therefore employed to measure disease-specific survival. There are two ways of performing a cause-specific survival analysis: "competing risks survival analysis" and "relative survival".
Competing risks survival analysis is characterised by its use of death certificates. In traditional overall survival analysis, the cause of death is irrelevant to the analysis. In a competing risks survival analysis, each death certificate is reviewed. If the disease of interest is cancer and the patient dies in a car accident, the patient is labelled as censored at death instead of being labelled as having died. Issues with this method arise because each hospital or registry may code causes of death differently. [citation needed]
For example, there is variability in the way a patient who has cancer and commits suicide is coded. In addition, if a patient has an eye removed because of an ocular cancer and is killed while crossing the road because he did not see an approaching car, he would often be considered censored rather than as having died from the cancer or its subsequent effects.
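The effect of censoring deaths from other causes can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a registry's actual procedure: the status labels, the toy records, and the helper names `to_events` and `kaplan_meier` are all hypothetical and chosen only for the example.

```python
# Hypothetical toy records: (years of follow-up, status at end of follow-up).
RECORDS = [
    (1.0, "dead_cancer"),
    (2.0, "dead_other"),   # e.g. a car accident
    (3.0, "dead_cancer"),
    (4.0, "alive"),
]


def to_events(records, cause_specific):
    """Return (time, died) pairs.

    With cause_specific=True, only cancer deaths count as events and
    deaths from other causes are treated as censored at the time of death.
    With cause_specific=False, every death counts (overall survival).
    """
    events = []
    for time, status in records:
        if cause_specific:
            died = status == "dead_cancer"
        else:
            died = status.startswith("dead")
        events.append((time, died))
    return events


def kaplan_meier(events):
    """Product-limit estimate as a list of (time, survival) steps."""
    at_risk = len(events)
    survival = 1.0
    curve = []
    for t in sorted({time for time, _ in events}):
        deaths = sum(1 for time, died in events if time == t and died)
        leaving = sum(1 for time, _ in events if time == t)
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored subjects leave the risk set
    return curve


overall = kaplan_meier(to_events(RECORDS, cause_specific=False))
cause_specific = kaplan_meier(to_events(RECORDS, cause_specific=True))
print("overall:", overall)
print("cause-specific:", cause_specific)
```

In this toy data the cause-specific estimate ends higher than the overall estimate, because the non-cancer death is removed from the risk set without being counted as an event. If that death had been miscoded, the estimate would change, which is the sensitivity described above.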
The relative survival form of analysis is more complex than "competing risks" but is considered the gold-standard for performing a cause-specific survival analysis. It is based on two rates: the overall hazard rate observed in a diseased population and the background or expected hazard rate in the general or background population.
Deaths from the disease in a single time period are the total number of deaths (overall number of deaths) minus the expected number of deaths in the general population. If 10 deaths per hundred population occur in a population of cancer patients, but only 1 death occurs per hundred general population, the disease-specific number of deaths (excess hazard rate) is 9 deaths per hundred population. The classic equation for the excess hazard rate is as follows:
$$\lambda_{\text{excess}}(t) = \lambda_{\text{observed}}(t) - \lambda_{\text{expected}}(t)$$
The equation does not define a survival proportion but simply describes the relationships between disease-specific death (excess hazard) rates, background mortality rates (expected death rate) and the overall observed mortality rates. The excess hazard rate is related to relative survival, just as hazard rates are related to overall survival.
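The arithmetic above, and the link between excess hazard and relative survival, can be sketched as follows. This is an illustrative example only: the function names are hypothetical, the rates from the text are treated as annual hazards, and the second function additionally assumes that both hazards stay constant over time, which real data rarely satisfy.

```python
import math


def excess_hazard(observed_rate, expected_rate):
    """Excess hazard = overall observed hazard minus expected (background) hazard."""
    return observed_rate - expected_rate


def relative_survival_constant_hazard(observed_rate, expected_rate, years):
    """Relative survival after `years`, ASSUMING constant hazards.

    Under that assumption observed survival is exp(-observed_rate * t) and
    expected survival is exp(-expected_rate * t), so their ratio is
    exp(-excess_hazard * t).
    """
    return math.exp(-excess_hazard(observed_rate, expected_rate) * years)


# Figures from the text: 10 vs 1 deaths per hundred population.
excess = excess_hazard(0.10, 0.01)
rs_5y = relative_survival_constant_hazard(0.10, 0.01, years=5)
print(f"excess hazard: {excess:.2f} per person per period")
print(f"5-period relative survival (constant hazards): {rs_5y:.3f}")
```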
Relative survival is typically used in the analysis of cancer registry data. [1] Cause-specific survival estimation using the coding of death certificates has considerable inaccuracy and inconsistency and does not permit the comparison of rates across registries.
The assignment of cause of death varies between practitioners. How does one code for a patient who dies of heart failure after receiving a chemotherapeutic agent with known deleterious cardiac side-effects? In essence, what really matters is not why the population dies but whether the rate of death is higher than that of the general population.
If all patients are dying in car crashes, perhaps the tumour or treatment predisposes them to visual or perceptual disturbances, which make them more likely to die in a car crash. In addition, it has been shown that patients coded in a large US cancer registry as having died a non-cancer death are 1.37 times as likely to die as a member of the general population. [2]
If the coding were accurate, this figure should approximate 1.0, as the rate of non-cancer deaths in a population of cancer sufferers should approximate that of the general population. Thus, relative survival provides a way to measure the survival associated with the cancer in question without relying on cause-of-death coding.
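The check described above amounts to a ratio of observed to expected deaths. A minimal sketch follows; the counts are invented purely to reproduce the 1.37 figure and are not taken from the cited registry, and `mortality_ratio` is a hypothetical helper name.

```python
def mortality_ratio(observed_deaths, expected_deaths):
    """Ratio of observed non-cancer deaths to the deaths expected
    from general-population rates. A value near 1.0 would suggest
    that cause-of-death coding is consistent with background mortality."""
    if expected_deaths <= 0:
        raise ValueError("expected_deaths must be positive")
    return observed_deaths / expected_deaths


# Hypothetical counts chosen only to illustrate the arithmetic.
ratio = mortality_ratio(observed_deaths=137, expected_deaths=100.0)
print(f"observed / expected = {ratio:.2f}")
```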
In epidemiology, relative survival (as opposed to overall survival, and associated with excess hazard rates) is defined as the ratio of observed survival in a population to the expected or background survival rate. [3] It can be thought of as the Kaplan-Meier survivor function for a particular year divided by the expected survival rate in that year. That ratio is typically known as the relative survival (RS).
If the relative survival values for five consecutive years are multiplied together, the resulting figure is known as the cumulative relative survival (CRS). It is analogous to the five-year overall survival rate, but it describes the cancer-specific risk of death over the five years after diagnosis.
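The calculation of RS and CRS can be sketched as below. The yearly survival proportions are made-up illustrative values, not registry data, and the function names are hypothetical; in practice the expected values would come from population life tables matched on characteristics such as age and gender.

```python
import math

# Hypothetical interval-specific survival proportions for years 1 to 5.
observed = [0.80, 0.85, 0.90, 0.92, 0.94]   # patients (e.g. Kaplan-Meier based)
expected = [0.98, 0.98, 0.97, 0.97, 0.96]   # matched general population


def interval_relative_survival(observed, expected):
    """Relative survival (RS) for each year: observed / expected."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return [o / e for o, e in zip(observed, expected)]


def cumulative_relative_survival(observed, expected):
    """Cumulative relative survival (CRS): product of the yearly RS values."""
    return math.prod(interval_relative_survival(observed, expected))


rs_by_year = interval_relative_survival(observed, expected)
crs_5y = cumulative_relative_survival(observed, expected)
for year, rs in enumerate(rs_by_year, start=1):
    print(f"year {year}: RS = {rs:.3f}")
print(f"5-year CRS = {crs_5y:.3f}")
```

Multiplying the yearly ratios gives the same result as dividing the cumulative observed survival by the cumulative expected survival, which is why CRS can be read as a five-year survival figure with background mortality factored out.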
There are several software suites available to estimate relative survival rates. Regression modelling can be performed using maximum likelihood estimation methods by using Stata or R. [4] [5] For example, the R package cmprsk may be used for competing risk analyses which utilize sub-distribution or 'Fine and Gray' regression methods. [6]