Recursive partitioning

A recursive partitioning tree showing survival of passengers on the Titanic ("sibsp" is the number of spouses or siblings aboard). The figures under the leaves show the probability of survival and the percentage of observations in the leaf. Summarizing: Your chances of survival were good if you were (i) a female or (ii) a young boy without several family members.

Recursive partitioning is a statistical method for multivariable analysis. [1] It creates a decision tree that strives to correctly classify members of a population by splitting it into sub-populations based on several dichotomous independent variables. The process is termed recursive because each sub-population may in turn be split, an indefinite number of times, until a stopping criterion is reached.
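The splitting process described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: it assumes Gini impurity as the split criterion (one of several in use), and the names `gini`, `build_tree`, `classify` and `min_size`, as well as the toy passenger data, are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def build_tree(rows, labels, features, min_size=2):
    """Recursively split on the dichotomous feature with the lowest weighted impurity.

    rows     -- list of dicts mapping feature name -> bool
    labels   -- list of class labels, one per row
    features -- feature names still available for splitting
    min_size -- stopping criterion: nodes smaller than this are not split
    """
    majority = Counter(labels).most_common(1)[0][0]
    # Stopping criteria: pure node, too few observations, or no features left.
    if gini(labels) == 0.0 or len(labels) < min_size or not features:
        return {"leaf": majority}

    best = None  # (weighted impurity, feature name)
    for f in features:
        yes = [y for r, y in zip(rows, labels) if r[f]]
        no = [y for r, y in zip(rows, labels) if not r[f]]
        if not yes or not no:
            continue  # this feature does not separate the node
        weighted = (len(yes) * gini(yes) + len(no) * gini(no)) / len(labels)
        if best is None or weighted < best[0]:
            best = (weighted, f)
    if best is None:
        return {"leaf": majority}

    f = best[1]
    remaining = [g for g in features if g != f]
    yes_idx = [i for i, r in enumerate(rows) if r[f]]
    no_idx = [i for i, r in enumerate(rows) if not r[f]]
    # The recursion: each sub-population is partitioned in the same way.
    return {
        "feature": f,
        "yes": build_tree([rows[i] for i in yes_idx],
                          [labels[i] for i in yes_idx], remaining, min_size),
        "no": build_tree([rows[i] for i in no_idx],
                         [labels[i] for i in no_idx], remaining, min_size),
    }


def classify(tree, row):
    """Follow the yes/no branches until a leaf is reached."""
    while "leaf" not in tree:
        tree = tree["yes"] if row[tree["feature"]] else tree["no"]
    return tree["leaf"]


# Hypothetical toy data, invented for illustration only.
rows = [
    {"female": True, "child": False},
    {"female": True, "child": True},
    {"female": False, "child": True},
    {"female": False, "child": False},
    {"female": False, "child": False},
    {"female": True, "child": False},
]
labels = ["survived", "survived", "survived", "died", "died", "survived"]
tree = build_tree(rows, labels, ["female", "child"])
```

On this toy data the first split is on `female`, and only the male sub-population is split again (on `child`); the female branch is already pure, so the recursion stops there.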


Recursive partitioning methods have been developed since the 1980s. Well-known methods include Ross Quinlan's ID3 algorithm and its successors, C4.5 and C5.0, and Classification and Regression Trees (CART). A common criticism of these methods is their vulnerability to overfitting the data. Ensemble learning methods such as random forests help to overcome this by fitting many trees, each to a different random resample of the data, and combining their outputs, for example by majority vote.
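The ensemble idea can be illustrated with a deliberately simplified sketch. It assumes one-split trees ("stumps") as the base learner, bootstrap resampling, one randomly chosen feature per tree, and a majority vote; real random forests grow full trees and consider a random subset of features at every split. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter


def majority(labels):
    """Most common label in a non-empty list."""
    return Counter(labels).most_common(1)[0][0]


def fit_stump(rows, labels, feature):
    """One-split tree: predict the majority label on each side of a binary feature."""
    overall = majority(labels)
    yes = [y for r, y in zip(rows, labels) if r[feature]]
    no = [y for r, y in zip(rows, labels) if not r[feature]]
    return {
        "feature": feature,
        "yes": majority(yes) if yes else overall,  # fall back if a side is empty
        "no": majority(no) if no else overall,
    }


def fit_forest(rows, labels, features, n_trees=25, seed=0):
    """Fit each stump on a bootstrap sample, using one randomly chosen feature."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        feature = rng.choice(features)              # random feature selection
        forest.append(fit_stump([rows[i] for i in idx],
                                [labels[i] for i in idx], feature))
    return forest


def predict(forest, row):
    """Combine the individual trees by majority vote."""
    votes = [t["yes"] if row[t["feature"]] else t["no"] for t in forest]
    return majority(votes)


# Hypothetical toy data: findings x and y both track the outcome.
rows = [{"x": True, "y": True}] * 4 + [{"x": False, "y": False}] * 4
labels = ["disease"] * 4 + ["no disease"] * 4
forest = fit_forest(rows, labels, ["x", "y"])
```

Because each tree sees a different resample, an unlucky split learned from one sample is outvoted by the others, which is the intuition behind the reduced overfitting.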

This article focuses on recursive partitioning for medical diagnostic tests, but the technique has far wider applications. See decision tree.

Compared to regression analysis, which creates a formula that health care providers can use to calculate the probability that a patient has a disease, recursive partitioning creates a rule such as 'If a patient has finding x, y, or z, they probably have disease q'.
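The contrast between the two kinds of output can be made concrete with a small sketch. The logistic form is a standard choice for a regression formula that yields a probability, but the intercept and weights below are made-up numbers for illustration only, and both function names are hypothetical.

```python
import math


def regression_probability(x, y, z):
    """Regression-style output: a formula the clinician must evaluate.

    x, y, z are findings coded 0/1 (absent/present). The intercept and
    weights are invented for illustration, not fitted to any data.
    """
    score = -3.0 + 1.2 * x + 0.8 * y + 2.0 * z
    return 1.0 / (1.0 + math.exp(-score))  # logistic function


def partition_rule(x, y, z):
    """Recursive-partitioning-style output: a rule made of yes/no questions."""
    if x or y or z:
        return "probably has disease q"
    return "probably does not have disease q"


print(round(regression_probability(1, 1, 1), 3))
print(partition_rule(False, True, False))
```

The rule needs no arithmetic at the bedside, whereas the formula returns a graded probability rather than a yes/no answer.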

A variation is 'Cox linear recursive partitioning'. [2]

Advantages and disadvantages

Compared to other multivariable methods, recursive partitioning has advantages and disadvantages.

Examples

Recursive partitioning has been used in research on diagnostic tests. [6] [7] [8] [9] [10] [11] For example, Goldman used recursive partitioning to prioritize sensitivity in the diagnosis of myocardial infarction among patients with chest pain in the emergency room. [11]
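One way a tree can be made to favor sensitivity is to treat a missed case as more costly than a false alarm when labelling each leaf. The sketch below assumes that simple cost-weighted labelling; it is not the procedure used in the cited studies, and the function names, cost values and leaf counts are all hypothetical.

```python
def leaf_label(n_diseased, n_healthy,
               cost_false_negative=1.0, cost_false_positive=1.0):
    """Label a leaf with the class that gives the lower total misclassification cost."""
    cost_if_negative = cost_false_negative * n_diseased  # every diseased patient missed
    cost_if_positive = cost_false_positive * n_healthy   # every healthy patient flagged
    return "positive" if cost_if_positive <= cost_if_negative else "negative"


def sensitivity(leaves, cost_false_negative=1.0):
    """Share of diseased patients who land in leaves labelled positive.

    leaves -- list of (n_diseased, n_healthy) counts, one pair per leaf
    """
    total = sum(d for d, _ in leaves)
    caught = sum(d for d, h in leaves
                 if leaf_label(d, h, cost_false_negative) == "positive")
    return caught / total


# Hypothetical leaf counts (diseased, healthy); not taken from any study.
leaves = [(40, 10), (8, 30), (2, 60)]
print(sensitivity(leaves))                         # equal costs
print(sensitivity(leaves, cost_false_negative=5))  # a missed case costs 5x more
```

With equal costs only the first leaf is labelled positive; raising the cost of a false negative also flips the second leaf, so more diseased patients are caught at the price of more false positives.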

References

  1. Breiman, Leo (1984). Classification and Regression Trees. Boca Raton: Chapman & Hall/CRC. ISBN 978-0-412-04841-8.
  2. Cook EF, Goldman L (1984). "Empiric comparison of multivariate analytic techniques: advantages and disadvantages of recursive partitioning analysis". Journal of Chronic Diseases. 37 (9–10): 721–31. doi:10.1016/0021-9681(84)90041-9. PMID 6501544.
  3. James KE, White RF, Kraemer HC (2005). "Repeated split sample validation to assess logistic regression and recursive partitioning: an application to the prediction of cognitive impairment". Statistics in Medicine. 24 (19): 3019–35. doi:10.1002/sim.2154. PMID 16149128.
  4. Kattan MW, Hess KR, Beck JR (1998). "Experiments to determine whether recursive partitioning (CART) or an artificial neural network overcomes theoretical limitations of Cox proportional hazards regression". Comput. Biomed. Res. 31 (5): 363–73. doi:10.1006/cbmr.1998.1488. PMID 9790741.
  5. Lee JW, Um SH, Lee JB, Mun J, Cho H (2006). "Scoring and staging systems using cox linear regression modeling and recursive partitioning". Methods of Information in Medicine. 45 (1): 37–43. doi:10.1055/s-0038-1634034. PMID 16482368.
  6. Fonarow GC, Adams KF, Abraham WT, Yancy CW, Boscardin WJ (2005). "Risk stratification for in-hospital mortality in acutely decompensated heart failure: classification and regression tree analysis". JAMA. 293 (5): 572–80. doi:10.1001/jama.293.5.572. PMID 15687312.
  7. Stiell IG, Wells GA, Vandemheen KL, et al. (2001). "The Canadian C-spine rule for radiography in alert and stable trauma patients". JAMA. 286 (15): 1841–8. doi:10.1001/jama.286.15.1841. PMID 11597285.
  8. Haydel MJ, Preston CA, Mills TJ, Luber S, Blaudeau E, DeBlieux PM (2000). "Indications for computed tomography in patients with minor head injury". N. Engl. J. Med. 343 (2): 100–5. doi:10.1056/NEJM200007133430204. PMID 10891517.
  9. Edworthy SM, Zatarain E, McShane DJ, Bloch DA (1988). "Analysis of the 1982 ARA lupus criteria data set by recursive partitioning methodology: new insights into the relative merit of individual criteria". J. Rheumatol. 15 (10): 1493–8. PMID 3060613.
  10. Stiell IG, Greenberg GH, Wells GA, et al. (1996). "Prospective validation of a decision rule for the use of radiography in acute knee injuries". JAMA. 275 (8): 611–5. doi:10.1001/jama.275.8.611. PMID 8594242.
  11. Goldman L, Weinberg M, Weisberg M, et al. (1982). "A computer-derived protocol to aid in the diagnosis of emergency room patients with acute chest pain". N. Engl. J. Med. 307 (10): 588–96. doi:10.1056/NEJM198209023071004. PMID 7110205.