WikiMili The Free Encyclopedia

This article needs additional citations for verification .(October 2013) (Learn how and when to remove this template message) |

A **decision tree** is a decision support tool that uses a tree-like model of decisions and their possible consequences, including chance event outcomes, resource costs, and utility. It is one way to display an algorithm that only contains conditional control statements.

A **decision support system** (**DSS**) is an information system that supports business or organizational decision-making activities. DSSs serve the management, operations and planning levels of an organization and help people make decisions about problems that may be rapidly changing and not easily specified in advance—i.e. unstructured and semi-structured decision problems. Decision support systems can be either fully computerized or human-powered, or a combination of both.

In mathematics, and, more specifically, in graph theory, a **tree** is an undirected graph in which any two vertices are connected by *exactly one* path. Every acyclic connected graph is a tree, and vice versa. A **forest** is a disjoint union of trees, or equivalently an acyclic graph that is not necessarily connected.

A **causal model** is a conceptual model that describes the causal mechanisms of a system. Causal models can improve study designs by providing clear rules for deciding which independent variables need to be included/controlled for.

- Overview
- Decision tree building blocks
- Decision tree elements
- Decision rules
- Decision tree using flowchart symbols
- Analysis example
- Influence diagram
- Association rule induction
- Advantages and disadvantages
- See also
- References
- External links

Decision trees are commonly used in operations research, specifically in decision analysis, to help identify a strategy most likely to reach a goal, but are also a popular tool in machine learning.

**Operations research**, or **operational research (OR)** in British usage, is a discipline that deals with the application of advanced analytical methods to help make better decisions. Further, the term **operational analysis** is used in the British military as an intrinsic part of capability development, management and assurance. In particular, operational analysis forms part of the Combined Operational Effectiveness and Investment Appraisals, which support British defense capability acquisition decision-making.

**Decision analysis** (**DA**) is the discipline comprising the philosophy, methodology, and professional practice necessary to address important decisions in a formal manner. Decision analysis includes many procedures, methods, and tools for identifying, clearly representing, and formally assessing important aspects of a decision, for prescribing a recommended course of action by applying the maximum expected utility action axiom to a well-formed representation of the decision, and for translating the formal representation of a decision and its corresponding recommendation into insight for the decision maker and other stakeholders.

A **goal** is an idea of the future or desired result that a person or a group of people envisions, plans and commits to achieve. People endeavor to reach goals within a finite time by setting deadlines.

A decision tree is a flowchart-like structure in which each internal node represents a "test" on an attribute (e.g. whether a coin flip comes up heads or tails), each branch represents the outcome of the test, and each leaf node represents a class label (decision taken after computing all attributes). The paths from root to leaf represent classification rules.

A **flowchart** is a type of diagram that represents an algorithm, workflow or process. The flowchart shows the steps as boxes of various kinds, and their order by connecting the boxes with arrows. This diagrammatic representation illustrates a solution model to a given problem. Flowcharts are used in analyzing, designing, documenting or managing a process or program in various fields.

In decision analysis, a decision tree and the closely related influence diagram are used as a visual and analytical decision support tool, where the expected values (or expected utility) of competing alternatives are calculated.

An **influence diagram** (**ID**) is a compact graphical and mathematical representation of a decision situation. It is a generalization of a Bayesian network, in which not only probabilistic inference problems but also decision making problems can be modeled and solved.

In probability theory, the **expected value** of a random variable, intuitively, is the long-run average value of repetitions of the **same experiment** it represents. For example, the expected value in rolling a six-sided die is 3.5, because the average of all the numbers that come up is 3.5 as the number of rolls approaches infinity. In other words, the law of large numbers states that the arithmetic mean of the values almost surely converges to the expected value as the number of repetitions approaches infinity. The expected value is also known as the **expectation**, **mathematical expectation**, **EV**, **average**, **mean value**, **mean**, or **first moment**.

A decision tree consists of three types of nodes:^{ [1] }

- Decision nodes – typically represented by squares
- Chance nodes – typically represented by circles
- End nodes – typically represented by triangles

Decision trees are commonly used in operations research and operations management. If, in practice, decisions have to be taken online with no recall under incomplete knowledge, a decision tree should be paralleled by a probability model as a best choice model or online selection model algorithm. Another use of decision trees is as a descriptive means for calculating conditional probabilities.

**Operations management** is an area of management concerned with designing and controlling the process of production and redesigning business operations in the production of goods or services. It involves the responsibility of ensuring that business operations are efficient in terms of using as few resources as needed and effective in terms of meeting customer requirements. Operations management is primarily concerned with planning, organizing and supervising in the contexts of production, manufacturing or the provision of services. It is concerned with managing an entire production system which is the process that converts inputs into outputs, or delivers a product or services. Operations produce products, manage quality and creates service. Operation management covers sectors like banking systems, hospitals, companies, working with suppliers, customers, and using technology. Operations is one of the major functions in an organization along with supply chains, marketing, finance and human resources. The operations function requires management of both the strategic and day-to-day production of goods and services.

**Probability** is the measure of the likelihood that an event will occur. See glossary of probability and statistics. Probability quantifies as a number between 0 and 1, where, loosely speaking, 0 indicates impossibility and 1 indicates certainty. The higher the probability of an event, the more likely it is that the event will occur. A simple example is the tossing of a fair (unbiased) coin. Since the coin is fair, the two outcomes are both equally probable; the probability of "heads" equals the probability of "tails"; and since no other outcomes are possible, the probability of either "heads" or "tails" is 1/2.

In mathematics and computer science, an **algorithm** is an unambiguous specification of how to solve a class of problems. Algorithms can perform calculation, data processing, automated reasoning, and other tasks.

Decision trees, influence diagrams, utility functions, and other decision analysis tools and methods are taught to undergraduate students in schools of business, health economics, and public health, and are examples of operations research or management science methods.

Drawn from left to right, a decision tree has only burst nodes (splitting paths) but no sink nodes (converging paths). Therefore, used manually, they can grow very big and are then often hard to draw fully by hand. Traditionally, decision trees have been created manually — as the aside example shows — although increasingly, specialized software is employed.

The decision tree can be linearized into **decision rules**,^{ [2] } where the outcome is the contents of the leaf node, and the conditions along the path form a conjunction in the if clause. In general, the rules have the form:

*if*condition1*and*condition2*and*condition3*then*outcome.

Decision rules can be generated by constructing association rules with the target variable on the right. They can also denote temporal or causal relations.^{ [3] }

Commonly a decision tree is drawn using flowchart symbols as it is easier for many to read and understand.

Analysis can take into account the decision maker's (e.g., the company's) preference or utility function, for example:

The basic interpretation in this situation is that the company prefers B's risk and payoffs under realistic risk preference coefficients (greater than $400K—in that range of risk aversion, the company would need to model a third strategy, "Neither A nor B").

Another example, commonly used in operations research courses, is the distribution of lifeguards on beaches (a.k.a. the "Life's a Beach" example).^{ [4] } The example describes two beaches with lifeguards to be distributed on each beach. There is maximum budget *B* that can be distributed among the two beaches (in total), and using a marginal returns table, analysts can decide how many lifeguards to allocate to each beach.

Lifeguards on each beach | Drownings prevented in total, beach #1 | Drownings prevented in total, beach #2 |
---|---|---|

1 | 1 | 3 |

2 | 4 | 0 |

In this example, a decision tree can be drawn to illustrate the principles of diminishing returns on beach #2.

The decision tree illustrates that when sequentially distributing lifeguards, placing a first lifeguard on beach #1 would be optimal if there is only the budget for 1 lifeguard. But if there is a budget for two guards, then placing both on beach #2 would prevent more overall drownings.

Much of the information in a decision tree can be represented more compactly as an influence diagram, focusing attention on the issues and relationships between events.

Decision trees can also be seen as generative models of induction rules from empirical data. An optimal decision tree is then defined as a tree that accounts for most of the data, while minimizing the number of levels (or "questions").^{ [5] } Several algorithms to generate such optimal trees have been devised, such as ID3/4/5,^{ [6] } CLS, ASSISTANT, and CART.

Among decision support tools, decision trees (and influence diagrams) have several advantages. Decision trees:

- Are simple to understand and interpret. People are able to understand decision tree models after a brief explanation.
- Have value even with little hard data. Important insights can be generated based on experts describing a situation (its alternatives, probabilities, and costs) and their preferences for outcomes.
- Help determine worst, best and expected values for different scenarios.
- Use a white box model. If a given result is provided by a model.
- Can be combined with other decision techniques.

Disadvantages of decision trees:

- They are unstable, meaning that a small change in the data can lead to a large change in the structure of the optimal decision tree.
- They are often relatively inaccurate. Many other predictors perform better with similar data. This can be remedied by replacing a single decision tree with a random forest of decision trees, but a random forest is not as easy to interpret as a single decision tree.
- For data including categorical variables with different number of levels, information gain in decision trees is biased in favor of those attributes with more levels.
^{ [7] } - Calculations can get very complex, particularly if many values are uncertain and/or if many outcomes are linked.

**Machine learning** (ML) is the scientific study of algorithms and statistical models that computer systems use to effectively perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model of sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task. Machine learning algorithms are used in a wide variety of applications, such as email filtering, and computer vision, where it is infeasible to develop an algorithm of specific instructions for performing the task. Machine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a field of study within machine learning, and focuses on exploratory data analysis through unsupervised learning. In its application across business problems, machine learning is also referred to as predictive analytics.

**Branch and bound** is an algorithm design paradigm for discrete and combinatorial optimization problems, as well as mathematical optimization. A branch-and-bound algorithm consists of a systematic enumeration of candidate solutions by means of state space search: the set of candidate solutions is thought of as forming a rooted tree with the full set at the root. The algorithm explores *branches* of this tree, which represent subsets of the solution set. Before enumerating the candidate solutions of a branch, the branch is checked against upper and lower estimated *bounds* on the optimal solution, and is discarded if it cannot produce a better solution than the best one found so far by the algorithm.

In computing, a **visual programming language** (**VPL**) is any programming language that lets users create programs by manipulating program elements *graphically* rather than by specifying them *textually*. A VPL allows programming with visual expressions, spatial arrangements of text and graphic symbols, used either as elements of syntax or secondary notation. For example, many VPLs are based on the idea of "boxes and arrows", where boxes or other screen objects are treated as entities, connected by arrows, lines or arcs which represent relations.

In computer science, a **binary decision diagram** (**BDD**) or **branching program** is a data structure that is used to represent a Boolean function. On a more abstract level, BDDs can be considered as a compressed representation of sets or relations. Unlike other compressed representations, operations are performed directly on the compressed representation, i.e. without decompression. Other data structures used to represent Boolean functions include negation normal form (NNF), Zhegalkin polynomials, and propositional directed acyclic graphs (PDAG).

In computer science, **Decision tree learning** uses a decision tree to go from observations about an item to conclusions about the item's target value. It is one of the predictive modeling approaches used in statistics, data mining and machine learning. Tree models where the target variable can take a discrete set of values are called **classification trees**; in these tree structures, leaves represent class labels and branches represent conjunctions of features that lead to those class labels. Decision trees where the target variable can take continuous values are called **regression trees**.

In computer programming, **gene expression programming (GEP)** is an evolutionary algorithm that creates computer programs or models. These computer programs are complex tree structures that learn and adapt by changing their sizes, shapes, and composition, much like a living organism. And like living organisms, the computer programs of GEP are also encoded in simple linear chromosomes of fixed length. Thus, GEP is a genotype–phenotype system, benefiting from a simple genome to keep and transmit the genetic information and a complex phenotype to explore the environment and adapt to it.

In machine learning and statistics, **classification** is the problem of identifying to which of a set of categories (sub-populations) a new observation belongs, on the basis of a training set of data containing observations whose category membership is known. Examples are assigning a given email to the "spam" or "non-spam" class, and assigning a diagnosis to a given patient based on observed characteristics of the patient. Classification is an example of pattern recognition.

**Ben Shneiderman** is an American computer scientist, a Distinguished University Professor in the Department of Computer Science, which is part of the University of Maryland College of Computer, Mathematical, and Natural Sciences at the University of Maryland, College Park, and the founding director (1983-2000) of the University of Maryland Human-Computer Interaction Lab. He conducted fundamental research in the field of human–computer interaction, developing new ideas, methods, and tools such as the direct manipulation interface, and his eight rules of design.

In decision tree learning, **ID3** is an algorithm invented by Ross Quinlan used to generate a decision tree from a dataset. ID3 is the precursor to the C4.5 algorithm, and is typically used in the machine learning and natural language processing domains.

**C4.5** is an algorithm used to generate a decision tree developed by Ross Quinlan. C4.5 is an extension of Quinlan's earlier ID3 algorithm. The decision trees generated by C4.5 can be used for classification, and for this reason, C4.5 is often referred to as a statistical classifier. In 2011, authors of the Weka machine learning software described the C4.5 algorithm as "a landmark decision tree program that is probably the machine learning workhorse most widely used in practice to date".

**Waikato Environment for Knowledge Analysis** (**Weka**) is a suite of machine learning software written in Java, developed at the University of Waikato, New Zealand. It is free software licensed under the GNU General Public License.

**Predictive analytics** encompasses a variety of statistical techniques from data mining, predictive modelling, and machine learning, that analyze current and historical facts to make predictions about future or otherwise unknown events.

**Decision intelligence** is an engineering discipline that augments data science with theory from social science, decision theory, and managerial science. Its application provides a framework for best practices in organizational decision-making and processes for applying machine learning at scale.

**Information fuzzy networks (IFN)** is a greedy machine learning algorithm for supervised learning. The data structure produced by the learning algorithm is also called Info Fuzzy Network. IFN construction is quite similar to decision trees' construction. However, IFN constructs a directed graph and not a tree. IFN also uses the conditional mutual information metric in order to choose features during the construction stage while decision trees usually use other metrics like entropy or gini.

**Contrast set learning** is a form of association rule learning that seeks to identify meaningful differences between separate groups by reverse-engineering the key predictors that identify for each particular group. For example, given a set of attributes for a pool of students, a contrast set learner would identify the *contrasting* features between students seeking bachelor's degrees and those working toward PhD degrees.

DEX is a qualitative multi-criteria decision analysis (MCDA) method for decision making and is implemented in DEXi software. This method was developed by a research team led by Bohanec, Bratko, and Rajkovič. The method supports decision makers in making complex decisions based on multiple, possibly conflicting, attributes. In DEX, all attributes are qualitative and can take values represented by words, such as “low” or “excellent”. Attributes are generally organized in a hierarchy. The evaluation of decision alternatives is carried out by utility functions, which are represented in the form of decision rules. All attributes are assumed to be discrete. Additionally, they can be preferentially ordered, so that a higher ordinal value represents a better preference.

*Most of the terms listed in Wikipedia glossaries are already defined and explained within Wikipedia itself. However, glossaries like this one are useful for looking up, comparing and reviewing large numbers of terms together. You can help enhance this page by adding new terms or writing definitions for existing ones.*

- ↑ Kamiński, B.; Jakubczyk, M.; Szufel, P. (2017). "A framework for sensitivity analysis of decision trees".
*Central European Journal of Operations Research*.**26**(1): 135–159. doi:10.1007/s10100-017-0479-6. PMC 5767274 . PMID 29375266. - ↑ Quinlan, J. R. (1987). "Simplifying decision trees".
*International Journal of Man-Machine Studies*.**27**(3): 221–234. CiteSeerX 10.1.1.18.4267 . doi:10.1016/S0020-7373(87)80053-6. - ↑ K. Karimi and H.J. Hamilton (2011), "Generation and Interpretation of Temporal Decision Rules", International Journal of Computer Information Systems and Industrial Management Applications, Volume 3
- ↑ Wagner, Harvey M. (1975-09-01).
*Principles of Operations Research: With Applications to Managerial Decisions*(2nd ed.). Englewood Cliffs, NJ: Prentice Hall. ISBN 9780137095926. - ↑ R. Quinlan, "Learning efficient classification procedures",
*Machine Learning: an artificial intelligence approach*, Michalski, Carbonell & Mitchell (eds.), Morgan Kaufmann, 1983, p. 463-482. doi : 10.1007/978-3-662-12405-5_15 - ↑ Utgoff, P. E. (1989). Incremental induction of decision trees. Machine learning, 4(2), 161-186. doi : 10.1023/A:1022699900025
- ↑ Deng,H.; Runger, G.; Tuv, E. (2011).
*Bias of importance measures for multi-valued attributes and solutions*(PDF). Proceedings of the 21st International Conference on Artificial Neural Networks (ICANN).

Wikimedia Commons has media related to . decision diagrams |

This page is based on this Wikipedia article

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.

Text is available under the CC BY-SA 4.0 license; additional terms may apply.

Images, videos and audio are available under their respective licenses.