Dynamic decision-making (DDM) is interdependent decision-making that takes place in an environment that changes over time, either because of the decision maker's previous actions or because of events outside the decision maker's control. [1] [2] In this sense, dynamic decisions, unlike simple and conventional one-time decisions, are typically more complex and occur in real time. Research on them involves observing the extent to which people are able to use their experience to control a particular complex system, including the types of experience that lead to better decisions over time. [3]
In psychology, decision-making is regarded as the cognitive process resulting in the selection of a belief or a course of action among several alternative possibilities. Decision-making is the process of identifying and choosing alternatives based on the values, preferences and beliefs of the decision-maker. Every decision-making process produces a final choice, which may or may not prompt action.
A complex system is a system composed of many components which may interact with each other. Examples of complex systems are Earth's global climate, organisms, the human brain, infrastructure such as power grid, transportation or communication systems, social and economic organizations, an ecosystem, a living cell, and ultimately the entire universe.
Dynamic decision making research uses computer simulations which are laboratory analogues for real-life situations. These computer simulations are also called “microworlds” [4] and are used to examine people's behavior in simulated real world settings where people typically try to control a complex system where later decisions are affected by earlier decisions. [5] The following differentiate DDM research from more classical forms of decision making research of the past:
The use of microworlds as a tool to investigate DDM also gives researchers experimental control and makes the DDM field contemporary, in contrast to classical decision-making research, which is much older.
A scientific control is an experiment or observation designed to minimize the effects of variables other than the independent variable. This increases the reliability of the results, often through a comparison between control measurements and the other measurements. Scientific controls are a part of the scientific method.
Examples of dynamic decision-making situations include managing climate change, factory production and inventory, air traffic control, firefighting, driving a car, and military command and control on a battlefield. Research in DDM has focused on investigating the extent to which decision makers use their experience to control a particular system; the factors that underlie the acquisition and use of experience in making decisions; and the type of experiences that lead to better decisions in dynamic tasks.
The primary characteristics of dynamic decision environments are dynamics, complexity, opaqueness, and dynamic complexity. The dynamics of an environment refers to the dependence of the system's state on its state at an earlier time. Dynamics in the system can be driven by positive feedback (self-amplifying loops) or negative feedback (self-correcting loops); examples are, respectively, the accrual of interest in a savings account and the assuaging of hunger by eating.
Positive feedback is a process that occurs in a feedback loop which exacerbates the effects of a small disturbance. That is, the effects of a perturbation on a system include an increase in the magnitude of the perturbation: A produces more of B, which in turn produces more of A. In contrast, a system in which the results of a change act to reduce or counteract it has negative feedback. Both concepts play an important role in science and engineering, including biology, chemistry, and cybernetics.
Negative feedback occurs when some function of the output of a system, process, or mechanism is fed back in a manner that tends to reduce the fluctuations in the output, whether caused by changes in the input or by other disturbances.
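The two kinds of loop can be illustrated with a minimal sketch based on the savings-account and hunger examples. The function names, the 5% interest rate, and the proportional "eating" rule are assumptions chosen for illustration only; they do not come from the DDM literature.

```python
def simulate_savings(balance, rate, periods):
    """Positive feedback: interest is proportional to the balance,
    so each period's growth is larger than the last."""
    history = [balance]
    for _ in range(periods):
        balance += rate * balance  # a bigger balance earns more interest
        history.append(balance)
    return history


def simulate_hunger(hunger, relief, periods):
    """Negative feedback: the amount eaten is proportional to hunger,
    and eating reduces hunger, so the disturbance dies out."""
    history = [hunger]
    for _ in range(periods):
        hunger -= relief * hunger  # more hunger leads to more eating
        history.append(hunger)
    return history


savings = simulate_savings(100.0, 0.05, 3)
hunger = simulate_hunger(10.0, 0.5, 3)
print(savings)
print(hunger)  # [10.0, 5.0, 2.5, 1.25]
```

In the first loop the per-period increments keep growing; in the second they shrink toward zero, which is the self-correcting behavior described above.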
Complexity largely refers to the number of interacting or interconnected elements within a system, which can make it difficult to predict the behavior of the system. The definition remains problematic, however, because systems can vary in the number of components they contain, the number of relationships between those components, and the nature of those relationships. Complexity may also be a function of the decision maker's ability.
Opaqueness refers to the physical invisibility of some aspects of a dynamic system and it might also be dependent upon a decision maker's ability to acquire knowledge of the components of the system.
Dynamic complexity refers to the decision maker's ability to control the system using the feedback the decision maker receives from the system. Diehl and Sterman [6] have further broken down dynamic complexity into three components: unintended side effects, which the opaqueness present in the system might cause; non-linear relationships between components of the system; and feedback delays between actions taken and their outcomes. The dynamic complexity of a system might eventually make it hard for decision makers to understand and control the system.
A microworld is a complex simulation used in controlled experiments designed to study dynamic decision-making. Research in dynamic decision-making is mostly laboratory-based and uses computer simulation microworld tools (i.e., Decision Making Games, DMGames). The microworlds are also known by other names, including synthetic task environments, high fidelity simulations, interactive learning environments, virtual environments, and scaled worlds. Microworlds become the laboratory analogues for real-life situations and help DDM investigators to study decision-making by compressing time and space while maintaining experimental control.
Virtual reality (VR) is a simulated experience that can be similar to or completely different from the real world. Applications of virtual reality can include entertainment and educational purposes. Other, distinct types of VR style technology include augmented reality and mixed reality.
The DMGames compress the most important elements of the real-world problems they represent and are important tools for collecting human actions. DMGames have helped investigate a variety of factors, such as cognitive ability, type of feedback, timing of feedback, strategies used while making decisions, and knowledge acquisition while performing DDM tasks. However, even though DMGames aim to represent the essential elements of real-world systems, they differ from the real-world task in various respects. Stakes might be higher in real-life tasks, and the expertise of the decision maker has often been acquired over a period of many years rather than minutes, hours or days as in DDM tasks. Thus, DDM differs in many respects from naturalistic decision-making (NDM).
In DDM tasks people have been shown to perform below the optimal level of performance, if an optimum can be ascertained or known. For example, in a forest firefighting simulation game, participants frequently allowed their headquarters to be burned down. [7] In similar DDM studies, participants acting as doctors in an emergency room allowed their patients to die while they kept waiting for the results of tests that were actually non-diagnostic. [8] [9] An interesting insight into decisions from experience in DDM is that the learning is mostly implicit: although people's performance improves with repeated trials, they are unable to verbalize the strategy they followed to achieve it. [10]
Learning forms an integral part of DDM research. One of the main research activities in DDM has been to use microworld simulation tools to investigate the extent to which people are able to learn to control a particular simulated system, and to investigate the factors that might explain learning in DDM tasks.
One theory of learning relies on the use of strategies or rules of action that relate to a particular task. These rules specify the conditions under which a certain rule or strategy will apply, and take the form: if you recognize situation S, then carry out action/strategy A. For example, Anzai [11] implemented a set of production rules or strategies which performed the DDM task of steering a ship through a certain set of gates. The Anzai strategies mimicked the performance of human participants on the task reasonably well. Similarly, Lovett and Anderson [12] have shown how people use production rules or strategies of the if-then type in the building-sticks task, which is an isomorph of Luchins' water jug problem. [13] [14] The goal in the building-sticks task is to construct a stick of a particular desired length given three stick lengths from which to build (there is an unlimited supply of sticks of each length). There are basically two strategies to use in trying to solve this problem. The undershoot strategy is to take smaller sticks and build up to the target stick. The overshoot strategy is to take the stick longer than the goal and cut off pieces equal in length to the smaller stick until one reaches the target length. Lovett and Anderson arranged it so that only one strategy would work for a particular problem and gave subjects problems where one of the two strategies worked on a majority of the problems (which strategy was the more successful was counterbalanced across subjects).
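The two building-sticks strategies can be sketched as simple procedures. This is a deliberately reduced, hypothetical version of the task: it uses only one short stick and one long stick, and the function names and stick lengths are assumptions for illustration, not Lovett and Anderson's materials or model.

```python
def undershoot(goal, small):
    """Build up from nothing by adding copies of the small stick.
    Returns the number of sticks used, or None if the goal is missed."""
    current, steps = 0, 0
    while current < goal:
        current += small
        steps += 1
    return steps if current == goal else None


def overshoot(goal, long, small):
    """Start from the long stick and cut off pieces the size of the small stick.
    Returns the number of cuts, or None if the goal is missed."""
    current, steps = long, 0
    while current > goal:
        current -= small
        steps += 1
    return steps if current == goal else None


def working_strategies(goal, long, small):
    """Production-rule flavor: report which strategy solves this problem."""
    results = {"undershoot": undershoot(goal, small),
               "overshoot": overshoot(goal, long, small)}
    return [name for name, steps in results.items() if steps is not None]


print(working_strategies(goal=6, long=10, small=3))  # ['undershoot']
print(working_strategies(goal=7, long=10, small=3))  # ['overshoot']
```

With these stick lengths only one strategy succeeds for each goal, which mirrors the arrangement in which a single strategy works for a given problem.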
Some other researchers have suggested that learning in DDM tasks can be explained by a connectionist theory, or connectionism. In such models, units are linked by connections whose strength, or weight, depends upon previous experience. Thus, the output of a given unit depends upon the output of the previous unit weighted by the strength of the connection. As an example, Gibson et al. [15] have shown that a connectionist neural network machine learning model does a good job of explaining human behavior in Berry and Broadbent's Sugar Production Factory task.
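The idea of weighted connections shaped by experience can be sketched with a single unit. The delta-rule update used here is a standard textbook learning rule chosen as an assumption for illustration; it is not the model of Gibson et al., and the input values, target, and learning rate are arbitrary.

```python
def unit_output(inputs, weights):
    """Output of a unit: upstream outputs weighted by connection strength."""
    return sum(x * w for x, w in zip(inputs, weights))


def update_weights(inputs, weights, target, learning_rate=0.1):
    """One experience: nudge each weight in proportion to the error (delta rule)."""
    error = target - unit_output(inputs, weights)
    return [w + learning_rate * error * x for x, w in zip(inputs, weights)]


inputs = [1.0, 2.0]    # outputs of two upstream units (hypothetical)
weights = [0.0, 0.0]   # no experience yet
for _ in range(10):    # ten repetitions of the same experience
    weights = update_weights(inputs, weights, target=1.0)

print(unit_output(inputs, weights))  # close to the target of 1.0
```

With these particular values the error halves on every repetition, so the unit's output approaches the target gradually as experience accumulates.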
The Instance-Based Learning Theory (IBLT) is a theory of how humans make decisions in dynamic tasks developed by Cleotilde Gonzalez, Christian Lebiere, and Javier Lerch. [3] The theory has been extended to two different paradigms of dynamic tasks, called sampling and repeated-choice, by Cleotilde Gonzalez and Varun Dutt. [16] Gonzalez and Dutt [16] have shown that in these dynamic tasks, IBLT provides the best explanation of human behavior and performs better than many other competing models and approaches. According to IBLT, individuals rely on their accumulated experience to make decisions by retrieving past solutions to similar situations stored in memory. Thus, decision accuracy can only improve gradually and through interaction with similar situations.
IBLT assumes that specific instances, experiences, or exemplars are stored in memory. [17] These instances have a very concrete structure defined by three distinct parts: the situation, the decision, and the utility (or SDU).
In addition to a predefined structure of an instance, IBLT relies on a global, high-level decision-making process consisting of five stages: recognition, judgment, choice, execution, and feedback. [16] When people are faced with a particular situation in an environment, they are likely to retrieve similar instances from memory to make a decision. In atypical situations (those that are not similar to anything encountered in the past), retrieval from memory is not possible and people need to use a heuristic (which does not rely on memory) to make a decision. In situations that are typical and where instances can be retrieved, evaluation of the utility of the similar instances takes place until a necessity level is crossed. [16]
Necessity is typically determined by the decision maker's “aspiration level,” similar to Simon and March's satisficing strategy. But the necessity level might also be determined by external environmental factors like time constraints (as in the medical domain with doctors in an emergency room treating patients in a time critical situation). Once that necessity level is crossed, the decision involving the instance with the highest utility is made. The outcome of the decision, when received, is then used to update the utility of the instance that was used to make the decision in the first place (from expected to experienced). This generic decision making process is assumed to apply to any dynamic decision making situation, when decisions are made from experience.
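The process described above can be sketched in a few lines. This is a loose illustration only: the cue-matching similarity measure, the similarity threshold, and the use of a fixed count of evaluated instances as the necessity level are all assumptions, and the sketch does not implement the ACT-R mechanisms on which computational IBLT models rely.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    situation: tuple   # cues describing the environment
    decision: str      # action taken
    utility: float     # expected, later experienced, outcome


def similarity(a, b):
    """Hypothetical similarity: fraction of cue positions that match."""
    if not a and not b:
        return 1.0
    return sum(x == y for x, y in zip(a, b)) / max(len(a), len(b))


def decide(memory, situation, heuristic, threshold=0.5, necessity=3):
    """Recognition, judgment, and choice. Returns (decision, instance used or None)."""
    similar = [i for i in memory if similarity(i.situation, situation) >= threshold]
    if not similar:                      # atypical situation: fall back on a heuristic
        return heuristic(situation), None
    considered = similar[:necessity]     # stop evaluating once necessity is reached
    best = max(considered, key=lambda i: i.utility)
    return best.decision, best


def feedback(memory, situation, decision, used, outcome):
    """Feedback: replace expected utility with experienced utility, or store a new instance."""
    if used is None:
        memory.append(Instance(situation, decision, outcome))
    else:
        used.utility = outcome


memory = []
choice, used = decide(memory, ("fire", "near"), heuristic=lambda s: "wait")
feedback(memory, ("fire", "near"), choice, used, outcome=-1.0)
print(choice, len(memory))  # wait 1
```

The first call finds nothing in memory and falls back on the heuristic; once the outcome is stored, later similar situations can be decided by retrieval instead.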
The computational representation of IBLT relies on several learning mechanisms proposed by a generic theory of cognition, ACT-R. Currently, many decision tasks have been implemented in IBLT that reproduce and explain human behavior accurately. [18] [19]
Although feedback interventions have been found to benefit performance on DDM tasks, outcome feedback has been shown to work for tasks that are simple, require lower cognitive abilities, and that are repeatedly practiced. [20] For example, IBLT suggests that in DDM situations, learning from only outcome feedback is slow and generally ineffective. [21]
The presence of feedback delays in DDM tasks, and participants' misperception of them, contributes to less than optimal performance on DDM tasks. [22] Such delays make it harder for people to understand the relationships that govern the system dynamics of the task, because of the lag between the actions of the decision makers and the outcome from the dynamic system.
A familiar example of the effect of feedback delays is the Beer Distribution Game (or Beer Game). A time delay is built into the game between a role placing an order and receiving the ordered cases of beer. If a role runs out of beer (i.e., is unable to satisfy a customer's current demand for beer cases), there is a fine of $1 per case. This might lead people to overstock beer to satisfy any future unanticipated demand. Contrary to economic theory, which predicts a long-term stable equilibrium, results show people ordering too much. This happens because the time delay between placing an order and receiving inventory makes people think that the inventory is running out as new orders come in, so they react and place larger orders. Once they have built up the inventory and taken stock of the incoming orders, they drastically cut future orders, which leads the beer industry to experience oscillating patterns of over-ordering and under-ordering, that is, costly cycles of boom and bust.
Similar effects of feedback delay have been reported in a firefighting game called NEWFIRE, where task complexity and the delay between the firefighters' actions and their outcomes led participants to frequently allow their headquarters to be burned down.
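The effect of a delivery delay can be sketched with a single ordering role. This is a hypothetical simplification, not the four-role Beer Game: demand is constant, there is no fine, and the two ordering rules (one that ignores cases already in transit and one that counts them) are assumptions chosen to show one possible mechanism behind over-ordering.

```python
from collections import deque


def simulate_orders(weeks, delay, target, demand, count_pipeline):
    """Single role ordering toward a target inventory; requires delay >= 1."""
    inventory = target
    in_transit = deque([0] * delay)        # leftmost entry arrives next
    orders, stock = [], []
    for _ in range(weeks):
        inventory += in_transit.popleft()  # receive the order placed `delay` weeks ago
        inventory -= demand                # serve customers (negative means backlog)
        gap = target - inventory
        if count_pipeline:
            gap -= sum(in_transit)         # subtract cases already on the way
        order = max(0, gap)
        in_transit.append(order)
        orders.append(order)
        stock.append(inventory)
    return orders, stock


naive_orders, naive_stock = simulate_orders(6, 2, 12, 4, count_pipeline=False)
aware_orders, aware_stock = simulate_orders(6, 2, 12, 4, count_pipeline=True)
print(naive_orders)  # [4, 8, 8, 4, 0, 0]
print(aware_orders)  # [4, 4, 4, 4, 4, 4]
```

Even though demand never changes, the rule that reacts only to current inventory swings between ordering eight cases and ordering none, while the rule that accounts for the delay settles on a steady order of four.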
Growing evidence in DDM indicates that adults share a robust problem in understanding some of the basic building blocks of simple dynamic systems, including stocks, inflows, and outflows. Many adults have shown a failure to interpret a basic principle of dynamics: a stock (or accumulation) rises (or falls) when the inflow exceeds (or is less than) the outflow. This problem, termed Stock-Flow failure (SF failure), has been shown to persist even in simple tasks, with well-motivated participants, in familiar contexts, and with simplified information displays. The belief that the stock behaves like the flows is a common but wrong heuristic (named the “correlation heuristic”) that people often use when judging non-linear systems. [23] The use of the correlation heuristic, or proportional reasoning, is widespread across different domains and has been found to be a robust problem in both school children and educated adults (Cronin et al. 2009; Larrick & Soll, 2008; De Bock 2002; Greer, 1993; Van Dooren et al., 2005; Van Dooren et al., 2006; Verschaffel et al., 1994).
Individual performance on DDM tasks shows a tremendous amount of variability, which might result from the varying skills and cognitive abilities of the individuals who interact with the DDM tasks. Although individual differences exist and are often shown on DDM tasks, there has been a debate on whether these differences arise as a result of differences in cognitive abilities. Some studies have failed to find evidence of a link between cognitive abilities, as measured by intelligence tests, and performance on DDM tasks. But later studies contend that this missing link is due to the absence of reliable performance measures on DDM tasks. [24] [25]
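The stock-flow principle can be checked with a short worked example. The flow values are made up for illustration: the inflow falls steadily while the outflow stays constant, which is exactly the kind of case in which the correlation heuristic gives the wrong answer.

```python
def accumulate(initial_stock, inflows, outflows):
    """A stock changes each period by the net flow: inflow minus outflow."""
    stock = initial_stock
    history = [stock]
    for inflow, outflow in zip(inflows, outflows):
        stock += inflow - outflow
        history.append(stock)
    return history


inflows = [10, 8, 6, 4, 2]    # falling in every period
outflows = [5, 5, 5, 5, 5]    # constant
print(accumulate(0, inflows, outflows))  # [0, 5, 8, 9, 8, 5]
```

The stock keeps rising for three periods even though the inflow is falling, because the inflow still exceeds the outflow; it starts to fall only once the inflow drops below the outflow. Someone assuming that the stock mirrors the inflow would expect it to decline from the start.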
Other studies have suggested a relationship between workload and cognitive abilities. [26] It was found that low ability participants are generally outperformed by high ability participants. Under demanding conditions of workload, low ability participants do not show improvement in performance in either training or test trials. Evidence shows that low ability participants use more heuristics particularly when the task demands faster trials or time pressure and this happens both during training and test conditions. [27]
Alongside the use of laboratory microworld tools to investigate decision-making, there has been a recent emphasis in DDM research on decision-making in the real world. This does not discount research in the laboratory but reveals the broad conception of the research underlying DDM. Research on DDM in the real world is more concerned with processes such as goal setting, planning, perceptual and attention processes, forecasting, comprehension processes, and many others, including attending to feedback. The study of these processes brings DDM research closer to research on situation awareness and expertise.
For example, it has been shown in DDM research that motorists who have more than 10 years of driving experience are faster to respond to hazards than drivers with less than three years of experience. [28] Also, owing to their greater experience, such motorists tend to perform a more effective and efficient search for hazard cues than their less experienced counterparts. [29] One way to explain such behavior is based upon the premise that situation awareness in DDM tasks makes certain behaviors automatic for people with expertise. In this regard, the search for cues in the environment that could possibly lead to hazards might be an automatic process for experienced motorists, whereas a lack of situation awareness among novice motorists might force them into a conscious, non-automatic effort to find such cues, making them more prone to hazards because they fail to notice some cues at all. This behavior has also been documented for pilots and platoon commanders. [30] A comparison of novice and experienced platoon commanders in a virtual reality battle simulator has shown that more experience was associated with higher perceptual and comprehension skills. Thus, experience on different DDM tasks makes a decision maker more situationally aware, with higher levels of perceptual and comprehension skills.
Related fields
Instructional scaffolding is the support given to a student by an instructor throughout the learning process. This support is specifically tailored to each student; this instructional approach allows students to experience student-centered learning, which tends to facilitate more efficient learning than teacher-centered learning. This learning process promotes a deeper level of learning than many other common teaching strategies.
A theoretical approach within the field of psychology that arose as a response to behaviorism. It derives its name from the Latin cognoscere, referring to knowing and information; thus cognitive psychology is an information-processing psychology derived in part from earlier traditions of the investigation of thought and problem solving.
A cognitive model is an approximation to animal cognitive processes for the purposes of comprehension and prediction. Cognitive models can be developed within or without a cognitive architecture, though the two are not always easily distinguishable.
Soar is a cognitive architecture, originally created by John Laird, Allen Newell, and Paul Rosenbloom at Carnegie Mellon University. It is now maintained and developed by John Laird's research group at the University of Michigan.
A cognitive tutor is a particular kind of intelligent tutoring system that utilizes a cognitive model to provide feedback to students as they are working through problems. This feedback will immediately inform students of the correctness, or incorrectness, of their actions in the tutor interface; however, cognitive tutors also have the ability to provide context-sensitive hints and instruction to guide students towards reasonable next steps.
Crowd simulation is the process of simulating the movement of a large number of entities or characters. It is commonly used to create virtual scenes for visual media like films and video games, and is also used in crisis training, architecture and urban planning, and evacuation simulation.
A mental model is an explanation of someone's thought process about how something works in the real world. It is a representation of the surrounding world, the relationships between its various parts and a person's intuitive perception about his or her own acts and their consequences. Mental models can help shape behaviour and set an approach to solving problems and doing tasks.
Problem solving consists of using generic or ad hoc methods in an orderly manner to find solutions to problems. Some of the problem-solving techniques developed and used in philosophy, artificial intelligence, computer science, engineering, mathematics, or medicine are related to mental problem-solving techniques studied in psychology.
The somatic marker hypothesis, formulated by Antonio Damasio and associated researchers, proposes that emotional processes guide behavior, particularly decision-making.
In artificial intelligence, an intelligent agent (IA) refers to an autonomous entity which acts, directing its activity towards achieving goals, upon an environment using observation through sensors and consequent actuators. Intelligent agents may also learn or use knowledge to achieve their goals. They may be very simple or very complex. A reflex machine, such as a thermostat, is considered an example of an intelligent agent.
The naturalistic decision making (NDM) framework emerged as a means of studying how people make decisions and perform cognitively complex functions in demanding, real-world situations. These include situations marked by limited time, uncertainty, high stakes, team and organizational constraints, unstable conditions, and varying amounts of experience.
Behavioral operations research (BOR) examines and takes into consideration human behavior and emotions when facing complex decision problems. BOR is part of Operational Research. BOR relates to the behavioural aspects of the use of operations research in problem solving and decision support. Specifically, it focuses on understanding behaviour in, with and beyond models. The general purpose is to make better use and improve the use of operations research theories and practice, so that the benefits received from the potential improvements to operations research approaches in practice, that arise from recent findings in behavioural sciences, are realised. BOR approaches have heavily influenced supply chain management research, amongst others.
Action selection is a way of characterizing the most basic problem of intelligent systems: what to do next. In artificial intelligence and computational cognitive science, "the action selection problem" is typically associated with intelligent agents and animats—artificial systems that exhibit complex behaviour in an agent environment. The term is also sometimes used in ethology or animal behavior.
Social cognitive theory (SCT), used in psychology, education, and communication, holds that portions of an individual's knowledge acquisition can be directly related to observing others within the context of social interactions, experiences, and outside media influences. This theory was advanced by Albert Bandura as an extension of his social learning theory. The theory states that when people observe a model performing a behavior and the consequences of that behavior, they remember the sequence of events and use this information to guide subsequent behaviors. Observing a model can also prompt the viewer to engage in behavior they already learned. In other words, people do not learn new behaviors solely by trying them and either succeeding or failing, but rather, the survival of humanity is dependent upon the replication of the actions of others. Depending on whether people are rewarded or punished for their behavior and the outcome of the behavior, the observer may choose to replicate behavior modeled. Media provides models for a vast array of people in many different environmental settings.
Psi-theory, developed by Dietrich Dörner at the University of Bamberg, is a systemic psychological theory covering human action regulation, intention selection and emotion. It models the human mind as an information processing agent, controlled by a set of basic physiological, social and cognitive drives. Perceptual and cognitive processing are directed and modulated by these drives, which allow the autonomous establishment and pursuit of goals in an open environment.
Real-time Control System (RCS) is a reference model architecture, suitable for many software-intensive, real-time control problem domains. RCS is a reference model architecture that defines the types of functions that are required in a real-time intelligent control system, and how these functions are related to each other.
Goal orientation is an "individual disposition toward developing or validating one's ability in achievement settings". Previous research has examined goal orientation as a motivation variable useful for recruitment, climate and culture, performance appraisal, and selection. Studies have also used goal orientation to predict sales performance, goal setting, learning and adaptive behaviors in training, and leadership. Due to the many theoretical and practical applications of goal orientation, it is important to understand the construct and how it relates to other variables.
Adaptive collaborative control is the decision-making approach used in hybrid models consisting of finite-state machines with functional models as subcomponents to simulate behavior of systems formed through the partnerships of multiple agents for the execution of tasks and the development of work products. The term “collaborative control” originated from work developed in the late 90's and early 2000 by Fong, Thorpe, and Baur (1999). It is important to note that according to Fong et al. in order for robots to function in collaborative control, they must be self-reliant, aware, and adaptive. In literature, the adjective “adaptive” is not always shown but is noted in the official sense as it is an important element of collaborative control. The adaptation of traditional applications of control theory in teleoperations sought initially to reduce the sovereignty of “humans as controllers/robots as tools” and had humans and robots working as peers, collaborating to perform tasks and to achieve common goals. Early implementations of adaptive collaborative control centered on vehicle teleoperation. Recent uses of adaptive collaborative control cover training, analysis, and engineering applications in teleoperations between humans and multiple robots, multiple robots collaborating among themselves, unmanned vehicle control, and fault tolerant controller design.
Cognitive bias mitigation is the prevention and reduction of the negative effects of cognitive biases – unconscious, automatic influences on human judgment and decision making that reliably produce reasoning errors.