Soar [1] is a cognitive architecture, [2] originally created by John Laird, Allen Newell, and Paul Rosenbloom at Carnegie Mellon University.
The goal of the Soar project is to develop the fixed computational building blocks necessary for general intelligent agents – agents that can perform a wide range of tasks and encode, use, and learn all types of knowledge to realize the full range of cognitive capabilities found in humans, such as decision making, problem solving, planning, and natural-language understanding. It is both a theory of what cognition is and a computational implementation of that theory. Since its beginnings in 1983 as John Laird’s thesis, it has been widely used by AI researchers to create intelligent agents and cognitive models of different aspects of human behavior. The most current and comprehensive description of Soar is the 2012 book, The Soar Cognitive Architecture. [1]
Rosenbloom continued to serve as co-principal investigator after moving to Stanford University, then to the University of Southern California's Information Sciences Institute. Soar is now maintained and developed by John Laird's research group at the University of Michigan.
Soar embodies multiple hypotheses about the computational structures underlying general intelligence, many of which are shared with other cognitive architectures, including ACT-R, which was created by John R. Anderson, and LIDA, which was created by Stan Franklin. Recently, the emphasis in Soar has been on general AI (functionality and efficiency), whereas the emphasis in ACT-R has always been on cognitive modeling (detailed modeling of human cognition).
The original theory of cognition underlying Soar is the Problem Space Hypothesis, which is described in Allen Newell's book, Unified Theories of Cognition, [2] and dates back to one of the first AI systems created, Newell, Simon, and Shaw's Logic Theorist, first presented in 1955. The Problem Space Hypothesis contends that all goal-oriented behavior can be cast as search through a space of possible states (a problem space) while attempting to achieve a goal. At each step, a single operator is selected and then applied to the agent’s current state, which can lead to internal changes, such as retrieval of knowledge from long-term memory or other modifications, or to external actions in the world. (Soar’s name is derived from this basic cycle of State, Operator, And Result; however, it is no longer regarded as an acronym.) Inherent to the Problem Space Hypothesis is that all behavior, even a complex activity such as planning, is decomposable into a sequence of selection and application of primitive operators, which when mapped onto human behavior take approximately 50 ms each.
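The idea of casting goal-oriented behavior as search through a problem space can be illustrated with a small sketch. The code below is not Soar's mechanism (Soar selects one operator at a time using preferences rather than running an exhaustive search); it is a generic breadth-first search over the Water Jug puzzle. The jug capacities, the operator names, and the `search` helper are all assumptions made for illustration.

```python
from collections import deque

CAPACITY = (5, 3)  # hypothetical jug sizes in litres


def fill(i):
    def op(state):
        new = list(state)
        new[i] = CAPACITY[i]
        return tuple(new)
    return op


def empty(i):
    def op(state):
        new = list(state)
        new[i] = 0
        return tuple(new)
    return op


def pour(src, dst):
    def op(state):
        # Pour until the source is empty or the destination is full.
        amount = min(state[src], CAPACITY[dst] - state[dst])
        new = list(state)
        new[src] -= amount
        new[dst] += amount
        return tuple(new)
    return op


# Each operator maps a state (litres in jug 0, litres in jug 1) to a new state.
OPERATORS = {
    "fill-5": fill(0), "fill-3": fill(1),
    "empty-5": empty(0), "empty-3": empty(1),
    "pour-5-to-3": pour(0, 1), "pour-3-to-5": pour(1, 0),
}


def search(initial, is_goal):
    """Breadth-first search: returns a list of operator names, or None."""
    frontier = deque([(initial, [])])
    visited = {initial}
    while frontier:
        state, path = frontier.popleft()
        if is_goal(state):
            return path
        for name, apply_operator in OPERATORS.items():
            result = apply_operator(state)
            if result not in visited:
                visited.add(result)
                frontier.append((result, path + [name]))
    return None


# Goal: exactly 4 litres in the 5-litre jug, starting with both jugs empty.
plan = search((0, 0), lambda s: s[0] == 4)
print(plan)
```

Each element of the returned plan corresponds to one operator selection and application, which is the unit of behavior the hypothesis describes.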
A second hypothesis of Soar’s theory is that although only a single operator can be selected at each step, forcing a serial bottleneck, the processes of selection and application are implemented through parallel rule firings, which provide context-dependent retrieval of procedural knowledge.
A third hypothesis is that if the knowledge to select or apply an operator is incomplete or uncertain, an impasse arises and the architecture automatically creates a substate. In the substate, the same process of problem solving is recursively used, but with the goal to retrieve or discover knowledge so that decision making can continue. This can lead to a stack of substates, where traditional problem-solving methods, such as planning or hierarchical task decomposition, naturally arise. When results created in the substate resolve the impasse, the substate and its associated structures are removed. The overall approach is called Universal Subgoaling.
These assumptions lead to an architecture that supports three levels of processing. The lowest level is bottom-up, parallel, and automatic processing. The next level is the deliberative level, where knowledge from the first level is used to propose, select, and apply a single action. These two levels implement fast, skilled behavior, and roughly correspond to Kahneman’s System 1 processing level. More complex behavior arises automatically when knowledge is incomplete or uncertain, through a third level of processing using substates, roughly corresponding to System 2.
A fourth hypothesis in Soar is that the underlying structure is modular, not in terms of task- or capability-based modules, such as planning or language, but in terms of task-independent modules, including: a decision-making module; memory modules (short-term spatial/visual and working memories; long-term procedural, declarative, and episodic memories); learning mechanisms associated with all long-term memories; and perceptual and motor modules. There are further assumptions about the specific properties of these memories described below, including that all learning is online and incremental.
A fifth hypothesis is that memory elements (except those in the spatial/visual memory) are represented as symbolic, relational structures. The hypothesis that a symbolic system is necessary for general intelligence is known as the physical symbol system hypothesis. An important evolution in Soar is that all symbolic structures have associated statistical metadata (such as information on recency and frequency of use, or expected future reward) that influences retrieval, maintenance, and learning of the symbolic structures.
Soar’s main processing cycle arises from the interaction between procedural memory (its knowledge about how to do things) and working memory (its representation of the current situation) to support the selection and application of operators. Information in working memory is represented as a symbolic graph structure, rooted in a state. The knowledge in procedural memory is represented as if-then rules (sets of conditions and actions) that are continually matched against the contents of working memory. When the conditions of a rule match structures in working memory, it fires and performs its actions. This combination of rules and working memory is also called a production system. In contrast to most production systems, in Soar all rules that match fire in parallel.
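The parallel firing of rules against working memory can be sketched as follows. This is a toy model, not Soar's rule language or matcher: working memory is assumed to be a set of (identifier, attribute, value) triples, each rule is a hypothetical Python function that returns the triples it wants to add, and rule effects are never retracted. In each wave every rule sees the same snapshot of working memory, so the order of the rules does not matter.

```python
def rule_movable(wm):
    """If a block is clear and on the table, mark it as movable."""
    return {(b, "movable", "yes")
            for (b, attr, val) in wm
            if attr == "clear" and val == "yes" and (b, "on", "table") in wm}


def rule_can_act(wm):
    """If any block is movable, note on the state that the agent can act."""
    return {("s1", "can-act", "yes")
            for (_, attr, val) in wm
            if attr == "movable" and val == "yes"}


def elaborate(wm, rules):
    """Fire all matching rules in parallel waves until nothing new is added."""
    wm = set(wm)
    while True:
        snapshot = frozenset(wm)          # every rule matches the same snapshot
        proposed = set()
        for rule in rules:
            proposed |= rule(snapshot)
        new = proposed - wm
        if not new:                       # quiescence: no rule adds anything
            return wm
        wm |= new


working_memory = {
    ("b1", "on", "table"), ("b1", "clear", "yes"),
    ("b2", "on", "b3"), ("b2", "clear", "yes"),
}
result = elaborate(working_memory, [rule_can_act, rule_movable])
print(sorted(result - working_memory))
```

Here `rule_can_act` cannot match in the first wave and fires only in the second, after `rule_movable` has added its result.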
Rather than making the selection of a single rule the crux of decision making, Soar’s decision making occurs through the selection and application of operators, which are proposed, evaluated, and applied by rules. An operator is proposed by rules that test the current state and create a representation of the operator in working memory as well as an acceptable preference, which indicates that the operator should be considered for selection and application. Additional rules match with the proposed operator and create additional preferences that compare and evaluate it against other proposed operators. The preferences are analyzed by a decision procedure, which selects the preferred operator and installs it as the current operator in working memory. Rules that match the current operator then fire to apply it and make changes to working memory. The changes to working memory can be simple inferences, queries for retrieval from Soar’s long-term semantic or episodic memories, commands to the motor system to perform actions in an environment, or interactions with the Spatial Visual System (SVS), which is working memory’s interface to perception. These changes to working memory lead to new operators being proposed and evaluated, followed by the selection of one and its application.
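A decision procedure of this kind can be sketched in a few lines. The version below is a deliberate simplification and should not be read as Soar's actual preference semantics: it assumes only four preference types (acceptable, reject, better, best), encodes them as hypothetical tuples, and reports an impasse whenever the preferences do not single out exactly one operator.

```python
def decide(preferences):
    """Pick one operator from a list of preference tuples.

    Assumed tuple shapes (illustrative only):
      ("acceptable", op), ("reject", op), ("best", op), ("better", op_a, op_b)
    where ("better", a, b) means a is preferred over b.
    """
    acceptable = {p[1] for p in preferences if p[0] == "acceptable"}
    rejected = {p[1] for p in preferences if p[0] == "reject"}
    candidates = acceptable - rejected

    # Drop any candidate that another candidate is explicitly better than.
    dominated = {p[2] for p in preferences
                 if p[0] == "better" and p[1] in candidates and p[2] in candidates}
    candidates -= dominated

    # If some remaining candidates are marked best, keep only those.
    best = {p[1] for p in preferences if p[0] == "best"} & candidates
    if best:
        candidates = best

    if len(candidates) == 1:
        return ("selected", next(iter(candidates)))
    return ("impasse", sorted(candidates))


prefs = [
    ("acceptable", "move-left"),
    ("acceptable", "move-right"),
    ("better", "move-right", "move-left"),
]
print(decide(prefs))
print(decide(prefs[:2]))  # no comparison knowledge: the choice is not determined
```

With only the two acceptable preferences the function returns an impasse, which is the situation that leads to substate creation in the architecture.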
Soar supports reinforcement learning, which tunes the values of rules that create numeric preferences for evaluating operators, based on reward. To provide maximal flexibility, there is a structure in working memory where reward is created.
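As a rough illustration of how reward can tune numeric operator evaluations, the sketch below applies a standard Q-learning style temporal-difference update to a table of values keyed by state and operator. It is an assumption-laden stand-in: in Soar the values live in rules that create numeric preferences, whereas here they are stored in a plain dictionary, and the names `td_update`, `alpha`, and `gamma` are chosen for this example.

```python
def td_update(q, state, op, reward, next_state, next_ops, alpha=0.1, gamma=0.9):
    """One Q-learning update of the value of applying `op` in `state`.

    q        : dict mapping (state, operator) -> numeric value (default 0.0)
    reward   : reward observed after applying the operator
    next_ops : operators available in the next state
    """
    best_next = max((q.get((next_state, o), 0.0) for o in next_ops), default=0.0)
    old = q.get((state, op), 0.0)
    q[(state, op)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, op)]


values = {}
first = td_update(values, "s0", "move-right", 1.0, "s1",
                  ["move-left", "move-right"], alpha=0.5)
print(first)
```

An operator that repeatedly leads to reward accumulates a higher value, so a selection procedure that consults these values comes to favor it.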
If the preferences for the operators are insufficient to specify the selection of a single operator, or there are insufficient rules to apply an operator, an impasse arises. In response to an impasse, a substate is created in working memory, with the goal being to resolve the impasse. Additional procedural knowledge can then propose and select operators in the substate to gain more knowledge, and either create preferences in the original state or modify that state so the impasse is resolved. Substates provide a means for on-demand complex reasoning, including hierarchical task decomposition, planning, and access to the declarative long-term memories. Once the impasse is resolved, all of the structures in the substate are removed except for any results. Soar’s chunking mechanism compiles the processing in the substate that led to results into rules. In the future, the learned rules automatically fire in similar situations so that no impasse arises, incrementally converting complex reasoning into automatic/reactive processing. Recently, the overall Universal Subgoaling procedure has been extended through a mechanism of goal-directed and automatic knowledge base augmentation that allows an impasse to be solved by recombining, in an innovative and problem-oriented way, the knowledge possessed by a Soar agent. [3]
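The interplay of impasse, substate reasoning, and chunking can be caricatured as follows. This is only a loose analogy: real chunking builds generalized rules from the working memory elements that were tested in the substate, whereas the sketch simply caches the result for an exact state and candidate set. The `Agent` class, its `evaluate` callback, and the `impasses` counter are all hypothetical.

```python
class Agent:
    def __init__(self, evaluate):
        self.evaluate = evaluate   # slow, deliberate knowledge used in the "substate"
        self.chunks = {}           # learned shortcuts: (state, candidates) -> operator
        self.impasses = 0          # how many times deliberate reasoning was needed

    def select(self, state, candidates):
        key = (state, tuple(sorted(candidates)))
        if key in self.chunks:
            # A learned rule applies directly, so no impasse arises.
            return self.chunks[key]
        # Tie impasse: reason about each candidate, then keep only the result.
        self.impasses += 1
        result = max(sorted(candidates), key=lambda op: self.evaluate(state, op))
        self.chunks[key] = result  # "chunk" the outcome for next time
        return result


# Hypothetical evaluation knowledge: a fixed score per operator.
scores = {"move-left": 1, "move-right": 3}
agent = Agent(lambda state, op: scores[op])

first_choice = agent.select("s0", ["move-left", "move-right"])
second_choice = agent.select("s0", ["move-right", "move-left"])
print(first_choice, second_choice, agent.impasses)
```

The second call returns the same operator without invoking `evaluate`, mirroring how learned rules convert deliberate reasoning into reactive behavior.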
Symbolic input and output occur through working memory structures attached to the top state called the input-link and the output-link. If structures are created on the output-link in working memory, these are translated into commands for external actions (e.g., motor control).
To support interaction with vision systems and non-symbolic reasoning, Soar has its Spatial Visual System (SVS). SVS internally represents the world as a scene graph, a collection of objects and component subobjects each with spatial properties such as shape, location, pose, relative position, and scale. A Soar agent using SVS can create filters to automatically extract features and relations from its scene graph, which are then added to working memory. In addition, a Soar agent can add structures to SVS and use it for mental imagery. For example, an agent can create a hypothetical object in SVS at a given location and query to see if it collides with any perceived objects.
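The mental-imagery query described above (placing a hypothetical object and checking for collisions) can be sketched with axis-aligned boxes. This does not reflect how SVS represents or tests geometry; the `Box` class, its fields, and the `collisions` helper are assumptions chosen to keep the example small.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """An axis-aligned box given by its center and full size along x, y, z."""
    name: str
    center: tuple
    size: tuple

    def overlaps(self, other):
        # Two boxes overlap if, on every axis, the distance between centers
        # is less than the sum of their half-sizes.
        return all(abs(c1 - c2) * 2 < s1 + s2
                   for c1, c2, s1, s2 in zip(self.center, other.center,
                                             self.size, other.size))


def collisions(scene, imagined):
    """Names of perceived objects that the imagined object would intersect."""
    return [obj.name for obj in scene if obj.overlaps(imagined)]


scene = [
    Box("table", (0.0, 0.0, 0.0), (2.0, 2.0, 1.0)),
    Box("lamp", (5.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
]
imagined = Box("imagined-cup", (0.5, 0.5, 0.5), (0.2, 0.2, 0.2))
print(collisions(scene, imagined))
```

The result of such a query would then be made available to the agent's rules through working memory.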
Semantic Memory (SMEM) in Soar is designed to be a very large long-term memory of fact-like structures. Data in SMEM is represented as directed cyclic graphs. Structures can be stored or retrieved by rules that create commands in a reserved area of working memory. Retrieved structures are added to working memory.
SMEM structures have activation values that represent the frequency or recency of usage of each memory, implementing the base-level activation scheme originally developed for ACT-R. During retrieval, the structure in SMEM that matches the query and has the highest activation is retrieved. Soar also supports spreading activation, where activation spreads from SMEM structures that have been retrieved into working memory to other long-term memories that they are linked to. [4] These memories in turn spread activation to their neighbor memories, with some decay. Spreading activation is a mechanism for allowing the current context to influence retrievals from semantic memory.
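Base-level activation, as defined in ACT-R, combines frequency and recency in a single value: B = ln(sum over past accesses of (time since access)^(-d)), where d is a decay parameter. The sketch below computes this value and uses it to choose among memories that match a query. The decay of 0.5, the dictionary layout of the memory, and the `retrieve` helper are assumptions for illustration and are not taken from Soar's implementation; spreading activation is omitted.

```python
import math


def base_level_activation(access_times, now, decay=0.5):
    """ln(sum((now - t) ** -decay)) over all past access times t < now."""
    return math.log(sum((now - t) ** -decay for t in access_times))


def retrieve(memory, cue, now):
    """Return the name of the matching memory with the highest activation."""
    matches = [name for name, item in memory.items()
               if all(item["facts"].get(k) == v for k, v in cue.items())]
    if not matches:
        return None
    return max(matches,
               key=lambda n: base_level_activation(memory[n]["accesses"], now))


# Hypothetical semantic memory: facts plus the times each item was accessed.
memory = {
    "apple":  {"facts": {"isa": "fruit", "color": "red"},    "accesses": [1, 2]},
    "cherry": {"facts": {"isa": "fruit", "color": "red"},    "accesses": [8, 9]},
    "banana": {"facts": {"isa": "fruit", "color": "yellow"}, "accesses": [9.5]},
}
print(retrieve(memory, {"color": "red"}, now=10))
```

Both red fruits match the cue, but the one accessed more recently has the higher activation and is the one retrieved.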
Episodic Memory (EPMEM) automatically records snapshots of working memory in a temporal stream. Prior episodes can be retrieved into working memory through query. Once an episode has been retrieved, the next (or previous) episode can then be retrieved. An agent may employ EPMEM to sequentially play through episodes from its past (allowing it to predict the effects of actions), retrieve specific memories, or query for episodes possessing certain memory structures.
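A minimal sketch of this behavior follows, assuming that each episode is a frozen copy of working memory stored as a set of triples, and that a query returns the episode sharing the most elements with the cue, with ties going to the most recent one. The `EpisodicMemory` class and its method names are hypothetical and much simpler than Soar's EPMEM.

```python
class EpisodicMemory:
    def __init__(self):
        self.episodes = []                 # temporal stream of snapshots

    def record(self, working_memory):
        """Store a snapshot and return its index in the stream."""
        self.episodes.append(frozenset(working_memory))
        return len(self.episodes) - 1

    def query(self, cue):
        """Index of the best-matching episode (most recent on ties), or None."""
        cue = set(cue)
        best = None
        for idx in range(len(self.episodes) - 1, -1, -1):   # newest first
            score = len(cue & self.episodes[idx])
            if score and (best is None or score > best[0]):
                best = (score, idx)
        return None if best is None else best[1]

    def next(self, idx):
        return idx + 1 if idx + 1 < len(self.episodes) else None

    def previous(self, idx):
        return idx - 1 if idx > 0 else None


epmem = EpisodicMemory()
epmem.record({("door", "state", "closed"), ("agent", "at", "hall")})
epmem.record({("door", "state", "closed"), ("agent", "action", "push")})
epmem.record({("door", "state", "open"), ("agent", "at", "hall")})

pushed = epmem.query({("agent", "action", "push")})
after = epmem.next(pushed)
print(sorted(epmem.episodes[after]))
```

Retrieving the episode that follows a remembered action is one way an agent could predict the effect of repeating that action.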
Each of Soar’s long-term memories has associated online learning mechanisms that create new structures or modify metadata based on an agent’s experience. For example, Soar learns new rules for procedural memory through a process called chunking and uses reinforcement learning to tune rules involved in the selection of operators.
The standard approach to developing an agent in Soar starts with writing rules that are loaded into procedural memory, and initializing semantic memory with appropriate declarative knowledge. The process of agent development is explained in detail in the official Soar manual as well as in several tutorials which are provided at the research group's website.
The Soar architecture is maintained and extended by John Laird's research group at the University of Michigan. The current architecture is written in a combination of C and C++, and is freely available (BSD license) at the research group's website.
Soar can interface with external language environments including C++, Java, Tcl, and Python through the Soar Markup Language (SML). SML is a primary mechanism for creating instances of Soar agents and interacting with their I/O links.
JSoar is an implementation of Soar written in Java. It is maintained by SoarTech, an AI research and development company. JSoar closely follows the University of Michigan architecture implementation, although it generally does not reflect the latest developments and changes of that C/C++ version. [5]
Below is a historical list of different areas of applications that have been implemented in Soar. There have been over a hundred systems implemented in Soar, although the vast majority of them are toy tasks or puzzles.
Throughout its history, Soar has been used to implement a wide variety of classic AI puzzles and games, such as Tower of Hanoi, Water Jug, Tic Tac Toe, Eight Puzzle, Missionaries and Cannibals, and variations of the Blocks world. One of the initial achievements of Soar was showing that many different weak methods would naturally arise from the task knowledge that was encoded in it, a property called the Universal Weak Method. [6]
The first large-scale application of Soar was R1-Soar, a partial reimplementation by Paul Rosenbloom of the R1 (XCON) expert system John McDermott developed for configuring DEC computers. R1-Soar demonstrated the ability of Soar to scale to moderate-size problems, use hierarchical task decomposition and planning, and convert deliberate planning and problem solving to reactive execution through chunking. [7]
NL-Soar was a natural-language understanding system developed in Soar by Jill Fain Lehman, Rick Lewis, Nancy Green, Deryle Lonsdale and Greg Nelson. It included capabilities for natural-language comprehension, generation, and dialogue, emphasizing real-time incremental parsing and generation. NL-Soar was used in an experimental version of TacAir-Soar and in NTD-Soar. [8]
The second large-scale application of Soar involved developing agents for use in training in large-scale distributed simulation. Two major systems for flying U.S. tactical air missions were co-developed at the University of Michigan and the Information Sciences Institute (ISI) of the University of Southern California. The Michigan system was called TacAir-Soar and flew (in simulation) fixed-wing U.S. military tactical missions (such as close-air support, strikes, CAPs, refueling, and SEAD missions). The ISI system was called RWA-Soar and flew rotary-wing (helicopter) missions. Some of the capabilities incorporated in TacAir-Soar and RWA-Soar were attention, situational awareness and adaptation, real-time planning and dynamic replanning, and complex communication, coordination, and cooperation among combinations of Soar agents and humans. These systems participated in DARPA’s Synthetic Theater of War (STOW-97) Advanced Concept Technology Demonstration (ACTD), which at the time was the largest fielding of synthetic agents in a joint battlespace over a 48-hour period, and involved training of active duty personnel. These systems demonstrated the viability of using AI agents for large-scale training. [9]
One of the important outgrowths of the RWA-Soar project was the development of STEAM by Milind Tambe, [10] a framework for flexible teamwork in which agents maintained models of their teammates using the joint intentions framework by Cohen & Levesque. [11]
NTD-Soar was a simulation of the NASA Test Director (NTD), the person responsible for coordinating the preparation of the NASA Space Shuttle before launch. It was an integrated cognitive model that incorporated many different complex cognitive capabilities including natural-language processing, attention and visual search, and problem solving in a broad agent model. [12]
Soar has been used to simulate virtual humans supporting face-to-face dialogues and collaboration within a virtual world developed at the Institute for Creative Technologies at USC. Virtual humans have integrated capabilities of perception, natural-language understanding, emotions, body control, and action, among others. [13]
Game AI agents have been built using Soar for games such as StarCraft, [14] Quake II, [15] Descent 3, [16] Unreal Tournament, [17] and Minecraft, supporting capabilities such as spatial reasoning, real-time strategy, and opponent anticipation. AI agents have also been created for video games including Infinite Mario, [18] which used reinforcement learning, and Frogger II, Space Invaders, and Fast Eddie, which used both reinforcement learning and mental imagery. [19]
Soar can run natively on mobile devices. A mobile application for the game Liar’s Dice has been developed for iOS which runs the Soar architecture directly from the phone as the engine for opponent AIs. [20]
Many different robotic applications have been built using Soar since the original Robo-Soar was implemented in 1991 for controlling a Puma robot arm. [21] These have ranged from mobile robot control to humanoid service REEM robots, [22] taskable robotic mules [23] and unmanned underwater vehicles. [24]
A current focus of research and development in the Soar community is Interactive Task Learning (ITL), the automatic learning of new tasks, environment features, behavioral constraints, and other specifications through natural instructor interaction. [25] Research in ITL has been applied to tabletop game playing [26] and multi-room navigation. [27]
Early on, Merle-Soar demonstrated how Soar could learn a complex scheduling task modeled after the lead human scheduler in a windshield production plant located near Pittsburgh. [28] Later, a generalized version of Merle-Soar (Dispatcher-Soar) was used to demonstrate a symbolic, constraint propagation approach in learning to improve schedules and to define task-independent knowledge metrics of architecture-specific learning: knowledge efficiency, knowledge utility, and knowledge effectiveness. [29]
Melody-Soar demonstrated how the Soar architecture could explain and demonstrate creativity in simple melody generation using hierarchies of problem spaces that parallel the hierarchical structure of melody, allowing unique melodies to be generated from preferences of existing styles (e.g., Bach). [30]
Cognitive science is the interdisciplinary, scientific study of the mind and its processes. It examines the nature, the tasks, and the functions of cognition. Mental faculties of concern to cognitive scientists include language, perception, memory, attention, reasoning, and emotion; to understand these faculties, cognitive scientists borrow from fields such as linguistics, psychology, artificial intelligence, philosophy, neuroscience, and anthropology. The typical analysis of cognitive science spans many levels of organization, from learning and decision to logic and planning; from neural circuitry to modular brain organization. One of the fundamental concepts of cognitive science is that "thinking can best be understood in terms of representational structures in the mind and computational procedures that operate on those structures."
Natural language processing (NLP) is an interdisciplinary subfield of computer science and artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics. Typically data is collected in text corpora, using either rule-based, statistical or neural-based approaches in machine learning and deep learning.
Distributed artificial intelligence (DAI), also called Decentralized Artificial Intelligence, is a subfield of artificial intelligence research dedicated to the development of distributed solutions for problems. DAI is closely related to and a predecessor of the field of multi-agent systems.
In artificial intelligence, symbolic artificial intelligence is the term for the collection of all methods in artificial intelligence research that are based on high-level symbolic (human-readable) representations of problems, logic and search. Symbolic AI used tools such as logic programming, production rules, semantic nets and frames, and it developed applications such as knowledge-based systems, symbolic mathematics, automated theorem provers, ontologies, the semantic web, and automated planning and scheduling systems. The Symbolic AI paradigm led to seminal ideas in search, symbolic programming languages, agents, multi-agent systems, the semantic web, and the strengths and limitations of formal knowledge and reasoning systems.
A cognitive model is a representation of one or more cognitive processes in humans or other animals for the purposes of comprehension and prediction. There are many types of cognitive models, and they can range from box-and-arrow diagrams to a set of equations to software programs that interact with the same tools that humans use to complete tasks. In terms of information processing, cognitive modeling is modeling of human perception, reasoning, memory and action.
Dendral was a project in artificial intelligence (AI) of the 1960s, and the computer software expert system that it produced. Its primary aim was to study hypothesis formation and discovery in science. For that, a specific task in science was chosen: help organic chemists in identifying unknown organic molecules, by analyzing their mass spectra and using knowledge of chemistry. It was done at Stanford University by Edward Feigenbaum, Bruce G. Buchanan, Joshua Lederberg, and Carl Djerassi, along with a team of highly creative research associates and students. It began in 1965 and spans approximately half the history of AI research.
ACT-R is a cognitive architecture mainly developed by John Robert Anderson and Christian Lebiere at Carnegie Mellon University. Like any cognitive architecture, ACT-R aims to define the basic and irreducible cognitive and perceptual operations that enable the human mind. In theory, each task that humans can perform should consist of a series of these discrete operations.
A cognitive tutor is a particular kind of intelligent tutoring system that utilizes a cognitive model to provide feedback to students as they are working through problems. This feedback will immediately inform students of the correctness, or incorrectness, of their actions in the tutor interface; however, cognitive tutors also have the ability to provide context-sensitive hints and instruction to guide students towards reasonable next steps.
GOMS is a specialized human information processor model for human-computer interaction observation that describes a user's cognitive structure in terms of four components. In the book The Psychology of Human-Computer Interaction, written in 1983 by Stuart K. Card, Thomas P. Moran and Allen Newell, the authors introduce: "a set of Goals, a set of Operators, a set of Methods for achieving the goals, and a set of Selections rules for choosing among competing methods for goals." GOMS is widely used by usability specialists and computer system designers because it produces quantitative and qualitative predictions of how people will use a proposed system.
A cognitive architecture refers to both a theory about the structure of the human mind and to a computational instantiation of such a theory used in the fields of artificial intelligence (AI) and computational cognitive science. These formalized models can be used to further refine comprehensive theories of cognition and serve as the frameworks for useful artificial intelligence programs. Successful cognitive architectures include ACT-R and Soar. The research on cognitive architectures as software instantiation of cognitive theories was initiated by Allen Newell in 1990.
Unified Theories of Cognition is a 1990 book by Allen Newell. Newell argues for the need of a set of general assumptions for cognitive models that account for all of cognition: a unified theory of cognition, or cognitive architecture. The research started by Newell on unified theories of cognition represents a crucial point of divergence from the vision of his long-term collaborator and AI pioneer Herbert Simon concerning the future of artificial intelligence research. Antonio Lieto recently drew attention to this discrepancy, pointing out that Herbert Simon decided to focus on the construction of single simulative programs that were considered a sufficient means to enable the generalisation of “unifying” theories of cognition. Newell, on the other hand, did not consider the construction of single simulative microtheories a sufficient means to enable the generalisation of “unifying” theories of cognition and, in fact, started the enterprise of studying and developing integrated and multi-tasking intelligence via cognitive architectures, which led to the development of the Soar cognitive architecture.
A physical symbol system takes physical patterns (symbols), combining them into structures (expressions) and manipulating them to produce new expressions.
Action selection is a way of characterizing the most basic problem of intelligent systems: what to do next. In artificial intelligence and computational cognitive science, "the action selection problem" is typically associated with intelligent agents and animats—artificial systems that exhibit complex behaviour in an agent environment. The term is also sometimes used in ethology or animal behavior.
John E. Laird is a computer scientist who created the Soar cognitive architecture at Carnegie Mellon University with Paul Rosenbloom and Allen Newell. Laird is a professor in the Computer Science and Engineering Division of the Electrical Engineering and Computer Science Department of the University of Michigan.
Connectionist Learning with Adaptive Rule Induction On-line (CLARION) is a computational cognitive architecture that has been used to simulate many domains and tasks in cognitive psychology and social psychology, as well as implementing intelligent systems in artificial intelligence applications. An important feature of CLARION is the distinction between implicit and explicit processes and focusing on capturing the interaction between these two types of processes. The system was created by the research group led by Ron Sun.
Psi-theory, developed by Dietrich Dörner at the University of Bamberg, is a systemic psychological theory covering human action regulation, intention selection and emotion. It models the human mind as an information processing agent, controlled by a set of basic physiological, social and cognitive drives. Perceptual and cognitive processing are directed and modulated by these drives, which allow the autonomous establishment and pursuit of goals in an open environment.
Real-time Control System (RCS) is a reference model architecture, suitable for many software-intensive, real-time computing control problem domains. It defines the types of functions needed in a real-time intelligent control system, and how these functions relate to each other.
FORR is a cognitive architecture for learning and problem solving inspired by Herbert A. Simon's ideas of bounded rationality and satisficing. It was first developed in the early 1990s at the City University of New York. It has been used in game playing, robot pathfinding, recreational park design, spoken dialog systems, and solving NP-hard constraint satisfaction problems, and is general enough for many problem solving applications.
This glossary of artificial intelligence is a list of definitions of terms and concepts relevant to the study of artificial intelligence, its sub-disciplines, and related fields. Related glossaries include Glossary of computer science, Glossary of robotics, and Glossary of machine vision.