Connectionism (coined by Edward Thorndike in the 1930s [citation needed]) is the name of an approach to the study of human mental processes and cognition that uses mathematical models known as connectionist networks or artificial neural networks. [1] Connectionism has had many 'waves' since its beginnings.
The first wave appeared in 1943 with Warren Sturgis McCulloch and Walter Pitts, both focusing on comprehending neural circuitry through a formal and mathematical approach, [2] and Frank Rosenblatt, who published the 1958 paper “The Perceptron: A Probabilistic Model For Information Storage and Organization in the Brain” in Psychological Review while working at the Cornell Aeronautical Laboratory. [3] The first wave ended with the 1969 book about the limitations of the original perceptron idea, written by Marvin Minsky and Seymour Papert, which contributed to discouraging major funding agencies in the US from investing in connectionist research. [4] With a few noteworthy deviations, the majority of connectionist research entered a period of inactivity until the mid-1980s. The term connectionist model was reintroduced in the early 1980s in a Cognitive Science paper by Jerome Feldman and Dana Ballard.
The second wave blossomed in the late 1980s, following the 1987 book about Parallel Distributed Processing by James L. McClelland, David E. Rumelhart et al., which introduced a couple of improvements to the simple perceptron idea, such as intermediate processors (now known as "hidden layers") alongside input and output units, and the use of a sigmoid activation function instead of the old 'all-or-nothing' function. Their work, in turn, built upon that of John Hopfield, who was a key figure investigating the mathematical characteristics of sigmoid activation functions. [3] From the late 1980s to the mid-1990s, connectionism took on an almost revolutionary tone when Schneider, [5] Terence Horgan and Tienson posed the question of whether connectionism represented a fundamental shift in psychology and GOFAI. [3] Some advantages of the second wave connectionist approach included its applicability to a broad array of functions, structural approximation to biological neurons, low requirements for innate structure, and capacity for graceful degradation. [6] Some disadvantages of the second wave connectionist approach included the difficulty of deciphering how ANNs process information or account for the compositionality of mental representations, and a resultant difficulty in explaining phenomena at a higher level. [7]
The current (third) wave has been marked by advances in deep learning that have made large language models possible. [3] The success of deep learning networks in the past decade has greatly increased the popularity of this approach, but the complexity and scale of such networks have brought with them increased interpretability problems. [8]
The central connectionist principle is that mental phenomena can be described by interconnected networks of simple and often uniform units. The form of the connections and the units can vary from model to model. For example, units in the network could represent neurons and the connections could represent synapses, as in the human brain. This principle has been seen as an alternative to GOFAI and the classical theories of mind based on symbolic computation, but the extent to which the two approaches are compatible has been the subject of much debate since their inception. [8]
The internal states of a network change over time as neurons send signals to a succeeding layer of neurons, in the case of a feedforward network, or to a previous layer, in the case of a recurrent network. The discovery of non-linear activation functions enabled the second wave of connectionism.
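The feedforward propagation described above can be illustrated with a minimal sketch. Everything named here (the functions `step`, `sigmoid`, `layer`, `feedforward`, and the weights in `network`) is a hypothetical choice made for illustration and is not taken from any particular connectionist model. The sketch passes the same input through the same two-layer network twice, once with an 'all-or-nothing' threshold activation and once with a sigmoid.

```python
import math

def step(x):
    """'All-or-nothing' activation: the unit either fires (1.0) or stays silent (0.0)."""
    return 1.0 if x >= 0.0 else 0.0

def sigmoid(x):
    """Smooth, non-linear activation that yields graded values strictly between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-x))

def layer(inputs, weights, biases, activation):
    """Compute the activations of one layer from the activations of the preceding layer."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]

def feedforward(inputs, layers, activation):
    """Propagate the signal layer by layer, from the input units to the output units."""
    state = inputs
    for weights, biases in layers:
        state = layer(state, weights, biases, activation)
    return state

# Illustrative (assumed) weights: one hidden layer with two units, one output unit.
network = [
    ([[0.5, -0.4], [0.3, 0.8]], [0.0, -0.1]),  # input -> hidden
    ([[1.0, -1.0]], [0.05]),                   # hidden -> output
]

binary_output = feedforward([1.0, 0.5], network, step)
graded_output = feedforward([1.0, 0.5], network, sigmoid)
```

With the threshold activation the output can only be 0 or 1, whereas the sigmoid produces a graded value; that smoothness is what makes gradient-based training of hidden layers possible.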
Neural networks follow two basic principles:
Most of the variety among the models comes from:
Connectionist work in general does not need to be biologically realistic. [10] [11] [12] [13] [14] [15] [16] One area where connectionist models are thought to be biologically implausible is the error-propagation networks needed to support learning, [17] [18] but error propagation can explain some of the biologically generated electrical activity seen at the scalp in event-related potentials such as the N400 and P600, [19] and this provides some biological support for one of the key assumptions of connectionist learning procedures. Many recurrent connectionist models also incorporate dynamical systems theory. Many researchers, such as the connectionist Paul Smolensky, have argued that connectionist models will evolve toward fully continuous, high-dimensional, non-linear, dynamic systems approaches.
Precursors of the connectionist principles can be traced to early work in psychology, such as that of William James. [20] Psychological theories based on knowledge about the human brain were fashionable in the late 19th century. As early as 1869, the neurologist John Hughlings Jackson argued for multi-level, distributed systems. Following this lead, Herbert Spencer's Principles of Psychology, 3rd edition (1872), and Sigmund Freud's Project for a Scientific Psychology (composed 1895) propounded connectionist or proto-connectionist theories. These tended to be speculative theories. But by the early 20th century, Edward Thorndike was conducting experiments on learning that posited a connectionist-type network.
Hopfield networks had precursors in the Ising model due to Wilhelm Lenz (1920) and Ernst Ising (1925), though the Ising model as they conceived it did not involve time. Monte Carlo simulations of the Ising model required the advent of computers in the 1950s. [21]
The first wave began in 1943 with Warren Sturgis McCulloch and Walter Pitts, both focusing on comprehending neural circuitry through a formal and mathematical approach. [3] McCulloch and Pitts showed how neural systems could implement first-order logic: their classic paper "A Logical Calculus of Ideas Immanent in Nervous Activity" (1943) is important in this development. They were influenced by the work of Nicolas Rashevsky in the 1930s.
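As a hedged illustration of how a threshold unit can realize logical functions, the sketch below builds Boolean gates from a single binary unit. It is a common textbook simplification rather than the formalism of the 1943 paper: inhibition is modelled here with a negative weight, which is an assumption of this sketch, and the names `mcculloch_pitts`, `AND`, `OR`, `NOT` and `XOR` are chosen for illustration only.

```python
def mcculloch_pitts(inputs, weights, threshold):
    """Binary threshold unit: fires (1) when the weighted input sum reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def AND(a, b):
    # Both inputs must be active to reach a threshold of 2.
    return mcculloch_pitts([a, b], [1, 1], 2)

def OR(a, b):
    # A single active input is enough to reach a threshold of 1.
    return mcculloch_pitts([a, b], [1, 1], 1)

def NOT(a):
    # A negative (inhibitory) weight keeps the unit silent when the input is active.
    return mcculloch_pitts([a], [-1], 0)

def XOR(a, b):
    # Not computable by one such unit, but computable by a small network of them.
    return OR(AND(a, NOT(b)), AND(NOT(a), b))
```

Composing units in this way is what allows a network of simple threshold elements to compute any Boolean function of its inputs.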
Hebb contributed greatly to speculations about neural functioning, and proposed a learning principle, Hebbian learning. Lashley argued for distributed representations as a result of his failure to find anything like a localized engram in years of lesion experiments. Friedrich Hayek independently conceived the model, first in a brief unpublished manuscript in 1920, [22] [23] then expanded into a book in 1952. [24]
The Perceptron machines were proposed and built by Frank Rosenblatt, who published the 1958 paper “The Perceptron: A Probabilistic Model For Information Storage and Organization in the Brain” in Psychological Review, while working at the Cornell Aeronautical Laboratory. He cited Hebb, Hayek, Uttley, and Ashby as main influences.
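The following sketch shows the perceptron learning rule in its usual textbook form: a single threshold unit whose weights are nudged after every misclassified example. It is a simplification and does not reproduce Rosenblatt's original machines; the function names, the learning rate, the epoch limit and the training data are all assumptions made for illustration.

```python
def predict(weights, bias, x):
    """Threshold unit: output 1 when the weighted sum plus bias is positive."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else 0

def train_perceptron(samples, epochs=20, lr=1.0):
    """Perceptron rule: after each error, move the weights toward the correct answer."""
    n_inputs = len(samples[0][0])
    weights = [0.0] * n_inputs
    bias = 0.0
    for _ in range(epochs):
        errors = 0
        for x, target in samples:
            error = target - predict(weights, bias, x)
            if error != 0:
                weights = [w + lr * error * xi for w, xi in zip(weights, x)]
                bias += lr * error
                errors += 1
        if errors == 0:
            break  # every training example is classified correctly
    return weights, bias

# Logical AND is linearly separable, so the rule converges on it.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
and_weights, and_bias = train_perceptron(and_samples)

# Exclusive-or is not linearly separable, so no single unit can fit it.
xor_samples = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
xor_weights, xor_bias = train_perceptron(xor_samples)
```

On the AND data the trained unit classifies all four inputs correctly, while on the exclusive-or data at least one input is always misclassified, whatever the weights. This is one example of the kind of limitation of single-layer perceptrons that was analysed in the 1969 book.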
Another form of connectionist model was the relational network framework developed by the linguist Sydney Lamb in the 1960s.
The research group led by Widrow empirically searched for methods to train two-layered ADALINE networks, with limited success. [25] [26]
A method to train multilayered perceptrons with arbitrary levels of trainable weights was published by Alexey Grigorevich Ivakhnenko and Valentin Lapa in 1965, called the Group Method of Data Handling. [27] [28] [29] This method employs incremental layer by layer training based on regression analysis, where useless units in hidden layers are pruned with the help of a validation set. In 1972, Shun'ichi Amari made this architecture adaptive. [30] [27]
The first multilayered perceptrons trained by stochastic gradient descent [31] were published in 1967 by Shun'ichi Amari. [32] [27] In computer experiments conducted by Amari's student Saito, a five-layer MLP with two modifiable layers learned useful internal representations to classify non-linearly separable pattern classes. [27]
The second wave began in the late 1980s, following the 1987 two-volume book about Parallel Distributed Processing (PDP) by James L. McClelland, David E. Rumelhart et al., which introduced a couple of improvements to the simple perceptron idea, such as intermediate processors (now known as "hidden layers") alongside input and output units, and the use of a sigmoid activation function instead of the old 'all-or-nothing' function. Their work, in turn, built upon that of John Hopfield, who was a key figure investigating the mathematical characteristics of sigmoid activation functions. [3]
As connectionism became increasingly popular in the late 1980s, some researchers (including Jerry Fodor, Steven Pinker and others) reacted against it. They argued that connectionism, as then developing, threatened to obliterate what they saw as the progress being made in the fields of cognitive science and psychology by the classical approach of computationalism. Computationalism is a specific form of cognitivism that argues that mental activity is computational, that is, that the mind operates by performing purely formal operations on symbols, like a Turing machine. Some researchers argued that the trend in connectionism represented a reversion toward associationism and the abandonment of the idea of a language of thought, something they saw as mistaken. In contrast, those very tendencies made connectionism attractive for other researchers.
Connectionism and computationalism need not be at odds, but the debate in the late 1980s and early 1990s led to opposition between the two approaches. Throughout the debate, some researchers have argued that connectionism and computationalism are fully compatible, though full consensus on this issue has not been reached. Differences between the two approaches include the following:
Despite these differences, some theorists have proposed that the connectionist architecture is simply the manner in which organic brains happen to implement the symbol-manipulation system. This is logically possible, as it is well known that connectionist models can implement symbol-manipulation systems of the kind used in computationalist models, [33] as indeed they must be able to do if they are to explain the human ability to perform symbol-manipulation tasks. Several cognitive models combining both symbol-manipulative and connectionist architectures have been proposed. Among them are Paul Smolensky's Integrated Connectionist/Symbolic Cognitive Architecture (ICS) [8] [34] and Ron Sun's CLARION cognitive architecture. But the debate rests on whether this symbol manipulation forms the foundation of cognition in general, so this is not a potential vindication of computationalism. Nonetheless, computational descriptions may be helpful high-level descriptions of, for example, the cognition of logic.
The debate was largely centred on logical arguments about whether connectionist networks could produce the syntactic structure observed in this sort of reasoning. This was later achieved, although by using fast-variable binding abilities outside of those standardly assumed in connectionist models. [33] [35]
Part of the appeal of computational descriptions is that they are relatively easy to interpret, and thus may be seen as contributing to our understanding of particular mental processes, whereas connectionist models are in general more opaque, to the extent that they may be describable only in very general terms (such as specifying the learning algorithm, the number of units, etc.), or in unhelpfully low-level terms. In this sense, connectionist models may instantiate, and thereby provide evidence for, a broad theory of cognition (i.e., connectionism), without representing a helpful theory of the particular process that is being modelled. In this sense, the debate might be considered as to some extent reflecting a mere difference in the level of analysis in which particular theories are framed. Some researchers suggest that the analysis gap is the consequence of connectionist mechanisms giving rise to emergent phenomena that may be describable in computational terms. [36]
In the 2000s, the popularity of dynamical systems in philosophy of mind added a new perspective on the debate; [37] [38] some authors [which?] now argue that any split between connectionism and computationalism is more conclusively characterized as a split between computationalism and dynamical systems.
In 2014, Alex Graves and others from DeepMind published a series of papers describing a novel Deep Neural Network structure called the Neural Turing Machine [39] able to read symbols on a tape and store symbols in memory. Relational Networks, another Deep Network module published by DeepMind, are able to create object-like representations and manipulate them to answer complex questions. Relational Networks and Neural Turing Machines are further evidence that connectionism and computationalism need not be at odds.
Smolensky's Subsymbolic Paradigm [40] [41] has to meet the Fodor-Pylyshyn challenge [42] [43] [44] [45] formulated by classical symbol theory for a convincing theory of cognition in modern connectionism. In order to be an adequate alternative theory of cognition, Smolensky's Subsymbolic Paradigm would have to explain the existence of systematicity or systematic relations in language cognition without the assumption that cognitive processes are causally sensitive to the classical constituent structure of mental representations. The subsymbolic paradigm, or connectionism in general, would thus have to explain the existence of systematicity and compositionality without relying on the mere implementation of a classical cognitive architecture. This challenge implies a dilemma: If the Subsymbolic Paradigm could contribute nothing to the systematicity and compositionality of mental representations, it would be insufficient as a basis for an alternative theory of cognition. However, if the Subsymbolic Paradigm's contribution to systematicity requires mental processes grounded in the classical constituent structure of mental representations, the theory of cognition it develops would be, at best, an implementation architecture of the classical model of symbol theory and thus not a genuine alternative (connectionist) theory of cognition. [46] The classical model of symbolism is characterized by (1) a combinatorial syntax and semantics of mental representations and (2) mental operations as structure-sensitive processes, based on the fundamental principle of syntactic and semantic constituent structure of mental representations as used in Fodor's "Language of Thought (LOT)". [47] [48] This can be used to explain the following closely related properties of human cognition, namely its (1) productivity, (2) systematicity, (3) compositionality, and (4) inferential coherence. [49]
This challenge has been met in modern connectionism, for example, not only by Smolensky's "Integrated Connectionist/Symbolic (ICS) Cognitive Architecture", [50] [51] but also by Werning and Maye's "Oscillatory Networks". [52] [53] [54] An overview of this is given for example by Bechtel & Abrahamsen, [55] Marcus [56] and Maurer. [57]
Cognitive science is the interdisciplinary, scientific study of the mind and its processes with input from linguistics, psychology, neuroscience, philosophy, computer science/artificial intelligence, and anthropology. It examines the nature, the tasks, and the functions of cognition. Cognitive scientists study intelligence and behavior, with a focus on how nervous systems represent, process, and transform information. Mental faculties of concern to cognitive scientists include language, perception, memory, attention, reasoning, and emotion; to understand these faculties, cognitive scientists borrow from fields such as linguistics, psychology, artificial intelligence, philosophy, neuroscience, and anthropology. The typical analysis of cognitive science spans many levels of organization, from learning and decision to logic and planning; from neural circuitry to modular brain organization. One of the fundamental concepts of cognitive science is that "thinking can best be understood in terms of representational structures in the mind and computational procedures that operate on those structures."
Synchronization is the coordination of events to operate a system in unison. For example, the conductor of an orchestra keeps the orchestra synchronized or in time. Systems that operate with all parts in synchrony are said to be synchronous or in sync—and those that are not are asynchronous.
Cognitive neuroscience is the scientific field that is concerned with the study of the biological processes and aspects that underlie cognition, with a specific focus on the neural connections in the brain which are involved in mental processes. It addresses the questions of how cognitive activities are affected or controlled by neural circuits in the brain. Cognitive neuroscience is a branch of both neuroscience and psychology, overlapping with disciplines such as behavioral neuroscience, cognitive psychology, physiological psychology and affective neuroscience. Cognitive neuroscience relies upon theories in cognitive science coupled with evidence from neurobiology, and computational modeling.
Jerry Alan Fodor was an American philosopher and the author of many crucial works in the fields of philosophy of mind and cognitive science. His writings in these fields laid the groundwork for the modularity of mind and the language of thought hypotheses, and he is recognized as having had "an enormous influence on virtually every portion of the philosophy of mind literature since 1960." At the time of his death in 2017, he held the position of State of New Jersey Professor of Philosophy, Emeritus, at Rutgers University, and had taught previously at the City University of New York Graduate Center and MIT.
Modularity of mind is the notion that a mind may, at least in part, be composed of innate neural structures or mental modules which have distinct, established, and evolutionarily developed functions. However, different definitions of "module" have been proposed by different authors. According to Jerry Fodor, the author of Modularity of Mind, a system can be considered 'modular' if its functions are made of multiple dimensions or units to some degree. One example of modularity in the mind is binding. When one perceives an object, they take in not only the features of an object, but the integrated features that can operate in sync or independently that create a whole. Instead of just seeing red, round, plastic, and moving, the subject may experience a rolling red ball. Binding may suggest that the mind is modular because it takes multiple cognitive processes to perceive one thing.
The language of thought hypothesis (LOTH), sometimes known as thought ordered mental expression (TOME), is a view in linguistics, philosophy of mind and cognitive science, forwarded by American philosopher Jerry Fodor. It describes the nature of thought as possessing "language-like" or compositional structure. On this view, simple concepts combine in systematic ways to build thoughts. In its most basic form, the theory states that thought, like language, has syntax.
The consciousness and binding problem is the problem of how objects, background and abstract or emotional features are combined into a single experience.
Paul Smolensky is Krieger-Eisenhower Professor of Cognitive Science at the Johns Hopkins University and a Senior Principal Researcher at Microsoft Research, Redmond Washington.
James Lloyd "Jay" McClelland, FBA is the Lucie Stern Professor at Stanford University, where he was formerly the chair of the Psychology Department. He is best known for his work on statistical learning and Parallel Distributed Processing, applying connectionist models to explain cognitive phenomena such as spoken word recognition and visual word recognition. McClelland is to a large extent responsible for the large increase in scientific interest in connectionism in the 1980s.
A cognitive architecture refers to both a theory about the structure of the human mind and to a computational instantiation of such a theory used in the fields of artificial intelligence (AI) and computational cognitive science. The formalized models can be used to further refine a comprehensive theory of cognition and as a useful artificial intelligence program. Successful cognitive architectures include ACT-R and SOAR. The research on cognitive architectures as software instantiation of cognitive theories was initiated by Allen Newell in 1990.
A neural network, also called a neuronal network, is an interconnected population of neurons. Biological neural networks are studied to understand the organization and functioning of nervous systems.
Computational cognition is the study of the computational basis of learning and inference by mathematical modeling, computer simulation, and behavioral experiments. In psychology, it is an approach which develops computational models based on experimental results. It seeks to understand the basis behind the human method of processing of information. Early on computational cognitive scientists sought to bring back and create a scientific form of Brentano's psychology.
Neurophilosophy or philosophy of neuroscience is the interdisciplinary study of neuroscience and philosophy that explores the relevance of neuroscientific studies to the arguments traditionally categorized as philosophy of mind. The philosophy of neuroscience attempts to clarify neuroscientific methods and results using the conceptual rigor and methods of philosophy of science.
The cognitive revolution was an intellectual movement that began in the 1950s as an interdisciplinary study of the mind and its processes, from which emerged a new field known as cognitive science. The preexisting relevant fields were psychology, linguistics, computer science, anthropology, neuroscience, and philosophy. The approaches used were developed within the then-nascent fields of artificial intelligence, computer science, and neuroscience. In the 1960s, the Harvard Center for Cognitive Studies and the Center for Human Information Processing at the University of California, San Diego were influential in developing the academic study of cognitive science. By the early 1970s, the cognitive movement had surpassed behaviorism as a psychological paradigm. Furthermore, by the early 1980s the cognitive approach had become the dominant line of research inquiry across most branches in the field of psychology.
David Everett Rumelhart was an American psychologist who made many contributions to the formal analysis of human cognition, working primarily within the frameworks of mathematical psychology, symbolic artificial intelligence, and parallel distributed processing. He also admired formal linguistic approaches to cognition, and explored the possibility of formulating a formal grammar to capture the structure of stories.
Neural computation is the information processing performed by networks of neurons. Neural computation is affiliated with the philosophical tradition known as the computational theory of mind, also referred to as computationalism, which advances the thesis that neural computation explains cognition. The first people to propose an account of neural activity as being computational were Warren McCulloch and Walter Pitts in their seminal 1943 paper, A Logical Calculus of the Ideas Immanent in Nervous Activity. There are three general branches of computationalism: classicism, connectionism, and computational neuroscience. All three branches agree that cognition is computation; however, they disagree on what sorts of computations constitute cognition. The classicist tradition holds that computation in the brain is digital, analogous to digital computing. Neither connectionism nor computational neuroscience requires that the computations that realize cognition be digital. However, the two branches disagree greatly about which sorts of experimental data should be used to construct explanatory models of cognitive phenomena. Connectionists rely upon behavioral evidence to construct models to explain cognitive phenomena, whereas computational neuroscience leverages neuroanatomical and neurophysiological information to construct mathematical models that explain cognition.
In philosophy of mind, the computational theory of mind (CTM), also known as computationalism, is a family of views that hold that the human mind is an information processing system and that cognition and consciousness together are a form of computation. Warren McCulloch and Walter Pitts (1943) were the first to suggest that neural activity is computational. They argued that neural computations explain cognition. The theory was proposed in its modern form by Hilary Putnam in 1967, and developed by his PhD student, philosopher, and cognitive scientist Jerry Fodor in the 1960s, 1970s, and 1980s. It was vigorously disputed in analytic philosophy in the 1990s due to work by Putnam himself, John Searle, and others.
Ron Sun is a cognitive scientist who made significant contributions to computational psychology and other areas of cognitive science and artificial intelligence. He is currently professor of cognitive sciences at Rensselaer Polytechnic Institute, and formerly the James C. Dowell Professor of Engineering and Professor of Computer Science at University of Missouri. He received his Ph.D. in 1992 from Brandeis University.
Harmonic grammar is a linguistic model proposed by Geraldine Legendre, Yoshiro Miyata, and Paul Smolensky in 1990. It is a connectionist approach to modeling linguistic well-formedness. During the late 2000s and early 2010s, the term 'harmonic grammar' has been used to refer more generally to models of language that use weighted constraints, including ones that are not explicitly connectionist – see e.g. Pater (2009) and Potts et al. (2010).
In the philosophy of artificial intelligence, GOFAI is classical symbolic AI, as opposed to other approaches, such as neural networks, situated robotics, narrow symbolic AI or neuro-symbolic AI. The term was coined by philosopher John Haugeland in his 1985 book Artificial Intelligence: The Very Idea.