A logical calculus of the ideas immanent in nervous activity


A logical calculus of the ideas immanent in nervous activity is a 1943 article written by Warren McCulloch and Walter Pitts, published in the journal The Bulletin of Mathematical Biophysics. [1] The paper proposed a mathematical model of the nervous system as a network of simple logical elements, later known as artificial neurons, or McCulloch-Pitts neurons. These neurons receive inputs, perform a weighted sum, and fire an output signal based on a threshold function. By connecting these units in various configurations, McCulloch and Pitts demonstrated that their model could perform all logical functions.
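
A minimal sketch (hypothetical Python, not from the paper) of how such threshold units realize basic logical functions; the weights and thresholds shown are illustrative choices.

```python
# Hypothetical sketch of McCulloch-Pitts style threshold units computing
# basic logic gates. Weights and thresholds are illustrative, not taken
# from the 1943 paper.

def mp_neuron(inputs, weights, threshold):
    """Fire (return 1) iff the weighted sum of the inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

def and_gate(a, b):
    return mp_neuron([a, b], weights=[1, 1], threshold=2)

def or_gate(a, b):
    return mp_neuron([a, b], weights=[1, 1], threshold=1)

def not_gate(a):
    # A single inhibitory (negative) synapse; the unit fires unless inhibited.
    return mp_neuron([a], weights=[-1], threshold=0)

for a in (0, 1):
    for b in (0, 1):
        print(a, b, and_gate(a, b), or_gate(a, b), not_gate(a))
```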


It is a seminal work in computational neuroscience, computer science, and artificial intelligence. It was a foundational result in automata theory. John von Neumann cited it as a significant result. [2]

Mathematics

The artificial neuron used in the original paper is slightly different from the modern version. They considered neural networks that operate in discrete time steps $t = 0, 1, 2, \dots$.

The neural network contains a number of neurons. Let the state of neuron $i$ at time $t$ be $N_i(t)$. The state of a neuron can either be 0 or 1, standing for "not firing" and "firing". Each neuron also has a firing threshold $\theta_i$, such that it fires if the total input exceeds the threshold.

Each neuron can connect to any other neuron (including itself) with positive synapses (excitatory) or negative synapses (inhibitory). That is, each neuron $j$ can connect to a neuron $i$ with a weight $w_{ij}$ taking an integer value. A peripheral afferent is a neuron with no incoming synapses.

We can regard each neural network as a directed graph, with the nodes being the neurons and the directed edges being the synapses. A neural network contains a circle, or circuit, iff there exists a directed cycle in the graph.

Let $w_{ij}$ be the connection weight from neuron $j$ to neuron $i$ at time $t$; then its next state is
$$N_i(t+1) = H\!\left(\sum_j w_{ij}\, N_j(t) - \theta_i\right)$$
where $H$ is the Heaviside step function (outputting 1 if the input is greater than or equal to 0, and 0 otherwise).
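
A minimal sketch of this update rule in Python, assuming a fixed weight matrix and thresholds; the particular three-neuron network is an illustrative choice, not one from the paper.

```python
import numpy as np

# Sketch of the synchronous discrete-time update N(t+1) = H(W @ N(t) - theta),
# where H is the Heaviside step function. Weights, thresholds, and the
# initial state are illustrative assumptions.

def step(state, W, theta):
    """One synchronous update of every neuron."""
    return (W @ state - theta >= 0).astype(int)

# W[i][j] is the weight from neuron j to neuron i. Neuron 2 fires at t+1
# iff neuron 0 fired and neuron 1 did not fire at t (one excitatory and
# one inhibitory synapse onto it).
W = np.array([[0,  0, 0],
              [0,  0, 0],
              [1, -1, 0]])
theta = np.array([1, 1, 1])

state = np.array([1, 0, 0])
for t in range(3):
    print(t, state)
    state = step(state, W, theta)
```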

Symbolic logic

The paper used, as a logical language for describing neural networks, Language II from The Logical Syntax of Language by Rudolf Carnap, with some notation taken from Principia Mathematica by Alfred North Whitehead and Bertrand Russell. Language II covers substantial parts of classical mathematics, including real analysis and portions of set theory. [3]

To describe a neural network with peripheral afferents $N_1, \dots, N_p$ and non-peripheral neurons $N_{p+1}, \dots, N_n$, they considered logical predicates of the form
$$Pr(N_1, N_2, \dots, N_p, t)$$
where $Pr$ is a first-order logic predicate function (a function that outputs a boolean), $N_1, \dots, N_p$ are predicates that take $t$ as an argument, and $t$ is the only free variable in the predicate. Intuitively speaking, $N_1, \dots, N_p$ specify the binary input patterns going into the neural network over all time, and $Pr$ is a function that takes some binary input patterns and constructs an output binary pattern over time.

A logical sentence is realized by a neural network iff there exist a time-delay $\tau$, a neuron $i$ in the network, and an initial state for the non-peripheral neurons, such that for any time $t$, the truth-value of the logical sentence is equal to the state of neuron $i$ at time $t + \tau$. That is,
$$Pr(N_1, \dots, N_p, t) \equiv N_i(t + \tau).$$

Equivalence

In the paper, they considered some alternative definitions of artificial neural networks and showed them to be equivalent, that is, neural networks under one definition realize precisely the same logical sentences as neural networks under another definition.

They considered three forms of inhibition: relative inhibition, absolute inhibition, and extinction. The definition above is relative inhibition. By "absolute inhibition" they meant that if any negative synapse fires, then the neuron will not fire. By "extinction" they meant that if at time $t_0$ any inhibitory synapse fires on a neuron $i$, then its threshold is raised to $\theta_i + b_j$ at time $t_0 + j$, until the next time an inhibitory synapse fires on $i$. It is required that $b_j = 0$ for all large $j$.

Theorems 4 and 5 state that these are equivalent.
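
The difference between relative and absolute inhibition can be illustrated with a short sketch (hypothetical Python; the weights and threshold are illustrative choices, not from the paper).

```python
# Contrast between two of the inhibition rules described above.

def fire_relative(excit, inhib, threshold):
    """Relative inhibition: inhibitory inputs merely subtract from the sum."""
    return 1 if sum(excit) - sum(inhib) >= threshold else 0

def fire_absolute(excit, inhib, threshold):
    """Absolute inhibition: any active inhibitory synapse vetoes firing."""
    if any(inhib):
        return 0
    return 1 if sum(excit) >= threshold else 0

# Two excitatory inputs active, one inhibitory input active, threshold 1:
print(fire_relative([1, 1], [1], 1))  # 1: net input 2 - 1 = 1 reaches the threshold
print(fire_absolute([1, 1], [1], 1))  # 0: the inhibitory synapse vetoes firing
```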

They considered three forms of excitation: spatial summation, temporal summation, and facilitation. The definition above is spatial summation (which they pictured as having multiple synapses placed close together, so that the effect of their firing sums up). By "temporal summation" they meant that the total incoming signal to neuron $i$ at time $t$ is $\sum_{k=0}^{K-1} \sum_j w_{ij}\, N_j(t-k)$ for some $K \geq 1$. By "facilitation" they meant the same as extinction, except that $b_j \leq 0$, so that the threshold is temporarily lowered rather than raised. Theorem 6 states that these are equivalent.
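
A sketch of temporal summation, assuming an illustrative one-input neuron and window length (not an example from the paper): inputs arriving within a short window of past time steps are pooled before the threshold is applied.

```python
# Hypothetical illustration of temporal summation: inputs are pooled over
# the last `window` time steps. Spike train, threshold, and window length
# are illustrative assumptions.

def fires_with_temporal_summation(history, threshold, window):
    """Fire iff the inputs arriving in the last `window` steps sum to the threshold."""
    recent = history[-window:]
    return 1 if sum(sum(step) for step in recent) >= threshold else 0

# One input line firing on two consecutive steps; threshold 2 is never met
# in a single step but is met once the two steps are pooled.
history = [[1], [1]]
print(fires_with_temporal_summation(history, threshold=2, window=1))  # 0
print(fires_with_temporal_summation(history, threshold=2, window=2))  # 1
```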

They considered neural networks that do not change, and those that change by Hebbian learning. That is, they assume that at time $t = 0$, some excitatory synaptic connections are latent (not active). If at any time $t$ two neurons $i$ and $j$ both fire, $N_i(t) = N_j(t) = 1$, then any latent excitatory synapse between $i$ and $j$ becomes active. Theorem 7 states that these are equivalent.
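
A sketch of this latent-synapse rule, with an illustrative pair of neurons and spike trains (hypothetical Python, not from the paper).

```python
# Hypothetical illustration of the Hebbian rule described above: a latent
# excitatory synapse between two neurons becomes active once both fire at
# the same time step.

latent = {(0, 1)}   # latent excitatory synapses, as (pre, post) pairs
active = set()

# States of neurons 0 and 1 over a few illustrative time steps.
states = [(1, 0), (0, 1), (1, 1), (0, 0)]

for t, (n0, n1) in enumerate(states):
    if n0 == 1 and n1 == 1 and (0, 1) in latent:
        latent.discard((0, 1))
        active.add((0, 1))
        print(f"t={t}: latent synapse 0->1 becomes active")

print("active synapses:", active)
```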

Logical expressivity

They considered "temporal propositional expressions" (TPE), which are propositional formulas with one free variable . For example, is such an expression. Theorem 1 and 2 together showed that neural nets without circles are equivalent to TPE.

For neural nets with loops, they noted that realizable predicates "may involve reference to past events of an indefinite degree of remoteness". These then encode sentences like "There was some x such that x was a ψ", i.e. $(\exists x)\,\psi(x)$. Theorems 8 to 10 showed that neural nets with loops can encode all of first-order logic with equality and, conversely, that any looped neural network is equivalent to a sentence in first-order logic with equality, thus showing that they are equivalent in logical expressiveness. [4]
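
A sketch of how a loop permits reference to the indefinite past: a self-excitatory output neuron latches once the input has ever fired, roughly realizing "there was some past time at which the input fired". The wiring is an illustrative assumption, not an example from the paper.

```python
# Hypothetical latch built from a single self-excitatory loop.

def heaviside(x):
    return 1 if x >= 0 else 0

def run_latch(input_spikes):
    n2 = 0  # output neuron with a self-loop of weight 1 and threshold 1
    trace = []
    for n1 in input_spikes:
        n2 = heaviside(1 * n1 + 1 * n2 - 1)
        trace.append(n2)
    return trace

print(run_latch([0, 0, 1, 0, 0]))  # [0, 0, 1, 1, 1]: stays on after the input fires
```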

As a remark, they noted that a neural network, if furnished with a tape, scanners, and write-heads, is equivalent to a Turing machine, and conversely, every Turing machine is equivalent to some such neural network. Thus, these neural networks are computationally equivalent to Turing computability, Church's lambda-definability, and Kleene's primitive recursiveness.

Context

Previous work

The paper built upon several previous strands of work. [5] [6]

On the symbolic logic side, it built on the previous work by Carnap, Whitehead, and Russell. This contribution came from Walter Pitts, who had a strong proficiency with symbolic logic. Pitts provided mathematical and logical rigor to McCulloch's vague ideas on psychons (atoms of psychological events) and circular causality. [7]

On the neuroscience side, it built on previous work by the mathematical biology research group centered around Nicolas Rashevsky, of which McCulloch was a member. The paper was published in the Bulletin of Mathematical Biophysics, which Rashevsky had founded in 1939. During the late 1930s, Rashevsky's research group was producing papers that had difficulty getting published in other journals at the time, so Rashevsky decided to found a new journal devoted exclusively to mathematical biophysics. [8]

Also in Rashevsky's group was Alston Scott Householder, who in 1941 published an abstract model of the steady-state activity of biological neural networks. The model is, in modern language, an artificial neural network with a ReLU activation function. [9] In a series of papers, Householder calculated the stable states of very simple networks: a chain, a circle, and a bouquet. Walter Pitts' first two papers formulated a mathematical theory of learning and conditioning; the next three were mathematical developments of Householder's model. [10]

In 1938, at age 15, Pitts ran away from home in Detroit and arrived at the University of Chicago. Later, he walked into Rudolf Carnap's office with Carnap's book filled with corrections and suggested improvements. He began studying under Carnap and attended classes during 1938--1943. He wrote several early papers on neuronal network modelling and regularly attended Rashevsky's seminars in theoretical biology. The seminar attendees included Gerhard von Bonin and Householder. In 1940, von Bonin introduced Jerome Lettvin to McCulloch. In 1942, both Lettvin and Pitts had moved into McCulloch's home. [11]

McCulloch had been interested in circular causality from studies of causalgia after amputation, the epileptic activity of surgically isolated brains, and Lorente de Nó's research showing that recurrent neural networks are needed to explain vestibular nystagmus. He had difficulty treating circular causality until Pitts demonstrated how it could be handled with the appropriate mathematical tools of modular arithmetic and symbolic logic. [4] [10]

Both authors' affiliation in the article was given as "University of Illinois, College of Medicine, Department of Psychiatry at the Illinois Neuropsychiatric Institute, University of Chicago, Chicago, U.S.A."

Subsequent work

As a foundational result in automata theory, the paper led to subsequent work on neural networks and their link to finite automata. [12] Marvin Minsky was influenced by McCulloch, built an early neural network machine, SNARC (1951), and wrote a PhD thesis on neural networks (1954). [13]

McCulloch chaired the ten Macy conferences (1946--1953) on "Circular Causal and Feedback Mechanisms in Biological and Social Systems". These were a key event in the beginnings of cybernetics and of what later became known as cognitive science. Pitts also attended the conferences. [14]

In the 1943 paper, they described how memories can be formed by a neural network with loops in it, or with alterable synapses, operating over time and implementing the logical universals "there exists" and "for all". This was generalized to spatial objects, such as geometric figures, in their 1947 paper How we know universals. [15] Norbert Wiener found this significant evidence of a general method by which animals recognize objects: scanning a scene over multiple transformations and finding a canonical representation. He hypothesized that this "scanning" activity is clocked by the alpha wave, which he mistakenly thought was tightly regulated at 10 Hz (rather than the 8 -- 13 Hz range found by modern research). [16]

McCulloch worked with Manuel Blum on studying how a neural network can be "logically stable", that is, how it can implement a given boolean function even if the activation thresholds of individual neurons are varied. [17] :64 They were inspired by the problem of how the brain can perform the same functions, such as breathing, under the influence of caffeine or alcohol, which shift activation thresholds across the entire brain. [4]
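
The stability question can be illustrated with a sketch (hypothetical Python; the weights and Boolean function are illustrative): a unit whose attainable weighted sums are well separated keeps computing the same function under moderate threshold shifts, while a large enough shift changes it.

```python
from itertools import product

# Hypothetical illustration of "logical stability": does a threshold unit
# keep computing the same Boolean function when its threshold is shifted?

def unit(inputs, weights, threshold):
    return 1 if sum(w * x for w, x in zip(weights, inputs)) >= threshold else 0

def truth_table(weights, threshold):
    return tuple(unit(x, weights, threshold) for x in product([0, 1], repeat=len(weights)))

weights = [2, 2, 2]                       # 3-input majority, with well-separated sums
reference = truth_table(weights, threshold=3)

# Shift the threshold, as a global change in excitability might:
for theta in (3, 4, 5):
    same = truth_table(weights, theta) == reference
    print(f"threshold {theta}: same function = {same}")  # True, True, False
```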


Related Research Articles

Warren Sturgis McCulloch was an American neuropsychologist and cybernetician known for his foundational work on certain brain theories and his contribution to the cybernetics movement. Along with Walter Pitts, McCulloch created computational models based on mathematical algorithms called threshold logic, which split the inquiry into two distinct approaches: one focused on biological processes in the brain, the other on the application of neural networks to artificial intelligence.

In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.
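
A minimal sketch of the perceptron decision rule and the classic error-driven update, on illustrative toy data (hypothetical Python, not from any particular source).

```python
# Hypothetical perceptron sketch: linear decision rule plus the classic update.

def predict(w, b, x):
    """Linear predictor: classify as 1 iff w . x + b >= 0."""
    return 1 if sum(wi * xi for wi, xi in zip(w, x)) + b >= 0 else 0

def train(samples, epochs=10, lr=1.0):
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in samples:
            error = y - predict(w, b, x)
            w = [wi + lr * error * xi for wi, xi in zip(w, x)]
            b += lr * error
    return w, b

# Linearly separable toy data: label 1 iff x0 + x1 >= 1 (i.e., OR).
data = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
w, b = train(data)
print([predict(w, b, x) for x, _ in data])  # expected: [0, 1, 1, 1]
```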

<span class="mw-page-title-main">Connectionism</span> Cognitive science approach

Connectionism is the name of an approach to the study of human mental processes and cognition that utilizes mathematical models known as connectionist networks or artificial neural networks. Connectionism has had many "waves" since its beginnings.

<span class="mw-page-title-main">Walter Pitts</span> American logician and computational neuroscientist

Walter Harry Pitts, Jr. was an American logician who worked in the field of computational neuroscience. He proposed landmark theoretical formulations of neural activity and generative processes that influenced diverse fields such as cognitive sciences and psychology, philosophy, neurosciences, computer science, artificial neural networks, cybernetics and artificial intelligence, together with what has come to be known as the generative sciences. He is best remembered for having written, along with Warren McCulloch, a seminal paper in scientific history, titled A Logical Calculus of Ideas Immanent in Nervous Activity (1943). This paper proposed the first mathematical model of a neural network. The unit of this model, a simple formalized neuron, is still the standard of reference in the field of neural networks. It is often called a McCulloch–Pitts neuron. Prior to that paper, he formalized his ideas regarding the fundamental steps to building a Turing machine in The Bulletin of Mathematical Biophysics in an essay titled "Some observations on the simple neuron circuit".

<span class="mw-page-title-main">Artificial neuron</span> Mathematical function conceived as a crude model

An artificial neuron is a mathematical function conceived as a model of biological neurons in a neural network. Artificial neurons are the elementary units of artificial neural networks. The artificial neuron is a function that receives one or more inputs, applies weights to these inputs, and sums them to produce an output.

Hebbian theory is a neuropsychological theory claiming that an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell. It is an attempt to explain synaptic plasticity, the adaptation of brain neurons during the learning process. It was introduced by Donald Hebb in his 1949 book The Organization of Behavior. The theory is also called Hebb's rule, Hebb's postulate, and cell assembly theory. Hebb states it as follows:

Let us assume that the persistence or repetition of a reverberatory activity tends to induce lasting cellular changes that add to its stability. ... When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.

A Hopfield network is a form of recurrent neural network, or a spin glass system, that can serve as a content-addressable memory. The Hopfield network, named for John Hopfield, consists of a single layer of neurons, where each neuron is connected to every other neuron except itself. These connections are bidirectional and symmetric, meaning the weight of the connection from neuron i to neuron j is the same as the weight from neuron j to neuron i. Patterns are associatively recalled by fixing certain inputs, and dynamically evolve the network to minimize an energy function, towards local energy minimum states that correspond to stored patterns. Patterns are associatively learned by a Hebbian learning algorithm.
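
A minimal sketch of Hopfield-style recall with a Hebbian (outer-product) weight matrix; the stored pattern, probe, and update schedule are illustrative assumptions.

```python
import numpy as np

# Hypothetical Hopfield recall sketch: one stored +/-1 pattern, Hebbian
# outer-product weights, asynchronous threshold updates.

pattern = np.array([1, -1, 1, -1, 1])           # one stored pattern
W = np.outer(pattern, pattern).astype(float)     # Hebbian outer product
np.fill_diagonal(W, 0)                           # no self-connections

state = np.array([1, -1, -1, -1, 1])             # corrupted probe (one flipped bit)
for _ in range(5):                               # asynchronous updates
    for i in range(len(state)):
        state[i] = 1 if W[i] @ state >= 0 else -1

print(np.array_equal(state, pattern))            # expected: True
```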

Recurrent neural networks (RNNs) are a class of artificial neural networks for sequential data processing. Unlike feedforward neural networks, which process data in a single pass, RNNs process data across multiple time steps, making them well-adapted for modelling and processing text, speech, and time series.

<span class="mw-page-title-main">Feedforward neural network</span> Type of artificial neural network

A feedforward neural network (FNN) is one of the two broad types of artificial neural network, characterized by direction of the flow of information between its layers. Its flow is uni-directional, meaning that the information in the model flows in only one direction—forward—from the input nodes, through the hidden nodes and to the output nodes, without any cycles or loops. Modern feedforward networks are trained using backpropagation, and are colloquially referred to as "vanilla" neural networks.

<span class="mw-page-title-main">Neural circuit</span> Network or circuit of neurons

A neural circuit is a population of neurons interconnected by synapses to carry out a specific function when activated. Multiple neural circuits interconnect with one another to form large scale brain networks.

<span class="mw-page-title-main">Neural network (biology)</span> Structure in nervous systems

A neural network, also called a neuronal network, is an interconnected population of neurons. Biological neural networks are studied to understand the organization and functioning of nervous systems.

A multilayer perceptron (MLP) is a name for a modern feedforward artificial neural network, consisting of fully connected neurons with a nonlinear activation function, organized in at least three layers, notable for being able to distinguish data that is not linearly separable.

BCM theory, BCM synaptic modification, or the BCM rule, named for Elie Bienenstock, Leon Cooper, and Paul Munro, is a physical theory of learning in the visual cortex developed in 1981. The BCM model proposes a sliding threshold for long-term potentiation (LTP) or long-term depression (LTD) induction, and states that synaptic plasticity is stabilized by a dynamic adaptation of the time-averaged postsynaptic activity. According to the BCM model, when a pre-synaptic neuron fires, the post-synaptic neuron will tend to undergo LTP if it is in a high-activity state, or LTD if it is in a lower-activity state. This theory is often used to explain how cortical neurons can undergo both LTP or LTD depending on different conditioning stimulus protocols applied to pre-synaptic neurons.

Nicolas Rashevsky was an American theoretical physicist who was one of the pioneers of mathematical biology, and is also considered the father of mathematical biophysics and theoretical biology.

The Society for Mathematical Biology (SMB) is an international association co-founded in 1972 in the United States by George Karreman, Herbert Daniel Landahl and (initially chaired) by Anthony Bartholomay for the furtherance of joint scientific activities between Mathematics and Biology research communities. The society publishes the Bulletin of Mathematical Biology, as well as the quarterly SMB newsletter.

Models of neural computation are attempts to elucidate, in an abstract and mathematical fashion, the core principles that underlie information processing in biological nervous systems, or functional components thereof. This article aims to provide an overview of the most definitive models of neuro-biological computation as well as the tools commonly used to construct and analyze them.

Sparse distributed memory (SDM) is a mathematical model of human long-term memory introduced by Pentti Kanerva in 1988 while he was at NASA Ames Research Center.

The network of the human nervous system is composed of nodes that are connected by links. The connectivity may be viewed anatomically, functionally, or electrophysiologically. These are presented in several Wikipedia articles that include Connectionism, Biological neural network, Artificial neural network, Computational neuroscience, as well as in several books by Ascoli, G. A. (2002), Sterratt, D., Graham, B., Gillies, A., & Willshaw, D. (2011), Gerstner, W., & Kistler, W. (2002), and David Rumelhart, McClelland, J. L., and PDP Research Group (1986) among others. The focus of this article is a comprehensive view of modeling a neural network. Once an approach based on the perspective and connectivity is chosen, the models are developed at microscopic, mesoscopic, or macroscopic (system) levels. Computational modeling refers to models that are developed using computing tools.

<span class="mw-page-title-main">Theta model</span>

The theta model, or Ermentrout–Kopell canonical model, is a biological neuron model originally developed to mathematically describe neurons in the animal Aplysia. The model is particularly well-suited to describe neural bursting, which is characterized by periodic transitions between rapid oscillations in the membrane potential followed by quiescence. This bursting behavior is often found in neurons responsible for controlling and maintaining steady rhythms such as breathing, swimming, and digesting. Of the three main classes of bursting neurons, the theta model describes parabolic bursting, which is characterized by a parabolic frequency curve during each burst.

A neural network is a group of interconnected units called neurons that send signals to one another. Neurons can be either biological cells or mathematical models. While individual neurons are simple, many of them together in a network can perform complex tasks. There are two main types of neural network.

References

  1. McCulloch, Warren S.; Pitts, Walter (December 1943). "A logical calculus of the ideas immanent in nervous activity". The Bulletin of Mathematical Biophysics. 5 (4): 115–133. doi:10.1007/BF02478259. ISSN   0007-4985.
  2. von Neumann, J. (1951). The general and logical theory of automata. In L. A. Jeffress (Ed.), Cerebral mechanisms in behavior; the Hixon Symposium (pp. 1–41). Wiley.
  3. "Rudolf Carnap > G. Logical Syntax of Language (Stanford Encyclopedia of Philosophy)". plato.stanford.edu. Retrieved 2024-10-13.
  4. McCulloch, Warren (1961). "What is a number, that a man may know it, and a man, that he may know a Number" (PDF). General Semantics Bulletin (26 & 27): 7–18.
  5. Abraham, Tara H. (2002). "(Physio)logical circuits: The intellectual origins of the McCulloch-Pitts neural networks". Journal of the History of the Behavioral Sciences. 38 (1): 3–25. doi:10.1002/jhbs.1094. ISSN   0022-5061. PMID   11835218.
  6. Piccinini, Gualtiero (August 2004). "The First Computational Theory of Mind and Brain: A Close Look at Mcculloch and Pitts's "Logical Calculus of Ideas Immanent in Nervous Activity"". Synthese. 141 (2): 175–215. doi:10.1023/B:SYNT.0000043018.52445.3e. ISSN   0039-7857.
  7. Aizawa, Kenneth (September 2012). "Warren McCulloch's Turn to Cybernetics: What Walter Pitts Contributed". Interdisciplinary Science Reviews. 37 (3): 206–217. Bibcode:2012ISRv...37..206A. doi:10.1179/0308018812Z.00000000017. ISSN   0308-0188.
  8. Abraham, Tara H. (2004). "Nicolas Rashevsky's Mathematical Biophysics". Journal of the History of Biology. 37 (2): 333–385. doi:10.1023/B:HIST.0000038267.09413.0d. ISSN   0022-5010.
  9. Householder, Alston S. (June 1941). "A theory of steady-state activity in nerve-fiber networks: I. Definitions and preliminary lemmas". The Bulletin of Mathematical Biophysics. 3 (2): 63–69. doi:10.1007/BF02478220. ISSN   0007-4985.
  10. Schlatter, Mark; Aizawa, Ken (May 2008). "Walter Pitts and "A Logical Calculus"". Synthese. 162 (2): 235–250. doi:10.1007/s11229-007-9182-9. ISSN 0039-7857.
  11. Smalheiser, Neil R (December 2000). "Walter Pitts". Perspectives in Biology and Medicine. 43 (2): 217–226. doi:10.1353/pbm.2000.0009. ISSN   1529-8795. PMID   10804586.
  12. Kleene, S. C. (1956-12-31), Shannon, C. E.; McCarthy, J. (eds.), "Representation of Events in Nerve Nets and Finite Automata", Automata Studies. (AM-34), Princeton University Press, pp. 3–42, doi:10.1515/9781400882618-002, ISBN   978-1-4008-8261-8 , retrieved 2024-10-14
  13. Arbib, Michael A (2000). "Warren McCulloch's Search for the Logic of the Nervous System". Perspectives in Biology and Medicine. 43 (2): 193–216. doi:10.1353/pbm.2000.0001. ISSN   1529-8795. PMID   10804585.
  14. "Summary: The Macy Conferences". asc-cybernetics.org. Retrieved 2024-10-14.
  15. Pitts, Walter; McCulloch, Warren S. (1947-09-01). "How we know universals the perception of auditory and visual forms". The Bulletin of Mathematical Biophysics. 9 (3): 127–147. doi:10.1007/BF02478291. ISSN   1522-9602. PMID   20262674.
  16. Masani, P. R. (1990), "McCulloch, Pitts and the Evolution of Wiener's Neurophysiological Ideas", Norbert Wiener 1894–1964, Basel: Birkhäuser Basel, pp. 218–238, doi:10.1007/978-3-0348-9252-0_16, ISBN   978-3-0348-9963-5 , retrieved 2024-10-14
  17. Blum, Manuel (1961). "Properties of a neuron with many inputs". Bionics Symposium: Living Prototypes -- the Key to New Technology, 13--15 September 1960. WADD Technical Report 60-600.