Leabra

Last updated

Leabra stands for local, error-driven and associative, biologically realistic algorithm. It is a model of learning which is a balance between Hebbian and error-driven learning with other network-derived characteristics. This model is used to mathematically predict outcomes based on inputs and previous learning influences. This model is heavily influenced by and contributes to neural network designs and models. This algorithm is the default algorithm in emergent (successor of PDP++) when making a new project, and is extensively used in various simulations.

Contents

Hebbian learning is performed using conditional principal components analysis (CPCA) algorithm with correction factor for sparse expected activity levels.

Error-driven learning is performed using GeneRec, which is a generalization of the recirculation algorithm, and approximates Almeida–Pineda recurrent backpropagation. The symmetric, midpoint version of GeneRec is used, which is equivalent to the contrastive Hebbian learning algorithm (CHL). See O'Reilly (1996; Neural Computation) for more details.

The activation function is a point-neuron approximation with both discrete spiking and continuous rate-code output.

Layer or unit-group level inhibition can be computed directly using a k-winners-take-all (KWTA) function, producing sparse distributed representations.

The net input is computed as an average, not a sum, over connections, based on normalized, sigmoidally transformed weight values, which are subject to scaling on a connection-group level to alter relative contributions. Automatic scaling is performed to compensate for differences in expected activity level in the different projections.

Documentation about this algorithm can be found in the book "Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain" published by MIT press. [1] and in the Emergent Documentation

Overview of the leabra algorithm

The pseudocode for Leabra is given here, showing exactly how the pieces of the algorithm described in more detail in the subsequent sections fit together.

Iterate over minus and plus phases of settling for each event.  o At start of settling, for all units:    - Initialize all state variables (activation, v_m, etc).    - Apply external patterns (clamp input in minus, input & output in      plus).    - Compute net input scaling terms (constants, computed      here so network can be dynamically altered).    - Optimization: compute net input once from all static activations      (e.g., hard-clamped external inputs).  o During each cycle of settling, for all non-clamped units:    - Compute excitatory netinput (g_e(t), aka eta_j or net)       -- sender-based optimization by ignoring inactives.    - Compute kWTA inhibition for each layer, based on g_i^Q:      * Sort units into two groups based on g_i^Q: top k and        remaining k+1 -> n.      * If basic, find k and k+1th highest        If avg-based, compute avg of 1 -> k & k+1 -> n.      * Set inhibitory conductance g_i from g^Q_k and g^Q_k+1    - Compute point-neuron activation combining excitatory input and      inhibition  o After settling, for all units, record final settling activations    as either minus or plus phase (y^-_j or y^+_j). After both phases update the weights (based on linear current    weight values), for all connections:  o Compute error-driven weight changes with CHL with soft weight bounding  o Compute Hebbian weight changes with CPCA from plus-phase activations  o Compute net weight change as weighted sum of error-driven and Hebbian  o Increment the weights according to net weight change.

Implementations

Emergent is the original implementation of Leabra; its most recent implementation is written in Go. It was written chiefly by Dr. O'Reilly, but professional software engineers were recently hired to improve the existing codebase. This is the fastest implementation, suitable for constructing large networks. Although emergent has a graphical user interface, it is very complex and has a steep learning curve.

If you want to understand the algorithm in detail, it will be easier to read non-optimized code. For this purpose, check out the MATLAB version. There is also an R version available, that can be easily installed via install.packages("leabRa") in R and has a short introduction to how the package is used. The MATLAB and R versions are not suited for constructing very large networks, but they can be installed quickly and (with some programming background) are easy to use. Furthermore, they can also be adapted easily.

Special algorithms

Related Research Articles

Artificial neural network Computational model used in machine learning, based on connected, hierarchical functions

Artificial neural networks (ANNs), usually simply called neural networks (NNs) or, more simply yet, neural nets, are computing systems inspired by the biological neural networks that constitute animal brains.

Unsupervised learning Machine learning technique

Unsupervised learning is a type of algorithm that learns patterns from untagged data. The hope is that through mimicry, which is an important mode of learning in people, the machine is forced to build a compact internal representation of its world and then generate imaginative content from it. In contrast to supervised learning where data is tagged by an expert, e.g. as a "ball" or "fish", unsupervised methods exhibit self-organization that captures patterns as probability densities or a combination of neural feature preferences. The other levels in the supervision spectrum are reinforcement learning where the machine is given only a numerical performance score as guidance, and semi-supervised learning where a smaller portion of the data is tagged.

Connectionism Cognitive science approach

Connectionism refers to both an approach in the field of cognitive science that hopes to explain mental phenomena using artificial neural networks (ANN). and to a wide range of techniques and algorithms using ANNs in the context of artificial intelligence to build more intelligent machines. Connectionism presents a cognitive theory based on simultaneously occurring, distributed signal activity via connections that can be represented numerically, where learning occurs by modifying connection strengths based on experience.

Hebbian theory is a neuroscientific theory claiming that an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell. It is an attempt to explain synaptic plasticity, the adaptation of brain neurons during the learning process. It was introduced by Donald Hebb in his 1949 book The Organization of Behavior. The theory is also called Hebb's rule, Hebb's postulate, and cell assembly theory. Hebb states it as follows:

Let us assume that the persistence or repetition of a reverberatory activity tends to induce lasting cellular changes that add to its stability. ... When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.

Spike-timing-dependent plasticity (STDP) is a biological process that adjusts the strength of connections between neurons in the brain. The process adjusts the connection strengths based on the relative timing of a particular neuron's output and input action potentials. The STDP process partially explains the activity-dependent development of nervous systems, especially with regard to long-term potentiation and long-term depression.

A Hopfield network is a form of recurrent artificial neural network and a type of spin glass system popularised by John Hopfield in 1982 as described earlier by Little in 1974 based on Ernst Ising's work with Wilhelm Lenz on the Ising model. Hopfield networks serve as content-addressable ("associative") memory systems with binary threshold nodes, or with continuous variables. Hopfield networks also provide a model for understanding human memory.

Backpropagation Optimization algorithm for artificial neural networks

In machine learning, backpropagation is a widely used algorithm for training feedforward neural networks. Generalizations of backpropagation exist for other artificial neural networks (ANNs), and for functions generally. These classes of algorithms are all referred to generically as "backpropagation". In fitting a neural network, backpropagation computes the gradient of the loss function with respect to the weights of the network for a single input–output example, and does so efficiently, unlike a naive direct computation of the gradient with respect to each weight individually. This efficiency makes it feasible to use gradient methods for training multilayer networks, updating weights to minimize loss; gradient descent, or variants such as stochastic gradient descent, are commonly used. The backpropagation algorithm works by computing the gradient of the loss function with respect to each weight by the chain rule, computing the gradient one layer at a time, iterating backward from the last layer to avoid redundant calculations of intermediate terms in the chain rule; this is an example of dynamic programming.

Feedforward neural network Type of artificial neural network

A feedforward neural network (FNN) is an artificial neural network wherein connections between the nodes do not form a cycle. As such, it is different from its descendant: recurrent neural networks.

Neural network Structure in biology and artificial intelligence

A neural network is a network or circuit of biological neurons, or, in a modern sense, an artificial neural network, composed of artificial neurons or nodes. Thus, a neural network is either a biological neural network, made up of biological neurons, or an artificial neural network, used for solving artificial intelligence (AI) problems. The connections of the biological neuron are modeled in artificial neural networks as weights between nodes. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed. This activity is referred to as a linear combination. Finally, an activation function controls the amplitude of the output. For example, an acceptable range of output is usually between 0 and 1, or it could be −1 and 1.

Multilayer perceptron Type of feedforward neural network

A multilayer perceptron (MLP) is a fully connected class of feedforward artificial neural network (ANN). The term MLP is used ambiguously, sometimes loosely to mean any feedforward ANN, sometimes strictly to refer to networks composed of multiple layers of perceptrons ; see § Terminology. Multilayer perceptrons are sometimes colloquially referred to as "vanilla" neural networks, especially when they have a single hidden layer.

ADALINE Early single-layer artificial neural network

ADALINE is an early single-layer artificial neural network and the name of the physical device that implemented this network. The network uses memistors. It was developed by Professor Bernard Widrow and his doctorate student Ted Hoff at Stanford University in 1960. It is based on the McCulloch–Pitts neuron. It consists of a weight, a bias and a summation function.

Echo state network Type of reservoir computer

An echo state network (ESN) is a type of reservoir computer that uses a recurrent neural network with a sparsely connected hidden layer. The connectivity and weights of hidden neurons are fixed and randomly assigned. The weights of output neurons can be learned so that the network can produce or reproduce specific temporal patterns. The main interest of this network is that although its behaviour is non-linear, the only weights that are modified during training are for the synapses that connect the hidden neurons to output neurons. Thus, the error function is quadratic with respect to the parameter vector and can be differentiated easily to a linear system.

Oja's learning rule, or simply Oja's rule, named after Finnish computer scientist Erkki Oja, is a model of how neurons in the brain or in artificial neural networks change connection strength, or learn, over time. It is a modification of the standard Hebb's Rule that, through multiplicative normalization, solves all stability problems and generates an algorithm for principal components analysis. This is a computational form of an effect which is believed to happen in biological neurons.

Neural cryptography is a branch of cryptography dedicated to analyzing the application of stochastic algorithms, especially artificial neural network algorithms, for use in encryption and cryptanalysis.

Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the specialization of each node in the network. It is well suited to finding clusters within data.

GeneRec is a generalization of the recirculation algorithm, and approximates Almeida-Pineda recurrent backpropagation. It is used as part of the Leabra algorithm for error-driven learning.

Prefrontal cortex basal ganglia working memory (PBWM) is an algorithm that models working memory in the prefrontal cortex and the basal ganglia.

A Bayesian Confidence Propagation Neural Network (BCPNN) is an artificial neural network inspired by Bayes' theorem, which regards neural computation and processing as probabilistic inference. Neural unit activations represent probability ("confidence") in the presence of input features or categories, synaptic weights are based on estimated correlations and the spread of activation corresponds to calculating posterior probabilities. It was originally proposed by Anders Lansner and Örjan Ekeberg at KTH Royal Institute of Technology. This probabilistic neural network model can also be run in generative mode to produce spontaneous activations and temporal sequences.

An artificial neural network's learning rule or learning process is a method, mathematical logic or algorithm which improves the network's performance and/or training time. Usually, this rule is applied repeatedly over the network. It is done by updating the weights and bias levels of a network when a network is simulated in a specific data environment. A learning rule may accept existing conditions of the network and will compare the expected result and actual result of the network to give new and improved values for weights and bias. Depending on the complexity of actual model being simulated, the learning rule of the network can be as simple as an XOR gate or mean squared error, or as complex as the result of a system of differential equations.

Convolutional neural network Artificial neural network

In deep learning, a convolutional neural network is a class of artificial neural network (ANN), most commonly applied to analyze visual imagery. CNNs are also known as Shift Invariant or Space Invariant Artificial Neural Networks (SIANN), based on the shared-weight architecture of the convolution kernels or filters that slide along input features and provide translation-equivariant responses known as feature maps. Counter-intuitively, most convolutional neural networks are not invariant to translation, due to the downsampling operation they apply to the input. They have applications in image and video recognition, recommender systems, image classification, image segmentation, medical image analysis, natural language processing, brain–computer interfaces, and financial time series.

References

  1. O'Reilly, R. C., Munakata, Y. (2000). Computational explorations in cognitive neuroscience. Cambridge: MIT Press. ISBN   0-19-510491-9.{{cite book}}: CS1 maint: multiple names: authors list (link)