Leabra

Last updated

Leabra stands for local, error-driven and associative, biologically realistic algorithm. It is a model of learning which is a balance between Hebbian and error-driven learning with other network-derived characteristics. This model is used to mathematically predict outcomes based on inputs and previous learning influences. Leabra is heavily influenced by and contributes to neural network designs and models, including emergent.

Contents

Background

It is the default algorithm in emergent (successor of PDP++) when making a new project, and is extensively used in various simulations.

Hebbian learning is performed using conditional principal components analysis (CPCA) algorithm with correction factor for sparse expected activity levels.

Error-driven learning is performed using GeneRec, which is a generalization of the recirculation algorithm, and approximates Almeida–Pineda recurrent backpropagation. The symmetric, midpoint version of GeneRec is used, which is equivalent to the contrastive Hebbian learning algorithm (CHL). See O'Reilly (1996; Neural Computation) for more details.

The activation function is a point-neuron approximation with both discrete spiking and continuous rate-code output.

Layer or unit-group level inhibition can be computed directly using a k-winners-take-all (KWTA) function, producing sparse distributed representations.

The net input is computed as an average, not a sum, over connections, based on normalized, sigmoidally transformed weight values, which are subject to scaling on a connection-group level to alter relative contributions. Automatic scaling is performed to compensate for differences in expected activity level in the different projections.

Documentation about this algorithm can be found in the book "Computational Explorations in Cognitive Neuroscience: Understanding the Mind by Simulating the Brain" published by MIT press. [1] and in the Emergent Documentation

Overview of the leabra algorithm

The pseudocode for Leabra is given here, showing exactly how the pieces of the algorithm described in more detail in the subsequent sections fit together.

Iterate over minus and plus phases of settling for each event.  o At start of settling, for all units:    - Initialize all state variables (activation, v_m, etc).    - Apply external patterns (clamp input in minus, input & output in      plus).    - Compute net input scaling terms (constants, computed      here so network can be dynamically altered).    - Optimization: compute net input once from all static activations      (e.g., hard-clamped external inputs).  o During each cycle of settling, for all non-clamped units:    - Compute excitatory netinput (g_e(t), aka eta_j or net)       -- sender-based optimization by ignoring inactives.    - Compute kWTA inhibition for each layer, based on g_i^Q:      * Sort units into two groups based on g_i^Q: top k and        remaining k+1 -> n.      * If basic, find k and k+1th highest        If avg-based, compute avg of 1 -> k & k+1 -> n.      * Set inhibitory conductance g_i from g^Q_k and g^Q_k+1    - Compute point-neuron activation combining excitatory input and      inhibition  o After settling, for all units, record final settling activations    as either minus or plus phase (y^-_j or y^+_j). After both phases update the weights (based on linear current    weight values), for all connections:  o Compute error-driven weight changes with CHL with soft weight bounding  o Compute Hebbian weight changes with CPCA from plus-phase activations  o Compute net weight change as weighted sum of error-driven and Hebbian  o Increment the weights according to net weight change.

Implementations

Emergent is the original implementation of Leabra; its most recent implementation is written in Go. It was written chiefly by Dr. O'Reilly, but professional software engineers were recently hired to improve the existing codebase. This is the fastest implementation, suitable for constructing large networks. Although emergent has a graphical user interface, it is very complex and has a steep learning curve.

If you want to understand the algorithm in detail, it will be easier to read non-optimized code. For this purpose, check out the MATLAB version. There is also an R version available, that can be easily installed via install.packages("leabRa") in R and has a short introduction to how the package is used. The MATLAB and R versions are not suited for constructing very large networks, but they can be installed quickly and (with some programming background) are easy to use. Furthermore, they can also be adapted easily.

Special algorithms

Related Research Articles

<span class="mw-page-title-main">Neural network (machine learning)</span> Computational model used in machine learning, based on connected, hierarchical functions

In machine learning, a neural network is a model inspired by the structure and function of biological neural networks in animal brains.

Unsupervised learning is a framework in machine learning where, in contrast to supervised learning, algorithms learn patterns exclusively from unlabeled data. Other frameworks in the spectrum of supervisions include weak- or semi-supervision, where a small portion of the data is tagged, and self-supervision. Some researchers consider self-supervised learning a form of unsupervised learning.

<span class="mw-page-title-main">Artificial neuron</span> Mathematical function conceived as a crude model

An artificial neuron is a mathematical function conceived as a model of biological neurons in a neural network. Artificial neurons are the elementary units of artificial neural networks. The artificial neuron is a function that receives one or more inputs, applies weights to these inputs, and sums them to produce an output.

Hebbian theory is a neuropsychological theory claiming that an increase in synaptic efficacy arises from a presynaptic cell's repeated and persistent stimulation of a postsynaptic cell. It is an attempt to explain synaptic plasticity, the adaptation of brain neurons during the learning process. It was introduced by Donald Hebb in his 1949 book The Organization of Behavior. The theory is also called Hebb's rule, Hebb's postulate, and cell assembly theory. Hebb states it as follows:

Let us assume that the persistence or repetition of a reverberatory activity tends to induce lasting cellular changes that add to its stability. ... When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A’s efficiency, as one of the cells firing B, is increased.

Spike-timing-dependent plasticity (STDP) is a biological process that adjusts the strength of connections between neurons in the brain. The process adjusts the connection strengths based on the relative timing of a particular neuron's output and input action potentials. The STDP process partially explains the activity-dependent development of nervous systems, especially with regard to long-term potentiation and long-term depression.

In machine learning, backpropagation is a gradient estimation method commonly used for training neural networks to compute the network parameter updates.

Recurrent neural networks (RNNs) are a class of artificial neural networks for sequential data processing. Unlike feedforward neural networks, which process data in a single pass, RNNs process data across multiple time steps, making them well-adapted for modelling and processing text, speech, and time series.

<span class="mw-page-title-main">ADALINE</span> Early single-layer artificial neural network

ADALINE is an early single-layer artificial neural network and the name of the physical device that implemented this network. It was developed by professor Bernard Widrow and his doctoral student Ted Hoff at Stanford University in 1960. It is based on the perceptron. It consists of a weight, a bias and a summation function. The weights and biases were implemented by rheostats, and later, memistors.

Oja's learning rule, or simply Oja's rule, named after Finnish computer scientist Erkki Oja, is a model of how neurons in the brain or in artificial neural networks change connection strength, or learn, over time. It is a modification of the standard Hebb's Rule that, through multiplicative normalization, solves all stability problems and generates an algorithm for principal components analysis. This is a computational form of an effect which is believed to happen in biological neurons.

Neural cryptography is a branch of cryptography dedicated to analyzing the application of stochastic algorithms, especially artificial neural network algorithms, for use in encryption and cryptanalysis.

The generalized Hebbian algorithm (GHA), also known in the literature as Sanger's rule, is a linear feedforward neural network for unsupervised learning with applications primarily in principal components analysis. First defined in 1989, it is similar to Oja's rule in its formulation and stability, except it can be applied to networks with multiple outputs. The name originates because of the similarity between the algorithm and a hypothesis made by Donald Hebb about the way in which synaptic strengths in the brain are modified in response to experience, i.e., that changes are proportional to the correlation between the firing of pre- and post-synaptic neurons.

Competitive learning is a form of unsupervised learning in artificial neural networks, in which nodes compete for the right to respond to a subset of the input data. A variant of Hebbian learning, competitive learning works by increasing the specialization of each node in the network. It is well suited to finding clusters within data.

GeneRec is a generalization of the recirculation algorithm, and approximates Almeida-Pineda recurrent backpropagation. It is used as part of the Leabra algorithm for error-driven learning.

Error-driven learning is a type of reinforcement learning method. This method tweaks a model’s parameters based on the difference between the proposed and actual results. These models stand out as they depend on environmental feedback instead of explicit labels or categories. They are based on the idea that language acquisition involves the minimization of the prediction error (MPSE). By leveraging these prediction errors, the models consistently refine expectations and decrease computational complexity. Typically, these algorithms are operated by the GeneRec algorithm.

Prefrontal cortex basal ganglia working memory (PBWM) is an algorithm that models working memory in the prefrontal cortex and the basal ganglia.

There are many types of artificial neural networks (ANN).

A Bayesian Confidence Propagation Neural Network (BCPNN) is an artificial neural network inspired by Bayes' theorem, which regards neural computation and processing as probabilistic inference. Neural unit activations represent probability ("confidence") in the presence of input features or categories, synaptic weights are based on estimated correlations and the spread of activation corresponds to calculating posterior probabilities. It was originally proposed by Anders Lansner and Örjan Ekeberg at KTH Royal Institute of Technology. This probabilistic neural network model can also be run in generative mode to produce spontaneous activations and temporal sequences.

An artificial neural network's learning rule or learning process is a method, mathematical logic or algorithm which improves the network's performance and/or training time. Usually, this rule is applied repeatedly over the network. It is done by updating the weights and bias levels of a network when a network is simulated in a specific data environment. A learning rule may accept existing conditions of the network and will compare the expected result and actual result of the network to give new and improved values for weights and bias. Depending on the complexity of actual model being simulated, the learning rule of the network can be as simple as an XOR gate or mean squared error, or as complex as the result of a system of differential equations.

A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns features by itself via filter optimization. This type of deep learning network has been applied to process and make predictions from many different types of data including text, images and audio. Convolution-based networks are the de-facto standard in deep learning-based approaches to computer vision and image processing, and have only recently have been replaced -- in some cases -- by more recent deep learning architectures such as the transformer. Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, for each neuron in the fully-connected layer, 10,000 weights would be required for processing an image sized 100 × 100 pixels. However, applying cascaded convolution kernels, only 25 neurons are required to process 5x5-sized tiles. Higher-layer features are extracted from wider context windows, compared to lower-layer features.

An artificial neural network (ANN) combines biological principles with advanced statistics to solve problems in domains such as pattern recognition and game-play. ANNs adopt the basic model of neuron analogues connected to each other in a variety of ways.

References

  1. O'Reilly, R. C., Munakata, Y. (2000). Computational explorations in cognitive neuroscience. Cambridge: MIT Press. ISBN   0-19-510491-9.{{cite book}}: CS1 maint: multiple names: authors list (link)