Cerebellar model articulation controller

Figure: A block diagram of the CMAC system for a single joint. The vector S is presented as input to all joints. Each joint separately computes an S -> A* mapping and a joint actuator signal p_i. The adjustable weights for all joints may reside in the same physical memory.

The cerebellar model arithmetic computer (CMAC) is a type of neural network based on a model of the mammalian cerebellum. It is also known as the cerebellar model articulation controller. It is a type of associative memory.[2]

The CMAC was first proposed as a function modeler for robotic controllers by James Albus in 1975[1] (hence the name), but it has also been used extensively in reinforcement learning and for automated classification in the machine learning community. The CMAC is an extension of the perceptron model. It computes a function f(x_1, ..., x_n), where n is the number of input dimensions. The input space is divided up into hyper-rectangles, each of which is associated with a memory cell. The contents of the memory cells are the weights, which are adjusted during training. Usually, more than one quantisation of input space is used, so that any point in input space is associated with a number of hyper-rectangles, and therefore with a number of memory cells. The output of a CMAC is the algebraic sum of the weights in all the memory cells activated by the input point.

A change of value of the input point results in a change in the set of activated hyper-rectangles, and therefore a change in the set of memory cells participating in the CMAC output. The CMAC output is therefore stored in a distributed fashion, such that the output corresponding to any point in input space is derived from the value stored in a number of memory cells (hence the name associative memory). This provides generalisation.
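
As a concrete illustration of this addressing scheme, the following sketch (in Python, assuming NumPy) implements a small CMAC over inputs in [0, 1)^2 with several offset tilings; the class name, number of tilings, and grid resolution are arbitrary illustrative choices rather than part of Albus's original formulation.

    import numpy as np

    class CMAC:
        """Minimal CMAC sketch: several overlapping grids (tilings) over the input space."""

        def __init__(self, n_tilings=4, tiles_per_dim=8, n_dims=2):
            self.n_tilings = n_tilings
            self.tiles_per_dim = tiles_per_dim
            self.n_dims = n_dims
            # One weight per cell of each grid; all weights start at zero.
            self.weights = np.zeros((n_tilings,) + (tiles_per_dim,) * n_dims)

        def active_cells(self, x):
            """Return the cell activated in each tiling for an input x in [0, 1)^n_dims."""
            cells = []
            for t in range(self.n_tilings):
                # Each tiling is shifted by a fraction of a cell width, so the
                # overlapping grids quantise the input space differently.
                offset = t / (self.n_tilings * self.tiles_per_dim)
                idx = tuple(min(int((xi + offset) * self.tiles_per_dim),
                                self.tiles_per_dim - 1) for xi in x)
                cells.append((t,) + idx)
            return cells

        def predict(self, x):
            # CMAC output: the algebraic sum of the weights in all activated cells.
            return sum(self.weights[c] for c in self.active_cells(x))

Two nearby inputs share most of their activated cells, so adjusting the weights for one of them also changes the output for the other, which is the generalisation described above.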

Building blocks

Figure: CMAC, represented as a 2D space.

In the image above, the CMAC has two inputs, so its input space is represented as a 2D plane. Two quantising functions have been used to divide this space with two overlapping grids (one shown in heavier lines). A single input point is shown near the middle, and it has activated two memory cells, corresponding to the shaded area. If another point occurs close to the one shown, it will share some of the same memory cells, providing generalisation.

The CMAC is trained by presenting pairs of input points and output values, and adjusting the weights in the activated cells by a proportion of the error observed at the output. This simple training algorithm has a proof of convergence.[3]
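
Continuing the sketch above, one training step of this simple algorithm might look as follows; spreading an equal share of the error over the activated cells, scaled by a learning rate, is one common implementation choice rather than a prescribed rule.

    def cmac_train(cmac, x, target, lr=0.1):
        """One training step: correct each activated cell by a share of the output error."""
        cells = cmac.active_cells(x)
        error = target - cmac.predict(x)
        for c in cells:
            cmac.weights[c] += lr * error / len(cells)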

It is normal to add a kernel function to the hyper-rectangle, so that points falling towards the edge of a hyper-rectangle have a smaller activation than those falling near the centre.[4]
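
For example, a triangular kernel over each cell (one possible choice, not prescribed by [4]) grades the activation by the distance from the cell centre:

    def triangular_activation(x, centre, half_width):
        """Graded activation: 1 at the cell centre, falling linearly to 0 at the cell edge."""
        return max(0.0, 1.0 - abs(x - centre) / half_width)

With a graded activation, the CMAC output becomes a sum of activation-weighted cell weights rather than a plain sum over the activated cells.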

One of the major problems cited in practical use of CMAC is the memory size required, which is directly related to the number of cells used. This is usually ameliorated by using a hash function, and only providing memory storage for the actual cells that are activated by inputs.
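
A minimal sketch of this idea: hash the conceptual cell coordinates into a fixed-size weight table, so that storage grows with the table size rather than with the number of conceptual cells. The hash function and table size below are arbitrary illustrative choices.

    import zlib

    def hashed_cell_index(cell, table_size=4096):
        """Map a conceptual cell coordinate (e.g. a tuple of grid indices) to a table slot."""
        return zlib.crc32(repr(cell).encode()) % table_size

Distinct cells may collide in the same slot; in practice the resulting noise is tolerated or reduced by enlarging the table.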

One-step convergent algorithm

Initially, the least mean squares (LMS) method was employed to update the weights of the CMAC. Convergence of LMS training for the CMAC is sensitive to the learning rate, and training can diverge. In 2004,[5] a recursive least squares (RLS) algorithm was introduced to train the CMAC online. It does not require tuning a learning rate, its convergence has been proved theoretically, and it can be guaranteed to converge in one step. The computational complexity of this RLS algorithm is O(N^3).
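
For reference, the textbook recursive least squares update for a model that is linear in its weights, applied here to the CMAC's activation vector, has the following form; this is a generic RLS sketch rather than the exact algorithm of [5].

    import numpy as np

    def rls_step(w, P, a, d, lam=1.0):
        """One RLS update for the linear-in-the-weights model y = a @ w.

        w: weight vector; P: running estimate of the inverse input correlation matrix;
        a: activation vector (1 for each cell hit by the input, 0 elsewhere);
        d: desired output; lam: forgetting factor (1.0 keeps all history)."""
        Pa = P @ a
        k = Pa / (lam + a @ Pa)            # gain vector
        e = d - a @ w                      # a-priori output error
        w = w + k * e                      # one-step weight correction
        P = (P - np.outer(k, Pa)) / lam    # update the inverse correlation estimate
        return w, P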

Figure: Parallel pipeline structure of the CMAC neural network.
Figure: Left panel: real functions; right panel: CMAC approximation with derivatives.

Hardware implementation infrastructure

Based on QR decomposition, the algorithm has been further simplified (QRLS) to O(N) complexity, which significantly reduces memory usage and time cost. A parallel pipeline array structure for implementing this algorithm has been introduced.[6]

Overall, by utilizing the QRLS algorithm, convergence of the CMAC neural network can be guaranteed, and the weights of the nodes can be updated in one step of training. Its parallel pipeline array structure gives it great potential for hardware implementation in large-scale industrial use.

Continuous CMAC

Since the rectangular shape of the CMAC receptive field functions produces a discontinuous, staircase-like function approximation, integrating CMAC with B-spline functions gives the continuous CMAC the capability of obtaining derivatives of any order of the approximated functions.
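
A minimal sketch of the ingredient that makes this possible: a uniform quadratic B-spline basis function and its analytic first derivative, which can replace the rectangular (binary) cell activation. The spline order and support below are illustrative choices, not those of any particular continuous-CMAC formulation.

    def quadratic_bspline(u):
        """Uniform quadratic B-spline basis with support [0, 3)."""
        if 0.0 <= u < 1.0:
            return 0.5 * u * u
        if 1.0 <= u < 2.0:
            return 0.5 * (-2.0 * u * u + 6.0 * u - 3.0)
        if 2.0 <= u < 3.0:
            return 0.5 * (3.0 - u) ** 2
        return 0.0

    def quadratic_bspline_deriv(u):
        """Analytic first derivative of the basis above (piecewise linear, continuous)."""
        if 0.0 <= u < 1.0:
            return u
        if 1.0 <= u < 2.0:
            return 3.0 - 2.0 * u
        if 2.0 <= u < 3.0:
            return u - 3.0
        return 0.0

Because the basis functions are differentiable, differentiating the weighted sum of activations gives the derivative of the approximated function directly.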

Deep CMAC

In recent years, numerous studies have confirmed that by stacking several shallow structures into a single deep structure, the overall system can achieve better data representation and thus deal more effectively with nonlinear, highly complex tasks. In 2018,[7] a deep CMAC (DCMAC) framework was proposed and a backpropagation algorithm was derived to estimate the DCMAC parameters. Experimental results on an adaptive noise cancellation task showed that the proposed DCMAC achieves better noise cancellation performance than the conventional single-layer CMAC.

Summary

Scalability: straightforward to extend to millions of neurons or beyond
Convergence: training can always converge in one step
Function derivatives: straightforward to obtain by employing B-spline interpolation
Hardware structure: parallel pipeline structure
Memory usage: linear with respect to the number of neurons
Computational complexity: O(N)

References

  1. Albus, J. S. (1 September 1975). "A New Approach to Manipulator Control: The Cerebellar Model Articulation Controller (CMAC)". Journal of Dynamic Systems, Measurement, and Control. 97 (3): 220–227. doi:10.1115/1.3426922. ISSN 0022-0434.
  2. Albus, James S. (August 1979). "Mechanisms of planning and problem solving in the brain". Mathematical Biosciences. 45 (3–4): 247–293. doi:10.1016/0025-5564(79)90063-4.
  3. Wong, Y.; Sideris, A. (January 1992). "Learning convergence in the cerebellar model articulation controller". IEEE Transactions on Neural Networks. 3 (1): 115–121. doi:10.1109/72.105424. PMID 18276412.
  4. An, P. C. E.; Miller, W. T.; Parks, P. C. (1991). "Design Improvements in Associative Memories for Cerebellar Model Articulation Controllers". Proc. ICANN: 1207–1210.
  5. Qin, Ting; Chen, Zonghai; Zhang, Haitao; Li, Sifu; Xiang, Wei; Li, Ming (1 February 2004). "A Learning Algorithm of CMAC Based on RLS". Neural Processing Letters. 19 (1): 49–61. doi:10.1023/B:NEPL.0000016847.18175.60. ISSN 1573-773X.
  6. Qin, Ting; Zhang, Haitao; Chen, Zonghai; Xiang, Wei (1 August 2005). "Continuous CMAC-QRLS and Its Systolic Array". Neural Processing Letters. 22 (1): 1–16. doi:10.1007/s11063-004-2694-0. ISSN 1573-773X.
  7. Tsao, Yu; Chu, Hao-Chun; Fang, Shih-Hau; Lee, Junghsi; Lin, Chih-Min (2018). "Adaptive Noise Cancellation Using Deep Cerebellar Model Articulation Controller". IEEE Access. 6: 37395–37402. arXiv:1705.00945. doi:10.1109/ACCESS.2018.2827699. ISSN 2169-3536.
