Winner-take-all (computing)

Last updated February 01, 2023

Winner-take-all is a computational principle used in models of neural networks in which neurons compete with each other for activation. In the classical form, only the neuron with the highest activation stays active while all other neurons shut down. Other variations allow more than one neuron to be active, for example the soft winner-take-all, in which a power function is applied to the neurons' activations.
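As a rough illustration of the two variants, the following Python sketch contrasts the classical rule with a soft variant. The function names hard_wta and soft_wta, the default exponent, and the normalization by the sum of the powered activations are assumptions made for this example; they are not taken from a specific published model. Activations are assumed to be non-negative.

```python
def hard_wta(activations):
    """Classical winner-take-all: keep the largest activation, zero the rest."""
    # Index of the largest activation (the first one wins in case of a tie).
    winner = max(range(len(activations)), key=lambda i: activations[i])
    return [a if i == winner else 0.0 for i, a in enumerate(activations)]


def soft_wta(activations, power=4.0):
    """Soft variant (assumed form): raise to a power, then normalize to sum 1."""
    if any(a < 0 for a in activations):
        raise ValueError("this sketch assumes non-negative activations")
    powered = [a ** power for a in activations]
    total = sum(powered)
    if total == 0:
        # No unit is active, so there is nothing to sharpen.
        return [0.0] * len(activations)
    return [p / total for p in powered]
```

A larger exponent pushes the soft variant closer to the classical rule, since the largest activation then dominates the normalized sum.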

Neural networks

In the theory of artificial neural networks, winner-take-all networks are a case of competitive learning in recurrent neural networks. Output nodes in the network mutually inhibit each other, while simultaneously activating themselves through reflexive connections. After some time, only one node in the output layer will be active, namely the one corresponding to the strongest input. Thus the network uses nonlinear inhibition to pick out the largest of a set of inputs. Winner-take-all is a general computational primitive that can be implemented using different types of neural network models, including both continuous-time and spiking networks.^[1]^[2]
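The competition described above can be sketched with a small discrete-time simulation. This is a minimal illustration under stated assumptions, not a reproduction of any cited model: the rate equation, the rectification at zero, the Euler step, and all parameter names and values (self_excitation, inhibition, dt, steps) are chosen for this example only.

```python
def simulate_wta(inputs, self_excitation=0.5, inhibition=1.0, dt=0.1, steps=500):
    """Euler simulation of rectified units with self-excitation and mutual inhibition.

    Assumed dynamics for unit i:
        dx_i/dt = -x_i + input_i + self_excitation * x_i - inhibition * sum_{j != i} x_j
    with activity clipped at zero after every step.
    """
    x = [0.0] * len(inputs)
    for _ in range(steps):
        total = sum(x)  # computed from the previous state (synchronous update)
        x = [
            max(0.0, xi + dt * (-xi + inp + self_excitation * xi
                                - inhibition * (total - xi)))
            for xi, inp in zip(x, inputs)
        ]
    return x
```

With these assumed parameters and distinct inputs such as [0.5, 0.9, 0.7], the unit receiving the strongest input should remain active while the others are driven to zero. The inhibition has to be stronger than 1 - self_excitation for the competition to resolve to a single unit in this simple model.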

Winner-take-all networks are commonly used in computational models of the brain, particularly for distributed decision-making or action selection in the cortex. Important examples include hierarchical models of vision,^[3] and models of selective attention and recognition.^[4]^[5] They are also common in artificial neural networks and neuromorphic analog VLSI circuits. It has been formally proven that the winner-take-all operation is computationally powerful compared to other nonlinear operations, such as thresholding.^[6]

In many practical cases, not just a single neuron but exactly k neurons become active, for a fixed number k. This principle is referred to as k-winners-take-all.
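A minimal sketch of the k-winners-take-all rule, written as a plain selection over a list of activations. The function name and the tie-breaking behaviour (earlier units win ties) are assumptions of this example.

```python
def k_winners_take_all(activations, k):
    """Keep the k largest activations and set all others to zero."""
    if not 0 <= k <= len(activations):
        raise ValueError("k must be between 0 and the number of units")
    # Rank unit indices by activation, largest first; the sort is stable,
    # so among equal activations the earlier unit is ranked first.
    ranked = sorted(range(len(activations)),
                    key=lambda i: activations[i], reverse=True)
    winners = set(ranked[:k])
    return [a if i in winners else 0.0 for i, a in enumerate(activations)]
```

Setting k = 1 recovers the classical winner-take-all rule.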

Circuit example

A simple but popular CMOS winner-take-all circuit is shown on the right. This circuit was originally proposed by Lazzaro et al. (1989)^[7] using MOS transistors biased to operate in the weak-inversion or subthreshold regime. In the particular case shown there are only two inputs (I_IN,1 and I_IN,2), but the circuit can be extended to multiple inputs in a straightforward way. It operates on continuous-time input signals (currents) in parallel, using only two transistors per input. In addition, the bias current I_BIAS is set by a single global transistor that is common to all the inputs.

The largest of the input currents sets the common potential V_C. As a result, the corresponding output carries almost all the bias current, while the other outputs have currents that are close to zero. Thus, the circuit selects the larger of the two input currents, i.e., if I_IN,1 > I_IN,2, we get I_OUT,1 = I_BIAS and I_OUT,2 = 0. Similarly, if I_IN,2 > I_IN,1, we get I_OUT,1 = 0 and I_OUT,2 = I_BIAS.

A SPICE-based DC simulation of the CMOS winner-take-all circuit in the two-input case is shown on the right. As shown in the top subplot, the input I_IN,1 was fixed at 6 nA, while I_IN,2 was linearly increased from 0 to 10 nA. The bottom subplot shows the two output currents. As expected, the output corresponding to the larger of the two inputs carries the entire bias current (10 nA in this case), forcing the other output current nearly to zero.

Other uses

In stereo matching algorithms, following the taxonomy proposed by Scharstein and Szeliski,^[8] winner-take-all is a local method for disparity computation. Adopting a winner-take-all strategy, the disparity associated with the minimum or maximum cost value is selected at each pixel.
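The per-pixel selection can be sketched as follows. The layout of the cost volume as nested lists indexed [row][column][disparity], the function name, and the choice of the minimum (appropriate when the values are dissimilarity costs) are assumptions of this example; for a similarity measure the maximum would be taken instead.

```python
def wta_disparity(cost_volume):
    """Pick, for every pixel, the disparity index with the lowest matching cost.

    cost_volume[y][x][d] is assumed to hold the cost of assigning
    disparity d to the pixel at row y, column x.
    """
    return [
        [min(range(len(costs)), key=costs.__getitem__) for costs in row]
        for row in cost_volume
    ]


# Hypothetical 2x2 image with three candidate disparities per pixel.
example_costs = [
    [[3, 1, 2], [0, 5, 5]],
    [[9, 8, 7], [4, 4, 1]],
]
disparity_map = wta_disparity(example_costs)  # [[1, 0], [2, 2]]
```

Because each pixel is decided independently of its neighbours, this is a local method; smoothness between neighbouring pixels is not enforced.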

It is axiomatic that in the electronic commerce market, early dominant players such as AOL or Yahoo! get most of the rewards. By 1998, one study found the top 5% of all web sites garnered more than 74% of all traffic.

The winner-take-all hypothesis in economics suggests that once a technology or a firm gets ahead, it will do better and better over time, whereas lagging technology and firms will fall further behind. See First-mover advantage.

References

[1] Grossberg, Stephen (1982). "Contour Enhancement, Short Term Memory, and Constancies in Reverberating Neural Networks". Studies of Mind and Brain. Boston Studies in the Philosophy of Science, vol. 70. Dordrecht: Springer Netherlands. pp. 332–378. doi:10.1007/978-94-009-7758-7_8. ISBN 978-90-277-1360-5.
[2] Oster, Matthias; Douglas, Rodney; Liu, Shih-Chii (2009). "Computation with Spikes in a Winner-Take-All Network". Neural Computation. 21 (9): 2437–2465. doi:10.1162/neco.2009.07-08-829. PMID 19548795. S2CID 7259946.
[3] Riesenhuber, Maximilian; Poggio, Tomaso (1999-11-01). "Hierarchical models of object recognition in cortex". Nature Neuroscience. 2 (11): 1019–1025. doi:10.1038/14819. ISSN 1097-6256. PMID 10526343. S2CID 8920227.
[4] Carpenter, Gail A. (1987). "A massively parallel architecture for a self-organizing neural pattern recognition machine". Computer Vision, Graphics, and Image Processing. 37 (1): 54–115. doi:10.1016/S0734-189X(87)80014-2.
[5] Itti, Laurent; Koch, Christof (1998). "A Model of Saliency-Based Visual Attention for Rapid Scene Analysis". IEEE Transactions on Pattern Analysis and Machine Intelligence. 20 (11): 1254–1259. doi:10.1109/34.730558.
[6] Maass, Wolfgang (2000-11-01). "On the Computational Power of Winner-Take-All". Neural Computation. 12 (11): 2519–2535. doi:10.1162/089976600300014827. ISSN 0899-7667. PMID 11110125. S2CID 10304135.
[7] Lazzaro, J.; Ryckebusch, S.; Mahowald, M. A.; Mead, C. A. (1988-01-01). "Winner-Take-All Networks of O(N) Complexity". Fort Belvoir, VA. doi:10.21236/ada451466.
[8] Scharstein, Daniel; Szeliski, Richard (2002). "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms". International Journal of Computer Vision. 47 (1/3): 7–42. doi:10.1023/A:1014573219977. S2CID 195859047.

