Probabilistic neural network

A probabilistic neural network (PNN) [1] is a feedforward neural network that is widely used in classification and pattern recognition problems. In the PNN algorithm, the parent probability density function (PDF) of each class is approximated non-parametrically with a Parzen window. Then, using the PDF of each class, the class probability of a new input is estimated, and Bayes' rule is employed to assign the new input to the class with the highest posterior probability. By this method, the probability of misclassification is minimized. [2] This type of artificial neural network (ANN) was derived from the Bayesian network [3] and a statistical algorithm called kernel Fisher discriminant analysis. [4] It was introduced by D. F. Specht in 1966. [5] [6] In a PNN, the operations are organized into a multilayered feedforward network with four layers: an input layer, a pattern layer, a summation layer, and an output layer.

Layers

PNN is often used in classification problems. [7] When an input is presented, the first layer computes the distance from the input vector to each of the training input vectors. This produces a vector whose elements indicate how close the input is to each training input. The second layer sums the contributions for each class of inputs and produces its net output as a vector of probabilities. Finally, a compete transfer function on the output of the second layer picks the maximum of these probabilities and produces a 1 (positive identification) for that class and a 0 (negative identification) for the non-targeted classes.
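In formula form, the two computational steps above can be sketched as follows. This assumes a Gaussian kernel with a single smoothing parameter \sigma, which is one common choice rather than the only one. The symbols n_k (number of training cases in class k), \mathbf{x}_{k,i} (the i-th training input of class k), d (number of predictor variables) and p_k (prior probability of class k) are notation introduced here for illustration.

```latex
\hat{f}_k(\mathbf{x})
  = \frac{1}{n_k\,(2\pi)^{d/2}\,\sigma^{d}}
    \sum_{i=1}^{n_k}
    \exp\!\left(-\frac{\lVert \mathbf{x}-\mathbf{x}_{k,i}\rVert^{2}}{2\sigma^{2}}\right),
\qquad
\hat{y}(\mathbf{x}) = \arg\max_{k}\; p_k\,\hat{f}_k(\mathbf{x})
```

When the priors p_k and the class sizes n_k are equal across classes, the normalizing constants cancel and the decision reduces to comparing the per-class sums of kernel values.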

Input layer

Each neuron in the input layer represents a predictor variable. For a categorical variable with N categories, N-1 neurons are used. The input layer standardizes the range of the values by subtracting the median and dividing by the interquartile range. The input neurons then feed the values to each of the neurons in the hidden (pattern) layer.
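The preprocessing described for the input layer can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names `robust_scale` and `dummy_encode` are hypothetical, the quartiles are computed with the "inclusive" method of the standard library (other quartile conventions give slightly different values), and treating the last category as the all-zero baseline is an assumption.

```python
from statistics import median, quantiles


def robust_scale(column):
    """Standardize one numeric predictor: subtract the median, divide by the IQR."""
    # Quartile convention is an assumption; "inclusive" treats the data as a full population.
    q1, _, q3 = quantiles(column, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero; column cannot be scaled")
    med = median(column)
    return [(value - med) / iqr for value in column]


def dummy_encode(value, categories):
    """Encode a categorical value with N-1 inputs.

    The last entry of `categories` is the baseline and maps to all zeros
    (an assumption made for this sketch).
    """
    return [1.0 if value == category else 0.0 for category in categories[:-1]]


ages = [20.0, 30.0, 40.0, 50.0, 60.0]
scaled = robust_scale(ages)            # median 40, IQR 20
colours = ["red", "green", "blue"]     # N = 3 categories -> 2 input neurons
encoded = dummy_encode("green", colours)
```

Scaling by the median and interquartile range rather than the mean and standard deviation keeps a few extreme values from dominating the distances computed in the next layer.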

Pattern layer

This layer contains one neuron for each case in the training data set. Each neuron stores the values of the predictor variables for its case along with the target value. A pattern (hidden) neuron computes the Euclidean distance of the test case from the neuron's center point and then applies the radial basis function kernel using the sigma values.
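As a rough sketch of this layer, the following Python function returns one activation per stored training case. It assumes a Gaussian radial basis function and a single `sigma` shared by all neurons; the text above mentions sigma values in the plural, so per-variable or per-class values are equally plausible. The name `pattern_layer` is hypothetical.

```python
import math


def pattern_layer(x, training_inputs, sigma):
    """Return one Gaussian RBF activation per stored training case."""
    activations = []
    for center in training_inputs:
        # Squared Euclidean distance between the test case and this neuron's center.
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, center))
        # Gaussian kernel (assumed form): 1.0 at the center, decaying with distance.
        activations.append(math.exp(-sq_dist / (2.0 * sigma ** 2)))
    return activations


training_inputs = [[0.0, 0.0], [1.0, 0.0], [3.0, 4.0]]
acts = pattern_layer([0.0, 0.0], training_inputs, sigma=1.0)
```

A test case that coincides with a stored case yields an activation of exactly 1, and activations shrink toward 0 as the distance grows; a smaller `sigma` makes that decay sharper.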

Summation layer

For a PNN, there is one summation neuron for each category of the target variable. The actual target category of each training case is stored with each hidden neuron; the weighted value coming out of a hidden neuron is fed only to the summation neuron that corresponds to the hidden neuron's category. Each summation neuron adds the values for the class it represents.
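The routing and accumulation step can be illustrated with a short Python sketch. The function name `summation_layer` and the optional `normalize` flag are assumptions added for illustration: plain sums follow the description above, while dividing by the class size is a common variant that avoids favouring classes with more training cases.

```python
from collections import defaultdict


def summation_layer(activations, targets, normalize=False):
    """Accumulate hidden-neuron activations per target category."""
    totals = defaultdict(float)
    counts = defaultdict(int)
    for activation, target in zip(activations, targets):
        # Each activation goes only to the neuron for its own category.
        totals[target] += activation
        counts[target] += 1
    if normalize:
        # Optional variant (assumption): average instead of sum.
        return {category: totals[category] / counts[category] for category in totals}
    return dict(totals)


activations = [1.0, 0.5, 0.25]   # one value per training case
targets = ["a", "a", "b"]        # stored target category of each case
sums = summation_layer(activations, targets)
means = summation_layer(activations, targets, normalize=True)
```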

Output layer

The output layer compares the weighted votes for each target category accumulated in the summation layer and uses the largest vote to predict the target category.
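A minimal sketch of this decision step in Python is shown below. The name `output_layer` is hypothetical, and the tie-breaking behaviour (the first category encountered wins) is simply what Python's `max` does, not something the description above specifies.

```python
def output_layer(class_scores):
    """Pick the category with the largest accumulated vote.

    Returns the winning label and a 1/0 indicator per category.
    """
    # max() keeps the first label in case of a tie (an artifact of this sketch).
    winner = max(class_scores, key=class_scores.get)
    indicator = {label: int(label == winner) for label in class_scores}
    return winner, indicator


winner, indicator = output_layer({"a": 1.5, "b": 0.25})
```

The indicator dictionary marks the predicted category with 1 and every other category with 0.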

Advantages

There are several advantages and disadvantages to using a PNN instead of a multilayer perceptron. [8]

Disadvantages

Applications based on PNN

References

  1. Mohebali, Behshad; Tahmassebi, Amirhessam; Meyer-Baese, Anke; Gandomi, Amir H. (2020). Probabilistic neural networks: a brief overview of theory, implementation, and application. Elsevier. pp. 347–367. doi:10.1016/B978-0-12-816514-0.00014-X. S2CID 208119250.
  2. Zeinali, Yasha; Story, Brett A. (2017). "Competitive probabilistic neural network". Integrated Computer-Aided Engineering. 24 (2): 105–118. doi:10.3233/ICA-170540.
  3. "Probabilistic Neural Networks". Archived from the original on 2010-12-18. Retrieved 2012-03-22.
  4. "Archived copy" (PDF). Archived from the original (PDF) on 2012-01-31. Retrieved 2012-03-22.
  5. Specht, D. F. (1967-06-01). "Generation of Polynomial Discriminant Functions for Pattern Recognition". IEEE Transactions on Electronic Computers. EC-16 (3): 308–319. doi:10.1109/PGEC.1967.264667. ISSN 0367-7508.
  6. Specht, D. F. (1990). "Probabilistic neural networks". Neural Networks. 3: 109–118. doi:10.1016/0893-6080(90)90049-Q.
  7. "Probabilistic Neural Networks :: Radial Basis Networks (Neural Network Toolbox™)". www.mathworks.in. Archived from the original on 4 August 2012. Retrieved 6 June 2022.
  8. "Probabilistic and General Regression Neural Networks". Archived from the original on 2012-03-02. Retrieved 2012-03-22.
  9. Tran, D. H.; Ng, A. W. M.; Perera, B. J. C.; Burn, S.; Davis, P. (September 2006). "Application of probabilistic neural networks in modelling structural deterioration of stormwater pipes" (PDF). Urban Water Journal. 3 (3): 175–184. doi:10.1080/15730620600961684. S2CID 15220500. Archived from the original (PDF) on 8 August 2017. Retrieved 27 February 2023.
  10. Li, Q. B.; Li, X.; Zhang, G. J.; Xu, Y. Z.; Wu, J. G.; Sun, X. J. (2009). "[Application of probabilistic neural networks method to gastric endoscope samples diagnosis based on FTIR spectroscopy]". Guang Pu Xue Yu Guang Pu Fen Xi. 29 (6): 1553–7. PMID 19810529.
  11. Berno, E.; Brambilla, L.; Canaparo, R.; Casale, F.; Costa, M.; Della Pepa, C.; Eandi, M.; Pasero, E. (2003). "Application of probabilistic neural networks to population pharmacokinetics". Proceedings of the International Joint Conference on Neural Networks, 2003. pp. 2637–2642. doi:10.1109/IJCNN.2003.1223983. ISBN 0-7803-7898-9. S2CID 60477107.
  12. Huang, Chenn-Jung; Liao, Wei-Chen (2004). "Application of Probabilistic Neural Networks to the Class Prediction of Leukemia and Embryonal Tumor of Central Nervous System". Neural Processing Letters. 19 (3): 211–226. doi:10.1023/B:NEPL.0000035613.51734.48. S2CID 5651402.
  13. Araghi, Leila Fallah; Khaloozade, Hamid; Arvan, Mohammad Reza (19 March 2009). "Ship Identification Using Probabilistic Neural Networks (PNN)" (PDF). Proceedings of the International MultiConference of Engineers and Computer Scientists. Hong Kong, China. 2. Retrieved 27 February 2023.
  14. "Archived copy" (PDF). Archived from the original (PDF) on 2010-06-14. Retrieved 2012-03-22.
  15. Zhang, Y. (2009). "Remote-sensing Image Classification Based on an Improved Probabilistic Neural Network". Sensors. 9 (9): 7516–7539. Bibcode:2009Senso...9.7516Z. doi:10.3390/s90907516. PMC 3290485. PMID 22400006.