Self-organizing map

A self-organizing map (SOM) or self-organizing feature map (SOFM) is an unsupervised machine learning technique used to produce a low-dimensional (typically two-dimensional) representation of a higher-dimensional data set while preserving the topological structure of the data. For example, a data set with p variables measured in n observations could be represented as clusters of observations with similar values for the variables. These clusters could then be visualized as a two-dimensional "map" such that observations in proximal clusters have more similar values than observations in distal clusters. This can make high-dimensional data easier to visualize and analyze.

An SOM is a type of artificial neural network but is trained using competitive learning rather than the error-correction learning (e.g., backpropagation with gradient descent) used by other artificial neural networks. The SOM was introduced by the Finnish professor Teuvo Kohonen in the 1980s and therefore is sometimes called a Kohonen map or Kohonen network. [1] [2] The Kohonen map or network is a computationally convenient abstraction building on biological models of neural systems from the 1970s [3] and morphogenesis models dating back to Alan Turing in the 1950s. [4] SOMs create internal representations reminiscent of the cortical homunculus [ citation needed ], a distorted representation of the human body, based on a neurological "map" of the areas and proportions of the human brain dedicated to processing sensory functions, for different parts of the body.

A self-organizing map showing U.S. Congress voting patterns. The input data was a table with a row for each member of Congress, and columns for certain votes containing each member's yes/no/abstain vote. The SOM algorithm arranged these members in a two-dimensional grid placing similar members closer together. The first plot shows the grouping when the data are split into two clusters. The second plot shows average distance to neighbours: larger distances are darker. The third plot predicts Republican (red) or Democratic (blue) party membership. The other plots each overlay the resulting map with predicted values on an input dimension: red means a predicted 'yes' vote on that bill, blue means a 'no' vote. The plot was created in Synapse.

Overview

Self-organizing maps, like most artificial neural networks, operate in two modes: training and mapping. First, training uses an input data set (the "input space") to generate a lower-dimensional representation of the input data (the "map space"). Second, mapping classifies additional input data using the generated map.

In most cases, the goal of training is to represent an input space with p dimensions as a map space with two dimensions. Specifically, an input space with p variables is said to have p dimensions. A map space consists of components called "nodes" or "neurons", which are arranged as a hexagonal or rectangular grid with two dimensions. [5] The number of nodes and their arrangement are specified beforehand based on the larger goals of the analysis and exploration of the data.

Each node in the map space is associated with a "weight" vector, which is the position of the node in the input space. While nodes in the map space stay fixed, training consists in moving weight vectors toward the input data (reducing a distance metric such as Euclidean distance) without spoiling the topology induced from the map space. After training, the map can be used to classify additional observations for the input space by finding the node with the closest weight vector (smallest distance metric) to the input space vector.

Learning algorithm

The goal of learning in the self-organizing map is to cause different parts of the network to respond similarly to certain input patterns. This is partly motivated by how visual, auditory or other sensory information is handled in separate parts of the cerebral cortex in the human brain. [6]

An illustration of the training of a self-organizing map. The blue blob is the distribution of the training data, and the small white disc is the current training datum drawn from that distribution. At first (left) the SOM nodes are arbitrarily positioned in the data space. The node (highlighted in yellow) which is nearest to the training datum is selected. It is moved towards the training datum, as (to a lesser extent) are its neighbors on the grid. After many iterations the grid tends to approximate the data distribution (right).

The weights of the neurons are initialized either to small random values or sampled evenly from the subspace spanned by the two largest principal component eigenvectors. With the latter alternative, learning is much faster because the initial weights already give a good approximation of SOM weights. [7]

The network must be fed a large number of example vectors that represent, as closely as possible, the kinds of vectors expected during mapping. The examples are usually presented several times, over multiple iterations.

The training utilizes competitive learning. When a training example is fed to the network, its Euclidean distance to all weight vectors is computed. The neuron whose weight vector is most similar to the input is called the best matching unit (BMU). The weights of the BMU and neurons close to it in the SOM grid are adjusted towards the input vector. The magnitude of the change decreases with time and with the grid-distance from the BMU. The update formula for a neuron v with weight vector Wv(s) is

$$W_v(s+1) = W_v(s) + \theta(u, v, s) \cdot \alpha(s) \cdot \bigl(D(t) - W_v(s)\bigr),$$

where s is the step index, t is an index into the training sample, u is the index of the BMU for the input vector D(t), α(s) is a monotonically decreasing learning coefficient, and θ(u, v, s) is the neighborhood function, which weights the update according to the grid distance between neuron u and neuron v at step s. [8] Depending on the implementation, t can scan the training data set systematically (t is 0, 1, 2, ..., T-1, then repeat, T being the training sample's size), be randomly drawn from the data set (bootstrap sampling), or implement some other sampling method (such as jackknifing).

The neighborhood function θ(u, v, s) (also called function of lateral interaction) depends on the grid-distance between the BMU (neuron u) and neuron v. In the simplest form, it is 1 for all neurons close enough to BMU and 0 for others, but the Gaussian and Mexican-hat [9] functions are common choices, too. Regardless of the functional form, the neighborhood function shrinks with time. [6] At the beginning when the neighborhood is broad, the self-organizing takes place on the global scale. When the neighborhood has shrunk to just a couple of neurons, the weights are converging to local estimates. In some implementations, the learning coefficient α and the neighborhood function θ decrease steadily with increasing s, in others (in particular those where t scans the training data set) they decrease in step-wise fashion, once every T steps.
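
The following is a minimal sketch of two neighborhood functions and a single weight update, not a reference implementation. The function names, the exponential shrinking schedule in `decayed`, and the parameter values are assumptions chosen for illustration; the text above only requires that the neighborhood shrinks with time.

```python
import math

def gaussian_neighborhood(grid_dist, sigma):
    """Gaussian weight: 1 at the BMU, decaying smoothly with grid distance."""
    return math.exp(-(grid_dist ** 2) / (2.0 * sigma ** 2))

def bubble_neighborhood(grid_dist, radius):
    """Simplest form: 1 for nodes within `radius` of the BMU, 0 otherwise."""
    return 1.0 if grid_dist <= radius else 0.0

def decayed(start, end, s, total_steps):
    """Assumed schedule: exponential interpolation from `start` to `end`."""
    return start * (end / start) ** (s / total_steps)

def update_weight(w, d, theta, alpha):
    """One application of W_v(s+1) = W_v(s) + theta * alpha * (D(t) - W_v(s))."""
    return [wi + theta * alpha * (di - wi) for wi, di in zip(w, d)]

# A node two grid steps from the BMU is pulled strongly early in training
# (broad neighborhood) and barely at all late in training (narrow neighborhood).
total = 100
early = gaussian_neighborhood(2, decayed(3.0, 0.5, 0, total))
late = gaussian_neighborhood(2, decayed(3.0, 0.5, total, total))
print(early > late)
```

Both the neighborhood width and the learning coefficient can be passed through the same kind of schedule; a step-wise variant would simply hold the value constant within each pass over the data.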

Training process of SOM on a two-dimensional data set

This process is repeated for each input vector for a (usually large) number of cycles λ. The network winds up associating output nodes with groups or patterns in the input data set. If these patterns can be named, the names can be attached to the associated nodes in the trained net.

During mapping, there will be one single winning neuron: the neuron whose weight vector lies closest to the input vector. This can be simply determined by calculating the Euclidean distance between input vector and weight vector.
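
The mapping step can be sketched in a few lines. The weight vectors and node labels below are hypothetical values made up for illustration, and `find_bmu` is an assumed helper name.

```python
import math

def find_bmu(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(range(len(weights)), key=lambda v: math.dist(weights[v], x))

# Hypothetical trained map with three nodes in a 2-D input space.
weights = [[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]
labels = ["low", "mid", "high"]  # names attached to nodes after training

winner = find_bmu(weights, [0.9, 1.2])
print(winner, labels[winner])  # prints "1 mid"
```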

While representing input data as vectors has been emphasized in this article, any kind of object which can be represented digitally, which has an appropriate distance measure associated with it, and in which the necessary operations for training are possible can be used to construct a self-organizing map. This includes matrices, continuous functions or even other self-organizing maps.

Algorithm

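  1. Randomize the node weight vectors in a map
  2. For s = 0, 1, 2, ..., λ − 1
    1. Randomly pick an input vector D(t)
    2. Find the node in the map closest to the input vector. This node is the best matching unit (BMU). Denote it by u
    3. For each node v, update its vector by pulling it closer to the input vector: $W_v(s+1) = W_v(s) + \theta(u, v, s) \cdot \alpha(s) \cdot (D(t) - W_v(s))$

The variable names mean the following:

  - s is the current iteration
  - λ is the iteration limit
  - t is the index of the input vector drawn from the training data set D
  - D(t) is the input vector picked in the current iteration
  - v is the index of a node in the map
  - W_v is the current weight vector of node v
  - u is the index of the best matching unit (BMU) in the map
  - θ(u, v, s) is the neighborhood function
  - α(s) is the learning rate schedule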

The key design choices are the shape of the SOM, the neighbourhood function, and the learning rate schedule. The idea of the neighborhood function is to make it such that the BMU is updated the most, its immediate neighbors are updated a little less, and so on. The idea of the learning rate schedule is to make it so that the map updates are large at the start, and gradually stop updating.

For example, if we want to learn a SOM using a square grid, we can index it using (i, j) where both i, j ∈ {1, ..., N}. The neighborhood function can make it so that the BMU updates in full, the nearest neighbors update in half, and their neighbors update in half again, etc.:

$$\theta((i, j), (i', j'), s) = \frac{1}{2^{|i - i'| + |j - j'|}}$$

And we can use a simple linear learning rate schedule $\alpha(s) = 1 - s/\lambda$.

Notice in particular that the update rate does not depend on where the point is in the Euclidean space, only on where it is in the SOM itself. For example, the points (1, 1) and (1, 2) are close on the SOM, so they will always update in similar ways, even when they are far apart in the Euclidean space. In contrast, even if the points (1, 1) and (1, 100) end up overlapping each other (such as if the SOM looks like a folded towel), they still do not update in similar ways.
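
The square-grid example can be sketched as a short training loop. This is an illustrative sketch under stated assumptions rather than a definitive implementation: the halving neighborhood is taken to be 0.5 raised to the Manhattan grid distance from the BMU, the learning rate is the linear schedule 1 − s/λ, nodes are indexed from 0 instead of 1, and `train_som` and its parameters are hypothetical names.

```python
import random

def train_som(data, n, steps, seed=0):
    """Train an n x n SOM on `data` (a list of equal-length float lists)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Step 1: random initial weight vector for every grid node (i, j).
    weights = {(i, j): [rng.random() for _ in range(dim)]
               for i in range(n) for j in range(n)}
    for s in range(steps):
        alpha = 1.0 - s / steps              # linear learning-rate schedule
        x = rng.choice(data)                 # step 2.1: random input vector
        # Step 2.2: the BMU minimizes the squared Euclidean distance to x.
        bi, bj = min(weights, key=lambda node: sum(
            (w - xk) ** 2 for w, xk in zip(weights[node], x)))
        # Step 2.3: pull every node toward x; the pull depends only on the
        # node's grid distance to the BMU, not on its position in input space.
        for (i, j), w in weights.items():
            theta = 0.5 ** (abs(i - bi) + abs(j - bj))
            for k in range(dim):
                w[k] += theta * alpha * (x[k] - w[k])
    return weights

# Two small clusters in the unit square (made-up data).
data = [[0.1, 0.1], [0.15, 0.2], [0.9, 0.8], [0.85, 0.95]]
som = train_som(data, n=4, steps=500)
print(len(som))  # 16 nodes, one weight vector each
```

Because each update is a convex combination of the old weight and the input vector, weights that start inside the unit square stay inside it when the data do.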

Alternative algorithm

  1. Randomize the map's nodes' weight vectors
  2. Traverse each input vector in the input data set
    1. Traverse each node in the map
      1. Use the Euclidean distance formula to find the similarity between the input vector and the map's node's weight vector
      2. Track the node that produces the smallest distance (this node is the best matching unit, BMU)
    2. Update the nodes in the neighborhood of the BMU (including the BMU itself) by pulling them closer to the input vector
  3. Increase s and repeat from step 2 while s < λ

Initialization options

Selection of initial weights as good approximations of the final weights is a well-known problem for all iterative methods of artificial neural networks, including self-organizing maps. Kohonen originally proposed random initialization of weights. [10] (This approach is reflected by the algorithms described above.) More recently, principal component initialization, in which initial map weights are chosen from the space of the first principal components, has become popular because it makes the results exactly reproducible. [11]

Cartographical representation of a self-organizing map (U-Matrix) based on Wikipedia featured article data (word frequency). Distance is inversely proportional to similarity. The "mountains" are edges between clusters. The red lines are links between articles.

A careful comparison of random initialization to principal component initialization for a one-dimensional map, however, found that the advantages of principal component initialization are not universal. The best initialization method depends on the geometry of the specific dataset. Principal component initialization was preferable (for a one-dimensional map) when the principal curve approximating the dataset could be univalently and linearly projected on the first principal component (quasilinear sets). For nonlinear datasets, however, random initialization performed better. [12]

Interpretation

One-dimensional SOM versus principal component analysis (PCA) for data approximation. SOM is a red broken line with squares, 20 nodes. The first principal component is presented by a blue line. Data points are the small grey circles. For PCA, the fraction of variance unexplained in this example is 23.23%, for SOM it is 6.86%.

There are two ways to interpret a SOM. Because in the training phase weights of the whole neighborhood are moved in the same direction, similar items tend to excite adjacent neurons. Therefore, SOM forms a semantic map where similar samples are mapped close together and dissimilar ones apart. This may be visualized by a U-Matrix (Euclidean distance between weight vectors of neighboring cells) of the SOM. [14] [15] [16]
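
A U-Matrix of this kind can be sketched as follows. The sketch assumes a rectangular grid with 4-connected neighbors and stores weights in a dictionary keyed by (row, column); both choices, and the name `u_matrix`, are assumptions for illustration.

```python
import math

def u_matrix(weights, rows, cols):
    """Mean Euclidean distance from each node's weight vector to those of
    its grid neighbors; large values mark boundaries between clusters."""
    u = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            dists = [math.dist(weights[(r, c)], weights[(r + dr, c + dc)])
                     for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
                     if 0 <= r + dr < rows and 0 <= c + dc < cols]
            # A 1 x 1 map has no neighbors; report 0 in that case.
            u[r][c] = sum(dists) / len(dists) if dists else 0.0
    return u

# Hypothetical 1 x 3 map: two similar nodes and one distant node.
weights = {(0, 0): [0.0, 0.0], (0, 1): [0.0, 1.0], (0, 2): [3.0, 5.0]}
print(u_matrix(weights, 1, 3))  # [[1.0, 3.0, 5.0]]
```

The middle node averages its distance to both neighbors (1 and 5), while the last node sits across a "mountain" from the rest of the map.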

The other way is to think of neuronal weights as pointers to the input space. They form a discrete approximation of the distribution of training samples. More neurons point to regions with high training sample concentration and fewer where the samples are scarce.
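
One way to inspect this second interpretation is to count how many samples each node wins and how far samples lie from their winning node. The sketch below uses made-up weights and data; `hit_counts` and `quantization_error` are hypothetical helper names.

```python
import math
from collections import Counter

def hit_counts(weights, data):
    """Count, for each node index, the samples for which it is the BMU."""
    counts = Counter()
    for x in data:
        bmu = min(range(len(weights)), key=lambda v: math.dist(weights[v], x))
        counts[bmu] += 1
    return counts

def quantization_error(weights, data):
    """Mean distance between each sample and its closest weight vector."""
    return sum(min(math.dist(w, x) for w in weights) for x in data) / len(data)

# Hypothetical two-node map and three samples.
weights = [[0.0, 0.0], [10.0, 10.0]]
data = [[0.0, 1.0], [1.0, 0.0], [9.0, 10.0]]
print(hit_counts(weights, data))          # Counter({0: 2, 1: 1})
print(quantization_error(weights, data))  # 1.0
```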

SOM may be considered a nonlinear generalization of principal component analysis (PCA). [17] It has been shown, using both artificial and real geophysical data, that SOM has many advantages [18] [19] over conventional feature extraction methods such as empirical orthogonal functions (EOF) or PCA.

Originally, SOM was not formulated as a solution to an optimisation problem. Nevertheless, there have been several attempts to modify the definition of SOM and to formulate an optimisation problem which gives similar results. [20] For example, Elastic maps use the mechanical metaphor of elasticity to approximate principal manifolds: [21] the analogy is an elastic membrane and plate.

References

  1. Kohonen, Teuvo; Honkela, Timo (2007). "Kohonen Network". Scholarpedia. 2 (1): 1568. Bibcode:2007SchpJ...2.1568K. doi: 10.4249/scholarpedia.1568 .
  2. Kohonen, Teuvo (1982). "Self-Organized Formation of Topologically Correct Feature Maps". Biological Cybernetics. 43 (1): 59–69. doi:10.1007/bf00337288. S2CID   206775459.
  3. Von der Malsburg, C (1973). "Self-organization of orientation sensitive cells in the striate cortex". Kybernetik. 14 (2): 85–100. doi:10.1007/bf00288907. PMID   4786750. S2CID   3351573.
  4. Turing, Alan (1952). "The chemical basis of morphogenesis". Phil. Trans. R. Soc. 237 (641): 37–72. Bibcode:1952RSPTB.237...37T. doi:10.1098/rstb.1952.0012.
  5. Jaakko Hollmen (9 March 1996). "Self-Organizing Map (SOM)". Aalto University .
  6. Haykin, Simon (1999). "9. Self-organizing maps". Neural networks - A comprehensive foundation (2nd ed.). Prentice-Hall. ISBN 978-0-13-908385-3.
  7. Kohonen, Teuvo (2005). "Intro to SOM". SOM Toolbox. Retrieved 2006-06-18.
  8. Kohonen, Teuvo; Honkela, Timo (2011). "Kohonen network". Scholarpedia. 2 (1): 1568. Bibcode:2007SchpJ...2.1568K. doi: 10.4249/scholarpedia.1568 .
  9. Vrieze, O.J. (1995). "Kohonen Network" (PDF). Artificial Neural Networks. Lecture Notes in Computer Science. Vol. 931. University of Limburg, Maastricht. pp. 83–100. doi:10.1007/BFb0027024. ISBN 978-3-540-59488-8. Retrieved 1 July 2020.
  10. Kohonen, T. (2012) [1988]. Self-Organization and Associative Memory (2nd ed.). Springer. ISBN   978-3-662-00784-6.
  11. Ciampi, A.; Lechevallier, Y. (2000). "Clustering large, multi-level data sets: An approach based on Kohonen self organizing maps". In Zighed, D.A.; Komorowski, J.; Zytkow, J. (eds.). Principles of Data Mining and Knowledge Discovery: 4th European Conference, PKDD 2000 Lyon, France, September 13–16, 2000 Proceedings. Lecture notes in computer science. Vol. 1910. Springer. pp. 353–358. doi: 10.1007/3-540-45372-5_36 . ISBN   3-540-45372-5.
  12. Akinduko, A.A.; Mirkes, E.M.; Gorban, A.N. (2016). "SOM: Stochastic initialization versus principal components". Information Sciences. 364–365: 213–221. doi:10.1016/j.ins.2015.10.013.
  13. Illustration is prepared using free software: Mirkes, Evgeny M.; Principal Component Analysis and Self-Organizing Maps: applet, University of Leicester, 2011
  14. Ultsch, Alfred; Siemon, H. Peter (1990). "Kohonen's Self Organizing Feature Maps for Exploratory Data Analysis". In Widrow, Bernard; Angeniol, Bernard (eds.). Proceedings of the International Neural Network Conference (INNC-90), Paris, France, July 9–13, 1990. Vol. 1. Dordrecht, Netherlands: Kluwer. pp.  305–308. ISBN   978-0-7923-0831-7.
  15. Ultsch, Alfred (2003). U*-Matrix: A tool to visualize clusters in high dimensional data (Technical report). Department of Computer Science, University of Marburg. pp. 1–12. 36.
  16. Saadatdoost, Robab; Sim, Alex Tze Hiang; Jafarkarimi, Hosein (2011). "Application of self organizing map for knowledge discovery based in higher education data". Research and Innovation in Information Systems (ICRIIS), 2011 International Conference on. IEEE. doi:10.1109/ICRIIS.2011.6125693. ISBN   978-1-61284-294-3.
  17. Yin, Hujun. "Learning Nonlinear Principal Manifolds by Self-Organising Maps". Gorban et al. 2008 .
  18. Liu, Yonggang; Weisberg, Robert H (2005). "Patterns of Ocean Current Variability on the West Florida Shelf Using the Self-Organizing Map". Journal of Geophysical Research. 110 (C6): C06003. Bibcode:2005JGRC..110.6003L. doi: 10.1029/2004JC002786 .
  19. Liu, Yonggang; Weisberg, Robert H.; Mooers, Christopher N. K. (2006). "Performance Evaluation of the Self-Organizing Map for Feature Extraction". Journal of Geophysical Research. 111 (C5): C05018. Bibcode:2006JGRC..111.5018L. doi: 10.1029/2005jc003117 .
  20. Heskes, Tom (1999). "Energy Functions for Self-Organizing Maps". In Oja, Erkki; Kaski, Samuel (eds.). Kohonen Maps. Elsevier. pp. 303–315. doi:10.1016/B978-044450270-4/50024-3. ISBN   978-044450270-4.
  21. Gorban, Alexander N.; Kégl, Balázs; Wunsch, Donald C.; Zinovyev, Andrei, eds. (2008). Principal Manifolds for Data Visualization and Dimension Reduction. Lecture Notes in Computer Science and Engineering. Vol. 58. Springer. ISBN   978-3-540-73749-0.
  22. Zheng, G.; Vaishnavi, V. (2011). "A Multidimensional Perceptual Map Approach to Project Prioritization and Selection". AIS Transactions on Human-Computer Interaction. 3 (2): 82–103. doi: 10.17705/1thci.00028 .
  23. Taner, M. T.; Walls, J. D.; Smith, M.; Taylor, G.; Carr, M. B.; Dumas, D. (2001). "Reservoir characterization by calibration of self-organized map clusters". SEG Technical Program Expanded Abstracts 2001. Vol. 2001. pp. 1552–1555. doi:10.1190/1.1816406. S2CID   59155082.
  24. Chang, Wui Lee; Pang, Lie Meng; Tay, Kai Meng (March 2017). "Application of Self-Organizing Map to Failure Modes and Effects Analysis Methodology" (PDF). Neurocomputing. 249: 314–320. doi:10.1016/j.neucom.2016.04.073.
  25. Park, Young-Seuk; Tison, Juliette; Lek, Sovan; Giraudel, Jean-Luc; Coste, Michel; Delmas, François (2006-11-01). "Application of a self-organizing map to select representative species in multivariate analysis: A case study determining diatom distribution patterns across France". Ecological Informatics. 4th International Conference on Ecological Informatics. 1 (3): 247–257. Bibcode:2006EcInf...1..247P. doi:10.1016/j.ecoinf.2006.03.005. ISSN   1574-9541.
  26. Yilmaz, Hasan Ümitcan; Fouché, Edouard; Dengiz, Thomas; Krauß, Lucas; Keles, Dogan; Fichtner, Wolf (2019-04-01). "Reducing energy time series for energy system models via self-organizing maps". It - Information Technology. 61 (2–3): 125–133. doi:10.1515/itit-2019-0025. ISSN   2196-7032. S2CID   203160544.
  27. Kaski, Samuel (1997). "Data Exploration Using Self-Organizing Maps". Acta Polytechnica Scandinavica. Mathematics, Computing and Management in Engineering Series. 82. Espoo, Finland: Finnish Academy of Technology. ISBN   978-952-5148-13-8.
  28. Alahakoon, D.; Halgamuge, S.K.; Sirinivasan, B. (2000). "Dynamic Self Organizing Maps With Controlled Growth for Knowledge Discovery". IEEE Transactions on Neural Networks. 11 (3): 601–614. doi:10.1109/72.846732. PMID   18249788.
  29. Liou, C.-Y.; Tai, W.-P. (2000). "Conformality in the self-organization network". Artificial Intelligence. 116 (1–2): 265–286. doi:10.1016/S0004-3702(99)00093-4.
  30. Liou, C.-Y.; Kuo, Y.-T. (2005). "Conformal Self-organizing Map for a Genus Zero Manifold". The Visual Computer. 21 (5): 340–353. doi:10.1007/s00371-005-0290-6. S2CID   8677589.
  31. Shah-Hosseini, Hamed; Safabakhsh, Reza (April 2003). "TASOM: A New Time Adaptive Self-Organizing Map". IEEE Transactions on Systems, Man, and Cybernetics - Part B: Cybernetics. 33 (2): 271–282. doi:10.1109/tsmcb.2003.810442. PMID   18238177.
  32. Shah-Hosseini, Hamed (May 2011). "Binary Tree Time Adaptive Self-Organizing Map". Neurocomputing. 74 (11): 1823–1839. doi:10.1016/j.neucom.2010.07.037.
  33. Gorban, A.N.; Zinovyev, A. (2010). "Principal manifolds and graphs in practice: from molecular biology to dynamical systems". International Journal of Neural Systems. 20 (3): 219–232. arXiv:1001.1122. doi:10.1142/S0129065710002383. PMID 20556849. S2CID 2170982.
  34. Hua, H (2016). "Image and geometry processing with Oriented and Scalable Map". Neural Networks. 77: 1–6. doi:10.1016/j.neunet.2016.01.009. PMID   26897100.