Yasuo Matsuyama | |
---|---|
Born | March 23, 1947, Yokohama, Japan |
Nationality | Japanese |
Alma mater | Waseda University (Dr. Engineering, 1974); Stanford University (PhD, 1978) |
Known for | Alpha-EM algorithm |
Scientific career | |
Fields | Machine learning and human-aware information processing |
Institutions | Waseda University, Stanford University |
Theses | Studies on Stochastic Modeling of Neurons (Dr. Engineering, Waseda University); Process Distortion Measures and Signal Processing (PhD, Stanford University) |
Doctoral advisors | Jun'ichi Takagi, Kageo Akizuki, and Katsuhiko Shirai (Waseda University); Robert M. Gray (Stanford University) |
Website | http://www.f.waseda.jp/yasuo2/en/index.html |
Yasuo Matsuyama (born March 23, 1947) is a Japanese researcher in machine learning and human-aware information processing.
Matsuyama is a Professor Emeritus and an Honorary Researcher of the Research Institute of Science and Engineering of Waseda University.
Matsuyama received his bachelor’s, master’s, and doctoral degrees in electrical engineering from Waseda University in 1969, 1971, and 1974, respectively. His Doctor of Engineering dissertation was titled Studies on Stochastic Modeling of Neurons. [1] There, he contributed to the modeling of spiking neurons with stochastic pulse-frequency modulation. His advisors were Jun’ichi Takagi, Kageo Akizuki, and Katsuhiko Shirai.
Upon completing his doctoral work at Waseda University, he was sent to the United States as a Japan–U.S. exchange fellow under the joint program of the Japan Society for the Promotion of Science, the Fulbright Program, and the Institute of International Education. Through this exchange program, he completed his Ph.D. at Stanford University in 1978 with a dissertation titled Process Distortion Measures and Signal Processing. [2] There, he contributed to the theory of probabilistic distortion measures and its applications to speech encoding with spectral clustering, or vector quantization. His advisor was Robert M. Gray.
From 1977 to 1978, Matsuyama was a research assistant at the Information Systems Laboratory of Stanford University.
From 1979 to 1996, he was a faculty member at Ibaraki University, Japan, where his final position was professor and chairperson of the Information and System Sciences Major.
From 1996, he was a professor in the Department of Computer Science and Engineering at Waseda University. From 2011 to 2013, he was the director of the Media Network Center of Waseda University. After the Tōhoku earthquake and tsunami of March 11, 2011, he was in charge of the safety inquiry covering 65,000 students, staff, and faculty.
Since 2017, Matsuyama has been a Professor Emeritus and an Honorary Researcher of the Research Institute of Science and Engineering of Waseda University. Since 2018, he has served as acting president of the Waseda Electrical Engineering Society.
Matsuyama’s work on machine learning and human-aware information processing rests on dual foundations. His studies on competitive learning (vector quantization) for his Ph.D. at Stanford University led to his subsequent contributions to machine learning. His studies on stochastic spiking neurons [3] [4] for his Dr. Engineering at Waseda University opened applications of biological signals to machine learning. His works can thus be grouped according to these dual foundations.
Statistical machine learning algorithms: The use of the alpha-logarithmic likelihood ratio in the learning cycle led to the alpha-EM algorithm (alpha-expectation-maximization algorithm). [5] Because the alpha-logarithm includes the usual logarithm as a special case, the alpha-EM algorithm contains the EM algorithm (more precisely, the log-EM algorithm). The speedup of alpha-EM over log-EM comes from its ability to utilize past information. This use of messages from the past also yielded the alpha-HMM estimation algorithm (alpha-hidden-Markov-model estimation algorithm), [6] a generalized and faster version of the hidden-Markov-model estimation algorithm (HMM estimation algorithm).
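The relationship between the alpha-logarithm and the usual logarithm can be made concrete. A sketch, following the convention commonly used in the alpha-EM literature (the exact normalization may differ between papers):

```latex
\[
  L^{(\alpha)}(x) \;=\; \frac{2}{1+\alpha}\left(x^{(1+\alpha)/2} - 1\right),
  \qquad \alpha \neq -1 ,
\]
\[
  \lim_{\alpha \to -1} L^{(\alpha)}(x)
  \;=\; \lim_{\varepsilon \to 0} \frac{x^{\varepsilon} - 1}{\varepsilon}
  \;=\; \log x ,
  \qquad \varepsilon = \tfrac{1+\alpha}{2}.
\]
```

Since the case \(\alpha = -1\) recovers \(\log x\), replacing the log-likelihood ratio by its alpha-logarithmic counterpart strictly generalizes the log-EM iteration, and other choices of \(\alpha\) give the extra degree of freedom exploited for acceleration.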
Competitive learning on empirical data: Starting from the speech-compression studies at Stanford, Matsuyama developed generalized competitive learning algorithms: harmonic competition [7] and multiple descent cost competition. [8] The former realizes multiple-object optimization; the latter admits deformable centroids. Both algorithms generalize batch-mode vector quantization (simply called vector quantization) and successive-mode vector quantization (also called learning vector quantization).
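The batch-mode versus successive-mode distinction can be sketched in a few lines. This is a generic one-dimensional illustration, not Matsuyama's harmonic or multiple-descent-cost algorithms, whose multi-objective costs and deformable centroids go beyond this sketch:

```python
def batch_vq(data, centroids, iters=20):
    """Batch-mode VQ (Lloyd/k-means style): assign every sample to its
    nearest centroid, then move each centroid to the mean of its bucket."""
    centroids = list(centroids)
    for _ in range(iters):
        buckets = [[] for _ in centroids]
        for x in data:
            j = min(range(len(centroids)), key=lambda i: (x - centroids[i]) ** 2)
            buckets[j].append(x)
        for i, b in enumerate(buckets):
            if b:
                centroids[i] = sum(b) / len(b)
    return centroids

def online_vq(data, centroids, rate=0.1, epochs=20):
    """Successive-mode VQ (competitive learning): the winning centroid is
    nudged toward each sample as it arrives, one sample at a time."""
    centroids = list(centroids)
    for _ in range(epochs):
        for x in data:
            j = min(range(len(centroids)), key=lambda i: (x - centroids[i]) ** 2)
            centroids[j] += rate * (x - centroids[j])
    return centroids

# Two well-separated 1-D clusters around 0 and 10.
data = [0.0, 0.5, 1.0, 9.0, 9.5, 10.0]
print(sorted(batch_vq(data, [0.0, 5.0])))   # centroids settle near 0.5 and 9.5
print(sorted(online_vq(data, [0.0, 5.0])))  # similar result via incremental updates
```

The batch variant recomputes all assignments before each centroid move, while the successive variant updates the winner immediately; learning vector quantization adds class labels to the latter scheme.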
A hierarchy from the alpha-EM to vector quantization: Matsuyama contributed to generating and identifying the hierarchy of the above algorithms, including, within the class of vector quantization and competitive learning, the hierarchy of VQ methods.
Applications to human-aware information processing: These dual foundations led to applications in human-aware information processing.
The above theories and applications contribute to IoCT (Internet of Collaborative Things) and IoXT (http://www.asc-events.org/ASC17/Workshop.php).