| Bruno Adolphus Olshausen | |
|---|---|
| Alma mater | Stanford University (BS, MS), California Institute of Technology (PhD) |
| Scientific career | |
| Thesis | Neural routing circuits for forming invariant representations of visual objects (1994) |
Bruno Adolphus Olshausen is an American neuroscientist and professor at the University of California, Berkeley, known for his work on computational neuroscience, vision science, and sparse coding. He currently serves as a Professor in the Helen Wills Neuroscience Institute and the UC Berkeley School of Optometry, with an affiliated appointment in Electrical Engineering and Computer Sciences. He is also the Director of the Redwood Center for Theoretical Neuroscience at UC Berkeley.
Olshausen received his B.S. and M.S. degrees in Electrical Engineering from Stanford University in 1986 and 1987, respectively. He earned his Ph.D. in Computation and Neural Systems from the California Institute of Technology in 1994. After completing his doctoral studies, he held postdoctoral positions in the Department of Psychology at Cornell University and at the Center for Biological and Computational Learning at the Massachusetts Institute of Technology. [1] [2]
Olshausen has served in several editorial and advisory roles. In 2009, he was awarded a fellowship of the Wissenschaftskolleg zu Berlin and a fellowship of the Canadian Institute for Advanced Research's Neural Computation and Adaptive Perception program.
Olshausen's research focuses on understanding the information processing strategies employed by the visual system for tasks such as object recognition and scene analysis. His approach combines studying neural response properties with mathematical modeling to develop functional theories of vision. This work aims both to advance understanding of brain function and to develop new algorithms for image analysis based on biological principles. He has also contributed to technological applications, including image and signal processing, alternatives to backpropagation for unsupervised learning, memory storage and computation, and analog data compression systems.
One of Olshausen's most significant contributions is demonstrating how the principle of sparse coding can explain the response properties of neurons in visual cortex. His 1996 paper in Nature with David J. Field showed how the receptive field properties of simple cells in primary visual cortex (V1) could emerge from learning a sparse code for natural images. [3] The paper built on two earlier reports that provided additional technical details. [4] [5]
The paper argued that simple cells have Gabor-like receptive fields: localized, oriented, and bandpass. Earlier methods, such as the generalized Hebbian algorithm, yield Fourier-like receptive fields that are neither localized nor oriented; with sparse coding, such localized, oriented receptive fields do emerge.
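A 2D Gabor function is a Gaussian envelope multiplied by an oriented sinusoidal grating, which is what makes such a receptive field localized, oriented, and bandpass. The following sketch generates one such kernel in NumPy; the parameter values (size, wavelength, orientation, width) are illustrative choices, not values taken from the paper.

```python
import numpy as np

def gabor_kernel(size=16, wavelength=6.0, theta=np.pi / 4, sigma=3.0, phase=0.0):
    """A size x size Gabor kernel: Gaussian envelope times an oriented
    sinusoidal grating (hence localized, oriented, and bandpass)."""
    half = size // 2
    y, x = np.mgrid[-half:half, -half:half]        # pixel coordinates centred at 0
    x_rot = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the grating direction
    envelope = np.exp(-(x ** 2 + y ** 2) / (2.0 * sigma ** 2))
    carrier = np.cos(2.0 * np.pi * x_rot / wavelength + phase)
    return envelope * carrier

kernel = gabor_kernel()   # a 16 x 16 localized, oriented filter
```

Sparse coding produces filters of this shape as an emergent property of learning, rather than building them in by hand.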
Specifically, consider an image $I(x,y)$ and a set of receptive fields (basis functions) $\phi_i(x,y)$. An image can be approximately represented as a linear sum of the receptive fields: $I(x,y) \approx \sum_i a_i \phi_i(x,y)$. If so, then the image can be coded by the coefficient vector $a = (a_1, a_2, \ldots)$, a code which may have better properties than directly coding the pixel values of the image.
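In matrix form, with the receptive fields stacked as columns of a dictionary matrix, the reconstruction is a matrix-vector product and the code is the coefficient vector. A minimal sketch, where the patch size, basis count, and random values are illustrative assumptions rather than anything from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_basis = 64, 128                 # e.g. an 8x8 image patch, 128 basis functions
Phi = rng.normal(size=(n_pixels, n_basis))  # column i holds the receptive field phi_i

a = np.zeros(n_basis)                       # sparse coefficient vector: mostly zeros
active = rng.choice(n_basis, size=5, replace=False)
a[active] = rng.normal(size=5)

image = Phi @ a                             # the image as a linear sum of receptive fields
# `a` is the code for `image`; with only 5 non-zero entries it is a sparser
# description than the 64 raw pixel values.
```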
The algorithm minimizes a loss (energy) function of the form [4]

$$E = \sum_{x,y} \Big[ I(x,y) - \sum_i a_i \phi_i(x,y) \Big]^2 + \lambda \sum_i S(a_i),$$

where the first term is the image reconstruction loss and the second term is the sparsity loss, with $S$ a cost function that penalizes non-zero coefficients. Minimizing the first term leads to accurate image reconstruction, and minimizing the second term leads to sparse coefficients, that is, a vector $a$ with many near-zero entries. The hyperparameter $\lambda$ balances the importance of image reconstruction against sparsity. Learning alternates between inferring the coefficients $a_i$ for each image with the basis functions held fixed, and updating the basis functions $\phi_i$ with the coefficients held fixed.
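The following NumPy sketch implements this objective with alternating updates: an inner loop that infers the coefficients for one image with the basis fixed, and an outer step that adjusts the basis functions and renormalizes them. The choice of $S(a) = |a|$, the step sizes, and the random stand-in "patches" are assumptions for illustration, not the paper's exact settings.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels, n_basis, lam = 64, 128, 0.1          # illustrative sizes and sparsity weight
Phi = rng.normal(size=(n_pixels, n_basis))
Phi /= np.linalg.norm(Phi, axis=0)             # unit-norm basis functions

def loss(I, Phi, a, lam):
    """Reconstruction error plus sparsity penalty (S(a) = |a| assumed here)."""
    residual = I - Phi @ a
    return np.sum(residual ** 2) + lam * np.sum(np.abs(a))

def sparse_code(I, Phi, lam, steps=200, lr=0.01):
    """Inner loop: infer coefficients a for one image with the basis fixed."""
    a = np.zeros(Phi.shape[1])
    for _ in range(steps):
        grad = -2.0 * Phi.T @ (I - Phi @ a) + lam * np.sign(a)
        a -= lr * grad
    return a

def update_basis(I, Phi, a, lr=0.01):
    """Outer step: nudge the basis toward lower reconstruction error, then renormalize."""
    Phi += lr * np.outer(I - Phi @ a, a)
    Phi /= np.linalg.norm(Phi, axis=0)
    return Phi

for step in range(100):
    I = rng.normal(size=n_pixels)              # stand-in for a whitened natural image patch
    a = sparse_code(I, Phi, lam)
    Phi = update_basis(I, Phi, a)
    if step % 25 == 0:
        print(f"step {step}: E = {loss(I, Phi, a, lam):.3f}")   # objective on this patch
```

Trained on natural image patches rather than random noise, the learned basis functions take on the localized, oriented, bandpass form described above.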
Building on the 1996 paper, he developed the theory that the Gabor-like filters in V1 perform sparse coding with an overcomplete basis set, one that is well matched to the images occurring in the natural environment of humans. [6] [7]