Paul Werbos

Paul J. Werbos
Paul Werbos at the International Joint Conference on Neural Networks (IJCNN) in Seattle on 8 July 1991.
Born: September 4, 1947 (age 75)
Nationality: American
Alma mater: Harvard University
Known for: Backpropagation
Awards: IEEE Neural Network Pioneer Award (1995); IEEE Frank Rosenblatt Award (2022)
Scientific career
Fields: Social science, machine learning
Thesis: Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences (1974)
Doctoral advisor: Karl Deutsch
Other academic advisors: Yu-Chi Ho

Paul John Werbos (born 1947) is an American social scientist and machine learning pioneer. He is best known for his 1974 dissertation, which first described the process of training artificial neural networks through backpropagation of errors.[1] He was also a pioneer of recurrent neural networks.[2]

Werbos was one of the original three two-year presidents of the International Neural Network Society (INNS). In 1995, he was awarded the IEEE Neural Network Pioneer Award for the discovery of backpropagation and other basic neural network learning frameworks such as Adaptive Dynamic Programming.[3]

Werbos has also written on quantum mechanics and other areas of physics.[4][5] He is also interested in larger questions relating to consciousness, the foundations of physics, and human potential.

He served as a program director at the National Science Foundation for several years, until 2015.

Related Research Articles

Artificial neural network: Computational model used in machine learning, based on connected, hierarchical functions

Artificial neural networks (ANNs), usually simply called neural networks (NNs) or neural nets, are computing systems inspired by the biological neural networks that constitute animal brains.

An artificial neuron is a mathematical function conceived as a model of a biological neuron. Artificial neurons are the elementary units of an artificial neural network. An artificial neuron receives one or more inputs and sums them to produce an output. Usually each input is separately weighted, and the sum is passed through a non-linear function known as an activation function or transfer function. The transfer functions usually have a sigmoid shape, but they may also take the form of other non-linear functions, piecewise linear functions, or step functions. They are also often monotonically increasing, continuous, differentiable and bounded. Non-monotonic, unbounded and oscillating activation functions with multiple zeros, which outperform sigmoidal and ReLU-like activation functions on many tasks, have also been explored recently. The thresholding function has inspired logic gates referred to as threshold logic, applicable to building logic circuits that resemble brain processing; for example, new devices such as memristors have recently been used extensively to develop such logic.
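
As a concrete illustration of the weighted-sum-and-activation computation described above, here is a minimal sketch in Python with NumPy (not drawn from any particular source); the input values, weights, bias, and the choice of a sigmoid transfer function are all illustrative assumptions.

```python
import numpy as np

def sigmoid(z):
    """Sigmoid-shaped activation (transfer) function."""
    return 1.0 / (1.0 + np.exp(-z))

def artificial_neuron(inputs, weights, bias):
    """Weight each input separately, sum, and pass the total through the activation."""
    z = np.dot(weights, inputs) + bias   # weighted sum of the inputs
    return sigmoid(z)                    # non-linear transfer function

# Illustrative values: three inputs feeding one neuron.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.4, 0.3, -0.2])
print(artificial_neuron(x, w, bias=0.1))  # a single activation in (0, 1)
```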

Jürgen Schmidhuber: German computer scientist

Jürgen Schmidhuber is a German computer scientist most noted for his work in the fields of artificial intelligence, deep learning and artificial neural networks. He is a co-director of the Dalle Molle Institute for Artificial Intelligence Research in Lugano, in Ticino in southern Switzerland. According to Google Scholar, he received more than 100,000 scientific citations between 2016 and 2021. He has been referred to as the "father of modern AI" and the "father of deep learning."

Geoffrey Hinton: British-Canadian computer scientist and psychologist

Geoffrey Everest Hinton is a British-Canadian cognitive psychologist and computer scientist, most noted for his work on artificial neural networks. Since 2013, he has divided his time working for Google and the University of Toronto. In 2017, he co-founded and became the Chief Scientific Advisor of the Vector Institute in Toronto.

Backpropagation: Optimization algorithm for artificial neural networks

In machine learning, backpropagation is a widely used algorithm for training feedforward artificial neural networks. Generalizations of backpropagation exist for other artificial neural networks (ANNs), and for functions generally. These classes of algorithms are all referred to generically as "backpropagation". In fitting a neural network, backpropagation computes the gradient of the loss function with respect to the weights of the network for a single input–output example, and does so efficiently, unlike a naive direct computation of the gradient with respect to each weight individually. This efficiency makes it feasible to use gradient methods for training multilayer networks, updating weights to minimize loss; gradient descent, or variants such as stochastic gradient descent, are commonly used. The backpropagation algorithm works by computing the gradient of the loss function with respect to each weight by the chain rule, computing the gradient one layer at a time, iterating backward from the last layer to avoid redundant calculations of intermediate terms in the chain rule; this is an example of dynamic programming.
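
The layer-by-layer chain-rule computation can be sketched for a tiny two-layer network. The following Python/NumPy example is illustrative only: the layer sizes, sigmoid hidden activation, squared-error loss, and learning rate are assumptions, not a reference implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# One input-output example and a tiny two-layer network (sizes are illustrative).
x = rng.normal(size=3)          # input
y = np.array([1.0])             # target
W1 = rng.normal(size=(4, 3))    # first-layer weights
W2 = rng.normal(size=(1, 4))    # second-layer weights

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass: store intermediate activations for reuse in the backward pass.
h = sigmoid(W1 @ x)             # hidden activations
y_hat = W2 @ h                  # linear output layer
loss = 0.5 * np.sum((y_hat - y) ** 2)

# Backward pass: chain rule applied one layer at a time, last layer first.
dL_dyhat = y_hat - y                            # dL/dy_hat for squared error
dL_dW2 = np.outer(dL_dyhat, h)                  # gradient for output-layer weights
dL_dh = W2.T @ dL_dyhat                         # propagate the error to the hidden layer
dL_dz1 = dL_dh * h * (1.0 - h)                  # through the sigmoid derivative
dL_dW1 = np.outer(dL_dz1, x)                    # gradient for first-layer weights

# One gradient-descent step on each weight matrix.
lr = 0.1
W1 -= lr * dL_dW1
W2 -= lr * dL_dW2
```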

Recurrent neural network: Computational model used in machine learning

A recurrent neural network (RNN) is a class of artificial neural networks where connections between nodes can create a cycle, allowing output from some nodes to affect subsequent input to the same nodes. This allows it to exhibit temporal dynamic behavior. Derived from feedforward neural networks, RNNs can use their internal state (memory) to process variable length sequences of inputs. This makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition. Recurrent neural networks are theoretically Turing complete and can run arbitrary programs to process arbitrary sequences of inputs.
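
A minimal sketch of the internal-state (memory) idea, in Python with NumPy; the dimensions, tanh activation, and random weights are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Illustrative parameter shapes: 2-dimensional inputs, 4-dimensional hidden state.
W_xh = rng.normal(size=(4, 2))   # input-to-hidden weights
W_hh = rng.normal(size=(4, 4))   # hidden-to-hidden (recurrent) weights
b_h = np.zeros(4)

def rnn_forward(sequence):
    """Process a variable-length sequence, updating the internal state at each step."""
    h = np.zeros(4)                                   # initial memory
    for x_t in sequence:                              # inputs arrive one step at a time
        h = np.tanh(W_xh @ x_t + W_hh @ h + b_h)      # new state depends on the old state
    return h                                          # summary of the whole sequence

seq = [rng.normal(size=2) for _ in range(5)]          # a length-5 input sequence
print(rnn_forward(seq))
```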

Neural network: Structure in biology and artificial intelligence

A neural network is a network or circuit of biological neurons, or, in a modern sense, an artificial neural network, composed of artificial neurons or nodes. Thus, a neural network is either a biological neural network, made up of biological neurons, or an artificial neural network, used for solving artificial intelligence (AI) problems. The connections of the biological neuron are modeled in artificial neural networks as weights between nodes. A positive weight reflects an excitatory connection, while negative values mean inhibitory connections. All inputs are modified by a weight and summed. This activity is referred to as a linear combination. Finally, an activation function controls the amplitude of the output. For example, an acceptable range of output is usually between 0 and 1, or it could be between −1 and 1.

David Rumelhart: American psychologist

David Everett Rumelhart was an American psychologist who made many contributions to the formal analysis of human cognition, working primarily within the frameworks of mathematical psychology, symbolic artificial intelligence, and parallel distributed processing. He also admired formal linguistic approaches to cognition, and explored the possibility of formulating a formal grammar to capture the structure of stories.

Quantum neural network: Quantum mechanics in neural networks

Quantum neural networks are computational neural network models based on the principles of quantum mechanics. The first ideas on quantum neural computation were published independently in 1995 by Subhash Kak and Ron Chrisley, engaging with the theory of quantum mind, which posits that quantum effects play a role in cognitive function. However, typical research in quantum neural networks involves combining classical artificial neural network models with the advantages of quantum information in order to develop more efficient algorithms. One important motivation for these investigations is the difficulty of training classical neural networks, especially in big data applications. The hope is that features of quantum computing such as quantum parallelism or the effects of interference and entanglement can be used as resources. Since the technological implementation of a quantum computer is still at an early stage, such quantum neural network models are mostly theoretical proposals that await their full implementation in physical experiments.

Reservoir computing is a framework for computation derived from recurrent neural network theory that maps input signals into higher dimensional computational spaces through the dynamics of a fixed, non-linear system called a reservoir. After the input signal is fed into the reservoir, which is treated as a "black box," a simple readout mechanism is trained to read the state of the reservoir and map it to the desired output. The first key benefit of this framework is that training is performed only at the readout stage, as the reservoir dynamics are fixed. The second is that the computational power of naturally available systems, both classical and quantum mechanical, can be used to reduce the effective computational cost.
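
A rough sketch of this two-stage recipe in Python with NumPy, under illustrative assumptions: a small random tanh reservoir that is never trained, a toy delayed-copy task, and a ridge-regression readout fit in closed form.

```python
import numpy as np

rng = np.random.default_rng(2)

# Fixed, untrained reservoir: random input and recurrent weights are never updated.
n_in, n_res = 1, 50
W_in = rng.normal(scale=0.5, size=(n_res, n_in))
W_res = rng.normal(scale=1.0 / np.sqrt(n_res), size=(n_res, n_res))

def run_reservoir(inputs):
    """Drive the fixed non-linear reservoir with the input signal and collect its states."""
    states = []
    x = np.zeros(n_res)
    for u in inputs:
        x = np.tanh(W_in @ np.atleast_1d(u) + W_res @ x)
        states.append(x.copy())
    return np.array(states)

# Toy task: read out a delayed copy of a scalar input signal.
u = rng.normal(size=200)
target = np.roll(u, 1)                      # desired output: the previous input
X = run_reservoir(u)

# Training happens only at the readout: ridge regression from reservoir states to targets.
ridge = 1e-3
W_out = np.linalg.solve(X.T @ X + ridge * np.eye(n_res), X.T @ target)
prediction = X @ W_out
```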

A native of Terre Haute, Indiana, Stuart E. Dreyfus is professor emeritus at the University of California, Berkeley, in the Industrial Engineering and Operations Research Department. While at the RAND Corporation he was a programmer of the JOHNNIAC computer, and he coauthored Applied Dynamic Programming with Richard Bellman. Following that work, he was encouraged to pursue a Ph.D., which he completed in applied mathematics at Harvard University in 1964, on the calculus of variations. In 1962, Dreyfus simplified the dynamic programming-based derivation of backpropagation using only the chain rule. He also coauthored Mind Over Machine with his brother Hubert Dreyfus in 1986.

Arthur Earl Bryson Jr. is the Paul Pigott Professor of Engineering Emeritus at Stanford University and the "father of modern optimal control theory". With Henry J. Kelley, he also pioneered an early version of the backpropagation procedure, now widely used for machine learning and artificial neural networks.

Yann LeCun: French computer scientist (born 1960)

Yann André LeCun is a French computer scientist working primarily in the fields of machine learning, computer vision, mobile robotics and computational neuroscience. He is Silver Professor at the Courant Institute of Mathematical Sciences at New York University, and Vice President and Chief AI Scientist at Meta.

Kunihiko Fukushima is a Japanese computer scientist, most noted for his work on artificial neural networks and deep learning. He is currently working part-time as a Senior Research Scientist at the Fuzzy Logic Systems Institute in Fukuoka, Japan.

Backpropagation through time (BPTT) is a gradient-based technique for training certain types of recurrent neural networks. It can be used to train Elman networks. The algorithm was independently derived by numerous researchers.
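
A sketch of BPTT on a small Elman-style network, in Python with NumPy: the forward pass unrolls the network over the sequence, and the backward pass walks from the last time step to the first, accumulating gradients into the shared weights. All sizes, the tanh activation, and the squared-error loss are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)

# Elman-style recurrent cell with weights shared across all time steps.
W_xh = rng.normal(size=(3, 2))   # input-to-hidden
W_hh = rng.normal(size=(3, 3))   # hidden-to-hidden
W_hy = rng.normal(size=(1, 3))   # hidden-to-output

xs = [rng.normal(size=2) for _ in range(4)]    # input sequence
ys = [rng.normal(size=1) for _ in range(4)]    # target at each time step

# Forward pass: unroll the network over the sequence, keeping every hidden state.
hs = [np.zeros(3)]
outs = []
for x in xs:
    hs.append(np.tanh(W_xh @ x + W_hh @ hs[-1]))
    outs.append(W_hy @ hs[-1])

# Backward pass: walk the unrolled graph from the last time step to the first,
# accumulating gradients into the shared weight matrices.
dW_xh, dW_hh, dW_hy = np.zeros_like(W_xh), np.zeros_like(W_hh), np.zeros_like(W_hy)
dh_next = np.zeros(3)                          # gradient flowing in from later time steps
for t in reversed(range(len(xs))):
    dy = outs[t] - ys[t]                       # squared-error gradient at step t
    dW_hy += np.outer(dy, hs[t + 1])
    dh = W_hy.T @ dy + dh_next                 # error from the output plus from the future
    dz = dh * (1.0 - hs[t + 1] ** 2)           # through the tanh derivative
    dW_xh += np.outer(dz, xs[t])
    dW_hh += np.outer(dz, hs[t])
    dh_next = W_hh.T @ dz                      # pass the error one step further back in time
```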

Deep learning: Branch of machine learning

Deep learning is part of a broader family of machine learning methods based on artificial neural networks with representation learning. Learning can be supervised, semi-supervised or unsupervised.

A recursive neural network is a kind of deep neural network created by applying the same set of weights recursively over a structured input, to produce a structured prediction over variable-size input structures, or a scalar prediction on it, by traversing a given structure in topological order. Recursive neural networks, sometimes abbreviated as RvNNs, have been successful, for instance, in learning sequence and tree structures in natural language processing, mainly phrase and sentence continuous representations based on word embeddings. RvNNs were first introduced to learn distributed representations of structure, such as logical terms. Models and general frameworks have been developed in further works since the 1990s.
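
A minimal sketch of the core idea, applying one shared weight matrix recursively over a small binary tree in Python with NumPy; the tree, vector dimension, and tanh combiner are illustrative assumptions rather than any specific published model.

```python
import numpy as np

rng = np.random.default_rng(4)

dim = 4
W = rng.normal(size=(dim, 2 * dim))   # one shared weight matrix, reused at every tree node
bias = np.zeros(dim)

def encode(node):
    """Recursively encode a tree: leaves are vectors, internal nodes combine their children."""
    if isinstance(node, np.ndarray):                 # leaf, e.g. a word embedding
        return node
    left, right = (encode(child) for child in node)  # encode both subtrees first
    return np.tanh(W @ np.concatenate([left, right]) + bias)  # same weights at every level

# Illustrative binary tree over four leaf embeddings: ((a, b), (c, d)).
a, b, c, d = (rng.normal(size=dim) for _ in range(4))
tree = ((a, b), (c, d))
print(encode(tree))   # one fixed-size vector representing the whole structure
```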

AlexNet: Convolutional neural network

AlexNet is the name of a convolutional neural network (CNN) architecture, designed by Alex Krizhevsky in collaboration with Ilya Sutskever and Geoffrey Hinton, who was Krizhevsky's Ph.D. advisor.

The history of artificial neural networks (ANN) began with Warren McCulloch and Walter Pitts (1943) who created a computational model for neural networks based on algorithms called threshold logic. This model paved the way for research to split into two approaches. One approach focused on biological processes while the other focused on the application of neural networks to artificial intelligence. This work led to work on nerve networks and their link to finite automata.

Frank L. Lewis is an American electrical engineer, academic and researcher. He is a professor of electrical engineering, Moncrief-O’Donnell Endowed Chair, and head of Advanced Controls and Sensors Group at The University of Texas at Arlington (UTA). He is a member of UTA Academy of Distinguished Teachers and a charter member of UTA Academy of Distinguished Scholars.

References

  1. The thesis, and some supplementary information, can be found in his book: Werbos, Paul J. (1994). The Roots of Backpropagation: From Ordered Derivatives to Neural Networks and Political Forecasting. New York: John Wiley & Sons. ISBN 0-471-59897-6.
  2. Werbos, P. (1990). "Backpropagation Through Time: What It Does and How to Do It". Proceedings of the IEEE. 78 (10): 1550–1560. doi:10.1109/5.58337.
  3. "Award Recipients | IEEE Computational Intelligence Society". cis.ieee.org. Archived from the original on 2013-01-17. Retrieved 2017-09-12.
  4. Werbos, Paul J. (2005). "A Conjecture About Fermi–Bose Equivalence". arXiv:hep-th/0505023. Bibcode:2005hep.th....5023W.
  5. "Discussion with Paul Werbos on the Nature of Quantum Nonlocality". December 7, 2012.