Seppo Linnainmaa

Seppo Ilmari Linnainmaa (born 28 September 1945) is a Finnish mathematician and computer scientist, best known for introducing the reverse mode of automatic differentiation, the algorithm underlying the modern version of backpropagation.

Biography

He was born in Pori.[1] He received his MSc in 1970, introducing a reverse mode of automatic differentiation in his MSc thesis.[2][3] In 1974 he obtained the first doctorate ever awarded in computer science at the University of Helsinki.[4] In 1976 he became an assistant professor, and from 1984 to 1985 he was a visiting professor at the University of Maryland, USA. From 1986 to 1989 he was chairman of the Finnish Artificial Intelligence Society. From 1989 until his retirement in 2007 he was a research professor at the VTT Technical Research Centre of Finland.

Backpropagation

Explicit, efficient error backpropagation in arbitrary, discrete, possibly sparsely connected, neural network-like networks was first described in Linnainmaa's 1970 master's thesis,[2][5] albeit without reference to neural networks.[6] There he introduced the reverse mode of automatic differentiation (AD): to compute the derivative of a differentiable composite function represented as a graph, the chain rule is applied recursively to the building blocks of the function, working backwards from the output.[4][2][5][7] Linnainmaa was the first to publish the method; Gerardi Ostrowski had used it some five years earlier in the context of certain process models in chemical engineering, but did not publish it.
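
The idea can be illustrated with a small sketch. The following is a minimal, self-contained Python implementation of reverse-mode AD over an explicit computation graph; the names (Var, backward, sin) are hypothetical choices for illustration, not Linnainmaa's notation. Each operation records its local partial derivatives, and a single reverse sweep applies the chain rule to accumulate the derivative of the output with respect to every input.

```python
# Minimal reverse-mode automatic differentiation (a sketch, not
# Linnainmaa's original formulation). Every Var remembers the nodes
# it was computed from and the local partial derivative along each
# edge; backward() then applies the chain rule once per edge.
import math

class Var:
    def __init__(self, value, parents=()):
        self.value = value        # forward-pass result
        self.parents = parents    # pairs of (parent Var, local partial)
        self.grad = 0.0           # d(output)/d(this node), set by backward()

    def __add__(self, other):
        return Var(self.value + other.value, [(self, 1.0), (other, 1.0)])

    def __mul__(self, other):
        return Var(self.value * other.value,
                   [(self, other.value), (other, self.value)])

def sin(x):
    return Var(math.sin(x.value), [(x, math.cos(x.value))])

def backward(output):
    # Topologically order the graph so each node's gradient is fully
    # accumulated before it is propagated to its parents.
    order, seen = [], set()
    def visit(node):
        if id(node) not in seen:
            seen.add(id(node))
            for parent, _ in node.parents:
                visit(parent)
            order.append(node)
    visit(output)
    output.grad = 1.0
    for node in reversed(order):
        for parent, local in node.parents:
            parent.grad += node.grad * local

# f(x, y) = sin(x * y) + x, evaluated at x = 2, y = 3
x, y = Var(2.0), Var(3.0)
f = sin(x * y) + x
backward(f)
print(x.grad)  # equals y*cos(x*y) + 1
print(y.grad)  # equals x*cos(x*y)
```

One reverse sweep yields the partial derivatives with respect to all inputs at once, at a cost proportional to a single forward evaluation; this is what makes reverse mode efficient for functions with many inputs and few outputs.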

As faster computers emerged, the method came into heavy use in numerous applications. For example, backpropagation of errors in multilayer perceptrons, a core technique in machine learning, is a special case of reverse-mode AD: the network is the computation graph, and the reverse sweep yields the gradient of the loss with respect to every weight in a single pass, as sketched below.
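
To make that special case concrete, here is a hedged sketch of backpropagation in a small two-layer perceptron written with NumPy; the shapes, the tanh activation, and the squared-error loss are illustrative assumptions, not taken from the sources above. The reverse sweep in the second half of the code is exactly reverse-mode AD specialized to a layered graph.

```python
# Backpropagation in a tiny 2-layer perceptron, written out by hand to
# show it is reverse-mode AD applied to a layered computation graph.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))       # 4 samples, 3 input features
t = rng.normal(size=(4, 1))       # regression targets
W1 = rng.normal(size=(3, 5)); b1 = np.zeros(5)
W2 = rng.normal(size=(5, 1)); b2 = np.zeros(1)

# Forward pass: evaluate the composite function layer by layer.
z1 = x @ W1 + b1
h1 = np.tanh(z1)
y = h1 @ W2 + b2
loss = 0.5 * np.mean((y - t) ** 2)

# Reverse sweep: propagate d(loss)/d(node) back through the graph.
dy = (y - t) / y.size             # derivative of the mean squared error
dW2 = h1.T @ dy                   # chain rule through y = h1 @ W2 + b2
db2 = dy.sum(axis=0)
dh1 = dy @ W2.T
dz1 = dh1 * (1.0 - h1 ** 2)       # chain rule through tanh
dW1 = x.T @ dz1                   # chain rule through z1 = x @ W1 + b1
db1 = dz1.sum(axis=0)

# Finite-difference check on one weight confirms the analytic gradient.
eps = 1e-6
W1[0, 0] += eps
loss2 = 0.5 * np.mean((np.tanh(x @ W1 + b1) @ W2 + b2 - t) ** 2)
W1[0, 0] -= eps
print(np.isclose((loss2 - loss) / eps, dW1[0, 0], atol=1e-4))  # True
```

Each intermediate gradient is computed once and reused, so the backward pass costs only a small constant multiple of the forward pass, regardless of the number of weights.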

Notes

  1. Ellonen, Leena, ed. (2008). Suomen professorit 1640–2007 (in Finnish). Helsinki: Professoriliitto. p. 405. ISBN 978-952-99281-1-8.
  2. Linnainmaa, Seppo (1970). Algoritmin kumulatiivinen pyöristysvirhe yksittäisten pyöristysvirheiden Taylor-kehitelmänä [The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors] (PDF) (Thesis) (in Finnish). pp. 6–7.
  3. "Seppo Linnainmaa (Mathematician and Computer Scientist)". OnThisDay.com. Retrieved 2024-04-06.
  4. Griewank, Andreas (2012). "Who Invented the Reverse Mode of Differentiation?" (PDF). Documenta Mathematica, Extra Volume ISMP. pp. 389–400. S2CID 15568746.
  5. Linnainmaa, Seppo (1976). "Taylor expansion of the accumulated rounding error". BIT Numerical Mathematics. 16 (2): 146–160. doi:10.1007/BF01931367. S2CID 122357351.
  6. Schmidhuber, Jürgen (2015). "Who Invented Backpropagation?"
  7. Griewank, Andreas; Walther, Andrea (2008). Principles and Techniques of Algorithmic Differentiation (2nd ed.). SIAM.
