Mark I Perceptron

Last updated February 03, 2025

The Mark I Perceptron was a pioneering supervised image classification learning system developed by Frank Rosenblatt in 1958. It was the first implementation of an Artificial Intelligence (AI) machine. It differs from the Perceptron which is a software architecture proposed in 1943 by Warren McCulloch and Walter Pitts,^[1] which was also employed in Mark I, and enhancements of which have continued to be an integral part of cutting edge AI technologies like the Transformer.

Architecture

The Mark I Perceptron was organized into three layers:^[2]

A set of sensory units which receive optical input
A set of association units, each of which fire based on input from multiple sensory units
A set of response units, which fire based on input from multiple association units

The connection between sensory units and association units were random. The working of association units was very similar to the response units.^[2] Different versions of the Mark I used different numbers of units in each of the layers.^[3]

Capabilities

In his 1957 proposal for funding for development of the "Cornell Photoperceptron", Rosenblatt claimed:^[4]

"Devices of this sort are expected ultimately to be capable of concept formation, language translation, collation of military intelligence, and the solution of problems through inductive logic."

With the first version of the Mark I Perceptron as early as 1958, Rosenblatt demonstrated a simple binary classification experiment, namely distinguishing between sheets of paper marked on the right versus those marked on the left side.^[5]

One of the later experiments distinguished a square from a circle printed on paper. The shapes were perfect and their sizes fixed; the only variation was in their position and orientation. The Mark I Perceptron achieved 99.8% accuracy on a test dataset with 500 neurons in a single layer. The size of the training dataset was 10,000 example images. It took 3 seconds for the training pipeline to go through a single image. Higher accuracy was observed with thick outline figures compared to solid figures, likely because outline figures reduced overfitting.^[3]

Another experiment distinguished between a square and a diamond for which 100% accuracy was achieved with only 60 training images, with a Perceptron having 1,000 neurons in a single layer. The time taken to process each training input for this larger perceptron was 15 seconds. The only variation was in position of the image, since rotation would have been ambiguous.

In that same experiment, it could distinguish between the letters X and E with 100% accuracy when trained with only 20 images (10 images of each letter). Variations in the images included both position and rotation by up to 30 degrees. When variation in rotation was increased to any angle (both in training and test datasets), the accuracy reduced to 90% with 60 training images (30 images of each letter).^[3]

For distinguishing between the letters E and F, a more challenging problem due to their similarity, the same 1,000 neuron perceptron achieved an accuracy of more than 80% with 60 training images. Variation was only in the position of the image, with no rotation.^[3]

Related Research Articles

In machine learning, a neural network is a model inspired by the structure and function of biological neural networks in animal brains.

In machine learning, the perceptron is an algorithm for supervised learning of binary classifiers. A binary classifier is a function which can decide whether or not an input, represented by a vector of numbers, belongs to some specific class. It is a type of linear classifier, i.e. a classification algorithm that makes its predictions based on a linear predictor function combining a set of weights with the feature vector.

Machine learning (ML) is a field of study in artificial intelligence concerned with the development and study of statistical algorithms that can learn from data and generalize to unseen data, and thus perform tasks without explicit instructions. Within a subdiscipline in machine learning, advances in the field of deep learning have allowed neural networks, a class of statistical algorithms, to surpass many previous machine learning approaches in performance.

Connectionism is an approach to the study of human mental processes and cognition that utilizes mathematical models known as connectionist networks or artificial neural networks.

An artificial neuron is a mathematical function conceived as a model of a biological neuron in a neural network. The artificial neuron is the elementary unit of an artificial neural network.

In machine learning, a common task is the study and construction of algorithms that can learn from and make predictions on data. Such algorithms function by making data-driven predictions or decisions, through building a mathematical model from input data. These input data used to build the model are usually divided into multiple data sets. In particular, three data sets are commonly used in different stages of the creation of the model: training, validation, and test sets.

Recurrent neural networks (RNNs) are a class of artificial neural network commonly used for sequential data processing. Unlike feedforward neural networks, which process data in a single pass, RNNs process data across multiple time steps, making them well-adapted for modelling and processing text, speech, and time series.

Feedforward refers to recognition-inference architecture of neural networks. Artificial neural network architectures are based on inputs multiplied by weights to obtain outputs (inputs-to-output): feedforward. Recurrent neural networks, or neural networks with loops allow information from later processing stages to feed back to earlier stages for sequence processing. However, at every stage of inference a feedforward multiplication remains the core, essential for backpropagation or backpropagation through time. Thus neural networks cannot contain feedback like negative feedback or positive feedback where the outputs feed back to the very same inputs and modify them, because this forms an infinite loop which is not possible to rewind in time to generate an error signal through backpropagation. This issue and nomenclature appear to be a point of confusion between some computer scientists and scientists in other fields studying brain networks.

Frank Rosenblatt was an American psychologist notable in the field of artificial intelligence. He is sometimes called the father of deep learning for his pioneering work on artificial neural networks.

In deep learning, a multilayer perceptron (MLP) is a name for a modern feedforward neural network consisting of fully connected neurons with nonlinear activation functions, organized in layers, notable for being able to distinguish data that is not linearly separable.

Quantum neural networks are computational neural network models which are based on the principles of quantum mechanics. The first ideas on quantum neural computation were published independently in 1995 by Subhash Kak and Ron Chrisley, engaging with the theory of quantum mind, which posits that quantum effects play a role in cognitive function. However, typical research in quantum neural networks involves combining classical artificial neural network models with the advantages of quantum information in order to develop more efficient algorithms. One important motivation for these investigations is the difficulty to train classical neural networks, especially in big data applications. The hope is that features of quantum computing such as quantum parallelism or the effects of interference and entanglement can be used as resources. Since the technological implementation of a quantum computer is still in a premature stage, such quantum neural network models are mostly theoretical proposals that await their full implementation in physical experiments.

ADALINE is an early single-layer artificial neural network and the name of the physical device that implemented it. It was developed by professor Bernard Widrow and his doctoral student Marcian Hoff at Stanford University in 1960. It is based on the perceptron and consists of weights, a bias, and a summation function. The weights and biases were implemented by rheostats, and later, memistors.

<span class="mw-page-title-main">Spiking neural network</span> Artificial neural network that mimics neurons

Spiking neural networks (SNNs) are artificial neural networks (ANN) that more closely mimic natural neural networks. These models leverage timing of discrete spikes as the main information carrier.

<i>Perceptrons</i> (book) Book by Marvin Minsky and Seymour Papert

Perceptrons: An Introduction to Computational Geometry is a book written by Marvin Minsky and Seymour Papert and published in 1969. An edition with handwritten corrections and additions was released in the early 1970s. An expanded edition was further published in 1988 (ISBN 9780262631112) after the revival of neural networks, containing a chapter dedicated to counter the criticisms made of it in the 1980s.

There are many types of artificial neural networks (ANN).

Deep learning is a subset of machine learning that focuses on utilizing neural networks to perform tasks such as classification, regression, and representation learning. The field takes inspiration from biological neuroscience and is centered around stacking artificial neurons into layers and "training" them to process data. The adjective "deep" refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.

A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns features by itself via filter optimization. This type of deep learning network has been applied to process and make predictions from many different types of data including text, images and audio. Convolution-based networks are the de-facto standard in deep learning-based approaches to computer vision and image processing, and have only recently been replaced—in some cases—by newer deep learning architectures such as the transformer. Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, for each neuron in the fully-connected layer, 10,000 weights would be required for processing an image sized 100 × 100 pixels. However, applying cascaded convolution kernels, only 25 weights for each convolutional layer are required to process 5x5-sized tiles. Higher-layer features are extracted from wider context windows, compared to lower-layer features.

<span class="mw-page-title-main">Residual neural network</span> Type of artificial neural network

A residual neural network is a deep learning architecture in which the layers learn residual functions with reference to the layer inputs. It was developed in 2015 for image recognition, and won the ImageNet Large Scale Visual Recognition Challenge of that year.

Artificial neural networks (ANNs) are models created using machine learning to perform a number of tasks. Their creation was inspired by biological neural circuitry. While some of the computational implementations ANNs relate to earlier discoveries in mathematics, the first implementation of ANNs was by psychologist Frank Rosenblatt, who developed the perceptron. Little research was conducted on ANNs in the 1970s and 1980s, with the AAAI calling this period an "AI winter".

A neural network is a group of interconnected units called neurons that send signals to one another. Neurons can be either biological cells or mathematical models. While individual neurons are simple, many of them together in a network can perform complex tasks. There are two main types of neural networks:

References

↑ McCulloch, Warren S.; Pitts, Walter (1943-12-01). "A logical calculus of the ideas immanent in nervous activity". The Bulletin of Mathematical Biophysics. 5 (4): 115–133. doi:10.1007/BF02478259. ISSN 1522-9602.
1 2 Rosenblatt, F. (1958). "The perceptron: A probabilistic model for information storage and organization in the brain". Psychological Review. 65 (6): 386–408. doi:10.1037/h0042519. ISSN 1939-1471. PMID 13602029.
1 2 3 4 Rosenblatt, Frank (March 1960). "Perceptron Simulation Experiments". Proceedings of the IRE. 48 (3): 301–309. doi:10.1109/JRPROC.1960.287598. ISSN 0096-8390.
↑ Rosenblatt, Frank (January 1957). "The Perceptron—a perceiving and recognizing automaton" (PDF). Cornell Aeronautical Laboratory.
↑ "Professor's perceptron paved the way for AI – 60 years too soon | Cornell Chronicle". news.cornell.edu. Retrieved 2024-10-08.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] McCulloch, Warren S.; Pitts, Walter (1943-12-01). "A logical calculus of the ideas immanent in nervous activity". The Bulletin of Mathematical Biophysics. 5 (4): 115–133. doi:10.1007/BF02478259. ISSN 1522-9602.

[:0-2] 1 2 Rosenblatt, F. (1958). "The perceptron: A probabilistic model for information storage and organization in the brain". Psychological Review. 65 (6): 386–408. doi:10.1037/h0042519. ISSN 1939-1471. PMID 13602029.

[:1-3] 1 2 3 4 Rosenblatt, Frank (March 1960). "Perceptron Simulation Experiments". Proceedings of the IRE. 48 (3): 301–309. doi:10.1109/JRPROC.1960.287598. ISSN 0096-8390.

[4] Rosenblatt, Frank (January 1957). "The Perceptron—a perceiving and recognizing automaton" (PDF). Cornell Aeronautical Laboratory.

[5] "Professor's perceptron paved the way for AI – 60 years too soon | Cornell Chronicle". news.cornell.edu. Retrieved 2024-10-08.

[1]

[2]

[3]

[4]

[5]

Mark I Perceptron

Contents

Architecture

Capabilities

See also

Related Research Articles

References