Thought vector

Last updated September 15, 2024

Thought vector is a term popularized by Geoffrey Hinton, the prominent deep-learning researcher, for vectors based on natural language.^[1] Google, where Hinton worked from 2013 to 2023, uses such vectors to improve its search results.^[2]

Related Research Articles

Geoffrey Everest Hinton is a British-Canadian computer scientist and cognitive psychologist, most noted for his work on artificial neural networks. From 2013 to 2023, he divided his time working for Google and the University of Toronto, before publicly announcing his departure from Google in May 2023, citing concerns about the risks of artificial intelligence (AI) technology. In 2017, he co-founded and became the chief scientific advisor of the Vector Institute in Toronto.

Boltzmann machine: Type of stochastic recurrent neural network

A Boltzmann machine, named after Ludwig Boltzmann, is a stochastic spin-glass model with an external field, i.e., a Sherrington–Kirkpatrick model, that is a stochastic Ising model. It is a statistical physics technique applied in the context of cognitive science. It is also classified as a Markov random field.

Stochastic gradient descent is an iterative method for optimizing an objective function with suitable smoothness properties. It can be regarded as a stochastic approximation of gradient descent optimization, since it replaces the actual gradient by an estimate thereof. Especially in high-dimensional optimization problems this reduces the very high computational burden, achieving faster iterations in exchange for a lower convergence rate.
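The idea of replacing the actual gradient with an estimate can be sketched in a few lines. The following is a minimal illustration, not a reference implementation: the function name `sgd_fit_slope`, the toy data, the learning rate and the epoch count are all assumptions chosen for the example. It fits a single slope `w` in `y ≈ w * x`, updating `w` from one randomly ordered sample at a time instead of from the whole dataset.

```python
import random


def sgd_fit_slope(xs, ys, lr=0.01, epochs=200, seed=0):
    """Fit y ≈ w * x by minimizing squared error with per-sample updates."""
    rng = random.Random(seed)
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a random order each epoch
        for i in indices:
            # Gradient of the single-sample loss (w * x_i - y_i)^2 w.r.t. w.
            # This one-sample value is the estimate used in place of the
            # gradient over the full dataset.
            grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]
            w -= lr * grad
    return w


# Hypothetical noise-free toy data generated with a true slope of 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0 * x for x in xs]
w = sgd_fit_slope(xs, ys)
```

Each update touches a single sample, so an iteration is cheap. On noisy data the estimate of `w` would fluctuate around the optimum rather than settle exactly, which is the convergence trade-off described above.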

An autoencoder is a type of artificial neural network used to learn efficient codings of unlabeled data. An autoencoder learns two functions: an encoding function that transforms the input data, and a decoding function that recreates the input data from the encoded representation. The autoencoder learns an efficient representation (encoding) for a set of data, typically for dimensionality reduction.
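The encoding and decoding functions can be illustrated with a deliberately tiny linear autoencoder. This is a sketch under stated assumptions: the function name `train_linear_autoencoder`, the toy data (2-D points lying on one line), the initial weights, the learning rate and the step count are all invented for the example, and practical autoencoders use nonlinear layers and a deep learning library. The encoder compresses each 2-D input to a single number and the decoder tries to recreate the input from it.

```python
def train_linear_autoencoder(data, lr=0.01, steps=500):
    """Train a 2-D -> 1-D -> 2-D linear autoencoder with batch gradient descent."""
    enc = [0.1, 0.1]  # encoder weights: code z = enc . x
    dec = [0.1, 0.1]  # decoder weights: reconstruction = dec * z
    n = len(data)

    def loss():
        # Mean squared reconstruction error over the dataset.
        total = 0.0
        for x in data:
            z = enc[0] * x[0] + enc[1] * x[1]
            total += sum((dec[k] * z - x[k]) ** 2 for k in range(2))
        return total / n

    initial_loss = loss()
    for _ in range(steps):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        for x in data:
            z = enc[0] * x[0] + enc[1] * x[1]
            residual = [dec[k] * z - x[k] for k in range(2)]
            # Derivative of the squared error with respect to the code z.
            dz = 2.0 * sum(residual[k] * dec[k] for k in range(2))
            for k in range(2):
                g_dec[k] += 2.0 * residual[k] * z / n
                g_enc[k] += dz * x[k] / n
        for k in range(2):
            enc[k] -= lr * g_enc[k]
            dec[k] -= lr * g_dec[k]
    return enc, dec, initial_loss, loss()


# Hypothetical toy data: every point lies on the line y = 2x,
# so a one-number code is enough to reconstruct it.
points = [(1.0, 2.0), (2.0, 4.0), (-1.0, -2.0), (0.5, 1.0)]
enc, dec, initial_loss, final_loss = train_linear_autoencoder(points)
```

Because the data has only one underlying degree of freedom, the 1-D code loses nothing, which is the dimensionality reduction the paragraph refers to.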

Activation function: Artificial neural network node function

The activation function of a node in an artificial neural network is a function that calculates the output of the node based on its individual inputs and their weights. Nontrivial problems can be solved using only a few nodes if the activation function is nonlinear. Modern activation functions include the GELU, a smooth version of the ReLU that was used in the 2018 BERT model; the logistic (sigmoid) function, used in the 2012 speech recognition model developed by Hinton et al.; and the ReLU, used in the 2012 AlexNet computer vision model and in the 2015 ResNet model.
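The three activation functions named above have short closed forms, sketched here in plain Python. The helper `node_output` and its optional `bias` parameter are assumptions added for illustration; the GELU is written in its exact form, x·Φ(x), where Φ is the standard normal cumulative distribution function.

```python
import math


def sigmoid(x):
    """Logistic function: squashes any real input into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def relu(x):
    """Rectified linear unit: passes positive inputs, zeroes the rest."""
    return max(0.0, x)


def gelu(x):
    """Gaussian error linear unit, x * Phi(x), a smooth relative of the ReLU."""
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))


def node_output(inputs, weights, activation, bias=0.0):
    """Apply an activation to the weighted sum of a node's inputs.

    The bias term is an assumption of this sketch; it defaults to zero.
    """
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    return activation(weighted_sum)
```

For example, `node_output([1.0, 2.0], [0.5, -1.0], relu)` computes the weighted sum -1.5 and returns 0.0, because the ReLU clips negative values.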

Yann André LeCun is a French-American computer scientist working primarily in the fields of machine learning, computer vision, mobile robotics and computational neuroscience. He is the Silver Professor of the Courant Institute of Mathematical Sciences at New York University and Vice-President, Chief AI Scientist at Meta.

There are many types of artificial neural networks (ANN).

Deep learning is a subset of machine learning methods based on neural networks with representation learning. The field takes inspiration from biological neuroscience and is centered around stacking artificial neurons into layers and "training" them to process data. The adjective "deep" refers to the use of multiple layers in the network. Methods used can be either supervised, semi-supervised or unsupervised.

A restricted Boltzmann machine (RBM) is a generative stochastic artificial neural network that can learn a probability distribution over its set of inputs.

In machine learning, feature learning or representation learning is a set of techniques that allows a system to automatically discover the representations needed for feature detection or classification from raw data. This replaces manual feature engineering and allows a machine to both learn the features and use them to perform a specific task.

A convolutional neural network (CNN) is a regularized type of feed-forward neural network that learns features by itself via filter optimization. This type of deep learning network has been applied to process and make predictions from many different types of data including text, images and audio. Convolution-based networks are the de facto standard in deep learning-based approaches to computer vision and image processing, and have only recently been replaced, in some cases, by newer deep learning architectures such as the transformer. Vanishing gradients and exploding gradients, seen during backpropagation in earlier neural networks, are prevented by using regularized weights over fewer connections. For example, each neuron in a fully connected layer would require 10,000 weights to process an image of 100 × 100 pixels. With cascaded convolution kernels, however, only 25 weights are required per kernel to process 5 × 5 tiles. Higher-layer features are extracted from wider context windows than lower-layer features.
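The weight counts in the example above can be checked with simple arithmetic, and the weight sharing behind them can be seen in a small sliding-window routine. This is an illustrative sketch only: the function names are assumptions, and `conv2d_valid` is a plain-Python cross-correlation without padding or stride, not the API of any particular library.

```python
def dense_weights_per_neuron(height, width):
    """A fully connected neuron needs one weight per input pixel."""
    return height * width


def kernel_weights(size):
    """A square convolution kernel reuses the same size*size weights on every tile."""
    return size * size


def conv2d_valid(image, kernel):
    """Slide the kernel over the image (no padding) and sum the products per tile."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(
                image[r + i][c + j] * kernel[i][j]
                for i in range(kh)
                for j in range(kw)
            )
            for c in range(out_w)
        ]
        for r in range(out_h)
    ]


dense_count = dense_weights_per_neuron(100, 100)  # 100 * 100 = 10000
kernel_count = kernel_weights(5)                  # 5 * 5 = 25

# Tiny demonstration: a 2x2 kernel of ones sums each 2x2 tile of a 3x3 image.
small_image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
feature_map = conv2d_valid(small_image, [[1, 1], [1, 1]])
```

The same four kernel weights produce every entry of `feature_map`, which is why the parameter count depends on the kernel size rather than on the image size.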

In machine learning, a deep belief network (DBN) is a generative graphical model, or alternatively a class of deep neural network, composed of multiple layers of latent variables, with connections between the layers but not between units within each layer.

Eclipse Deeplearning4j is a programming library written in Java for the Java virtual machine (JVM). It is a framework with wide support for deep learning algorithms. Deeplearning4j includes implementations of the restricted Boltzmann machine, deep belief net, deep autoencoder, stacked denoising autoencoder and recursive neural tensor network, word2vec, doc2vec, and GloVe. These algorithms all include distributed parallel versions that integrate with Apache Hadoop and Spark.

In natural language processing (NLP), a word embedding is a representation of a word. The embedding is used in text analysis. Typically, the representation is a real-valued vector that encodes the meaning of the word in such a way that the words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using language modeling and feature learning techniques, where words or phrases from the vocabulary are mapped to vectors of real numbers.
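Closeness in the vector space is commonly measured with cosine similarity. The sketch below uses made-up three-dimensional vectors purely for illustration; the values in `toy_embeddings` are assumptions, not output from any trained model, and real embeddings typically have hundreds of dimensions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hypothetical embeddings, hand-picked so that "cat" and "dog" point in
# similar directions while "car" points elsewhere.
toy_embeddings = {
    "cat": [0.9, 0.8, 0.1],
    "dog": [0.8, 0.9, 0.2],
    "car": [0.1, 0.2, 0.9],
}

cat_dog = cosine_similarity(toy_embeddings["cat"], toy_embeddings["dog"])
cat_car = cosine_similarity(toy_embeddings["cat"], toy_embeddings["car"])
```

With these toy values `cat_dog` is larger than `cat_car`, matching the expectation that words with similar meanings lie closer together.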

Yoshua Bengio is a Canadian computer scientist, most noted for his work on artificial neural networks and deep learning. He is a professor at the Department of Computer Science and Operations Research at the Université de Montréal and scientific director of the Montreal Institute for Learning Algorithms (MILA).

Comparison of deep learning software: a table comparing notable software frameworks, libraries and computer programs for deep learning.

Ruslan Salakhutdinov is a Canadian researcher of Tatar origin working in the field of artificial intelligence.

A capsule neural network (CapsNet) is a type of artificial neural network (ANN) that can be used to better model hierarchical relationships. The approach is an attempt to more closely mimic biological neural organization.

Yee-Whye Teh is a professor of statistical machine learning in the Department of Statistics, University of Oxford. Prior to 2012 he was a reader at the Gatsby Charitable Foundation computational neuroscience unit at University College London. His work is primarily in machine learning, artificial intelligence, statistics and computer science.

The Vector Institute is a private, non-profit artificial intelligence research institute in Toronto focusing primarily on machine learning and deep learning research. As of 2023, it consists of 143 faculty members and affiliates (38 of whom are CIFAR AI chairs), 57 postdoctoral fellows, and 502 students. Along with the University of Toronto, the Vector Institute is affiliated with faculty from universities across Ontario, as well as British Columbia and Nova Scotia.

References

[1] Hinton, Geoffrey. "Aetherial Symbols". Retrieved 2017-10-09.
[2] Nicholson, Chris; Gibson, Adam. "Thought Vectors, Deep Learning & the Future of AI - Deeplearning4j: Open-source, distributed deep learning for the JVM". deeplearning4j.org. Archived from the original on 2017-02-11. Retrieved 2016-08-23.


