| Ilya Sutskever | |
|---|---|
| Native name | Hebrew: איליה סוצקבר; Russian: Илья Суцкевер |
| Born | Ilya Efimovich Sutskever (Илья́ Ефи́мович Суцке́вер), 1985/86 [1] |
| Citizenship | Canadian, Israeli, Russian [citation needed] |
| Alma mater | Open University of Israel; University of Toronto |
| Known for | AlexNet; co-founding OpenAI; founding SSI Inc. |
| **Scientific career** | |
| Fields | Machine learning, neural networks, artificial intelligence, deep learning [4] |
| Institutions | University of Toronto, Stanford University, Google Brain, OpenAI |
| Thesis | *Training Recurrent Neural Networks* (2013) |
| Doctoral advisor | Geoffrey Hinton [5][6] |
| Website | www |
Ilya Sutskever FRS (born 1985/86 [1] ) is a computer scientist who specializes in machine learning. [4]
Sutskever has made several major contributions to the field of deep learning. [7] [8] [9] He is notably the co-inventor, with Alex Krizhevsky and Geoffrey Hinton, of AlexNet, a convolutional neural network. [10]
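The core operation of a convolutional network such as AlexNet is sliding a small learned filter across an image. A minimal NumPy sketch of that operation, using a hand-picked vertical-edge kernel as a stand-in for learned weights (not AlexNet's actual filters or architecture):

```python
import numpy as np

def conv2d(image, kernel):
    """Valid 2-D cross-correlation: the basic operation of a CNN layer."""
    kh, kw = kernel.shape
    ih, iw = image.shape
    out = np.empty((ih - kh + 1, iw - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # Elementwise product of the kernel with one image patch, summed.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# A tiny image with a vertical edge between columns 1 and 2.
img = np.array([[0, 0, 1, 1],
                [0, 0, 1, 1],
                [0, 0, 1, 1]], dtype=float)
k = np.array([[1.0, -1.0]])   # responds wherever neighboring pixels differ
edges = conv2d(img, k)        # nonzero only at the edge location
```

A real CNN stacks many such filters in layers, learns their weights by gradient descent, and interleaves them with nonlinearities and pooling; this sketch only shows the sliding-window computation itself.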
Sutskever co-founded OpenAI and served as its chief scientist. [11] In 2023, he was one of the members of OpenAI's board who fired CEO Sam Altman; Altman returned a week later, and Sutskever stepped down from the board. In June 2024, Sutskever co-founded the company Safe Superintelligence with Daniel Gross and Daniel Levy. [12] [13]
Sutskever was born in Gorky (now Nizhny Novgorod), Russia, then part of the Soviet Union. At age 5 he emigrated with his family to Israel, [14] where he lived until age 15. [15]
Sutskever attended the Open University of Israel between 2000 and 2002. [16] After that, he moved to Canada with his family and attended the University of Toronto in Ontario.
Sutskever received a Bachelor of Science in mathematics from the University of Toronto in 2005, [16] [17] [3] [18] a Master of Science in computer science in 2007, [17] [19] and a Doctor of Philosophy in computer science in 2013. [6] [20] [21] His doctoral supervisor was Geoffrey Hinton. [5]
In 2012, Sutskever built AlexNet in collaboration with Hinton and Alex Krizhevsky. To support AlexNet's computing demands, he bought many GTX 580 GPUs online. [22]
In 2012, Sutskever spent about two months as a postdoc with Andrew Ng at Stanford University. He then returned to the University of Toronto and joined Hinton's new research company DNNResearch, a spinoff of Hinton's research group. In 2013, Google acquired DNNResearch and hired Sutskever as a research scientist at Google Brain. [23]
At Google Brain, Sutskever worked with Oriol Vinyals and Quoc Viet Le to create the sequence-to-sequence learning algorithm, [24] and worked on TensorFlow. [25] He is also one of the AlphaGo paper's many co-authors. [26]
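The sequence-to-sequence idea can be illustrated with a toy encoder-decoder: one recurrent network folds the input sequence into a fixed-size vector, and a second network generates output tokens from it. The sketch below uses plain NumPy with random, untrained weights (the published model used multilayer LSTMs), so it shows only the data flow, not a working translator:

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 10, 8

# Random toy parameters; a trained model would learn all of these.
E = rng.normal(size=(VOCAB, HIDDEN)) * 0.1       # token embeddings
W_enc = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1  # encoder recurrence
W_dec = rng.normal(size=(HIDDEN, HIDDEN)) * 0.1  # decoder recurrence
W_out = rng.normal(size=(HIDDEN, VOCAB)) * 0.1   # hidden state -> vocabulary

def encode(tokens):
    """Fold the whole input sequence into one fixed-size state vector."""
    h = np.zeros(HIDDEN)
    for t in tokens:
        h = np.tanh(E[t] + W_enc @ h)
    return h

def decode(h, steps, start=0):
    """Emit output tokens one at a time, conditioned on the encoder state."""
    out, tok = [], start          # `start` plays the start-of-sequence role
    for _ in range(steps):
        h = np.tanh(E[tok] + W_dec @ h)
        tok = int(np.argmax(h @ W_out))  # greedy choice of the next token
        out.append(tok)
    return out

translation = decode(encode([3, 1, 4, 1, 5]), steps=4)
```

The key design point is the fixed-size intermediate vector: it lets input and output sequences have different lengths, which is what makes the scheme suitable for tasks such as machine translation.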
At the end of 2015, Sutskever left Google to become co-founder and chief scientist of the newly founded organization OpenAI. [27] [28] [29]
Sutskever is considered to have played a key role in the development of ChatGPT. [30] [31] In 2023, he announced that he would co-lead OpenAI's new "Superalignment" project, which aimed to solve the alignment of superintelligent AI within four years. He wrote that even if superintelligence seems far off, it could arrive within the decade. [32]
Sutskever was formerly one of the six board members of the nonprofit entity that controls OpenAI. [33] The Information speculated that Sam Altman's firing resulted in part from a conflict over the extent to which the company should commit to AI safety. [34] In an all-hands company meeting shortly after the board meeting, Sutskever said that firing Altman was "the board doing its duty", [35] but the next week, he expressed regret at having participated in Altman's ouster. [36] Altman's firing and Greg Brockman's resignation led three senior researchers to resign from OpenAI. [37] Sutskever subsequently stepped down from the OpenAI board [38] and was thereafter absent from OpenAI's office. Some sources suggested he was leading the Superalignment team remotely, while others said he no longer had access to the team's work and could not lead it. [39]
In May 2024, Sutskever announced his departure from OpenAI to focus on a new project that was "very personally meaningful" to him. His decision followed a turbulent period at OpenAI marked by leadership crises and internal debates about the direction of AI development and alignment protocols. Jan Leike, the other leader of the superalignment project, announced his departure hours later, citing an erosion of safety and trust in OpenAI's leadership. [40]
In June 2024, Sutskever announced Safe Superintelligence Inc., a new company he founded with Daniel Gross and Daniel Levy with offices in Palo Alto and Tel Aviv. [41] In contrast to OpenAI, which releases revenue-generating products, Sutskever said the new company's "first product will be the safe superintelligence, and it will not do anything else up until then". [13] In September 2024, the company announced that it raised $1 billion from venture capital firms including Andreessen Horowitz, Sequoia Capital, DST Global and SV Angel. [42]