| Ilya Sutskever | |
| --- | --- |
| Native name | Hebrew: איליה סוצקבר; Russian: Илья Суцкевер |
| Born | Ilya Efimovich Sutskever (Russian: Илья́ Ефи́мович Суцке́вер), 8 December 1986 |
| Citizenship | Canadian, Israeli [3] |
| Alma mater | Open University of Israel; University of Toronto (BS, MS, PhD) |
| Known for | AlexNet; co-founding OpenAI; founding SSI Inc. |
| Scientific career | |
| Fields | Machine learning; neural networks; artificial intelligence; deep learning [4] |
| Institutions | University of Toronto; Google Brain; OpenAI |
| Thesis | Training Recurrent Neural Networks (2013) |
| Doctoral advisor | Geoffrey Hinton [5] [6] |
| Website | www |
Ilya Sutskever FRS (born 8 December 1986) is a Canadian-Israeli-Russian computer scientist who specializes in machine learning. [4]
Sutskever has made several major contributions to the field of deep learning. [7] [8] [9] He is notably the co-inventor, with Alex Krizhevsky and Geoffrey Hinton, of AlexNet, a convolutional neural network. [10]
Sutskever co-founded OpenAI and served as its chief scientist. [11] In 2023, he was one of the members of OpenAI's board that ousted Sam Altman from his position as CEO; Altman returned a week later, and Sutskever stepped down from the board. In June 2024, Sutskever co-founded the company Safe Superintelligence with Daniel Gross and Daniel Levy. [12] [13]
Sutskever was born into a Jewish family [14] in Nizhny Novgorod, Russia (then Gorky, Soviet Union). At the age of 5, he made aliyah with his family and lived in Jerusalem, Israel, [15] [16] until he was 16, when his family moved to Canada. [17] Sutskever attended the Open University of Israel from 2000 to 2002. [18] After moving to Canada, he attended the University of Toronto in Ontario. [18]
Sutskever received a Bachelor of Science in mathematics from the University of Toronto in 2005, [18] [19] [2] [20] a Master of Science in computer science in 2007, [19] [21] and a Doctor of Philosophy in computer science in 2013. [6] [22] [23] His doctoral supervisor was Geoffrey Hinton. [5]
In 2012, Sutskever built AlexNet in collaboration with Hinton and Alex Krizhevsky. To support AlexNet's computing demands, he bought many GTX 580 GPUs online. [24]
In 2012, Sutskever spent about two months as a postdoc with Andrew Ng at Stanford University. He then returned to the University of Toronto and joined Hinton's new research company DNNResearch, a spinoff of Hinton's research group. In 2013, Google acquired DNNResearch and hired Sutskever as a research scientist at Google Brain. [25]
At Google Brain, Sutskever worked with Oriol Vinyals and Quoc Viet Le to create the sequence-to-sequence learning algorithm, [26] and worked on TensorFlow. [27] He is also one of the AlphaGo paper's many co-authors. [28]
At the end of 2015, Sutskever left Google to become co-founder and chief scientist of the newly founded organization OpenAI. [29] [30] [31]
Sutskever is considered to have played a key role in the development of ChatGPT. [32] [33] In 2023, he announced that he would co-lead OpenAI's new "Superalignment" project, which aimed to solve the alignment of superintelligent AI systems within four years. He wrote that even if superintelligence seems far off, it could arrive within the decade. [34]
Sutskever was formerly one of the six board members of the nonprofit entity that controls OpenAI. [35] On November 17, 2023, the board fired Sam Altman, saying that "he was not consistently candid in his communications with the board". [36] The Information speculated that the decision was partly driven by conflict over the extent to which the company should commit to AI safety. [37] In an all-hands company meeting shortly after the board meeting, Sutskever said that firing Altman was "the board doing its duty", [38] but the next week he expressed regret at having participated in Altman's ouster. [39] Altman's firing and the subsequent resignation of president Greg Brockman led three senior researchers to resign from OpenAI. [40] Sutskever then stepped down from the OpenAI board [41] and was thereafter absent from OpenAI's offices; some sources suggested he was leading the Superalignment team remotely, while others said he no longer had access to the team's work. [42]
In May 2024, Sutskever announced his departure from OpenAI to focus on a new project that was "very personally meaningful" to him. His decision followed a turbulent period at OpenAI marked by leadership crises and internal debates about the direction of AI development and alignment protocols. Jan Leike, the other leader of the superalignment project, announced his departure hours later, citing an erosion of safety and trust in OpenAI's leadership. [43]
In June 2024, Sutskever announced Safe Superintelligence Inc., a new company he founded with Daniel Gross and Daniel Levy with offices in Palo Alto and Tel Aviv. [44] In contrast to OpenAI, which releases revenue-generating products, Sutskever said the new company's "first product will be the safe superintelligence, and it will not do anything else up until then". [13] In September 2024, the company announced that it had raised $1 billion from venture capital firms including Andreessen Horowitz, Sequoia Capital, DST Global, and SV Angel. [45]
In an October 2024 interview after winning the Nobel Prize in Physics, Geoffrey Hinton expressed support for Sutskever's decision to fire Altman, emphasizing concerns about AI safety. [46] [47]