Ashish Vaswani | |
|---|---|
| Born | 1986 (age 38–39) |
| Alma mater | Birla Institute of Technology, Mesra; University of Southern California |
| Known for | Transformer (deep learning architecture) |
| Scientific career | |
| Fields | Computer science; deep learning |
| Institutions | Google Brain; Adept AI Labs |
| Thesis | Smaller, Faster, and Accurate Models for Statistical Machine Translation (2014) |
| Doctoral advisor | David Chiang |
Ashish Vaswani (born 1986) [1] is an Indian computer scientist known as a co-inventor of the Transformer architecture in deep learning. He was a co-author of the 2017 paper "Attention Is All You Need," which introduced the Transformer neural network. [2] This architecture underpins models such as BERT, GPT, ChatGPT, and their successors in natural language processing and AI.
Vaswani completed his engineering degree in computer science at BIT Mesra in 2002. In 2004, he moved to the United States for graduate study at the University of Southern California, [3] where he earned his PhD under the supervision of Prof. David Chiang. [4] He worked as a researcher at Google, [5] where he was part of the Google Brain team, and co-founded Adept AI Labs, which he has since left. [6] [7]
Vaswani's paper, "Attention Is All You Need", was published in 2017. [8] It introduced the Transformer model, which dispenses with recurrence in sequence-to-sequence tasks and relies entirely on self-attention mechanisms. The model has been instrumental in the development of several subsequent state-of-the-art models in NLP, including BERT, [9] GPT-2, and GPT-3.
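The core operation behind the Transformer's self-attention is scaled dot-product attention, which the paper defines as softmax(QKᵀ/√d_k)V. A minimal NumPy sketch of that formula is shown below; the function and variable names are illustrative, not taken from any particular implementation:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Scaled dot-product attention: softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)               # (seq_q, seq_k) similarity scores
    scores -= scores.max(axis=-1, keepdims=True)  # subtract row max for numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over the key dimension
    return weights @ V                            # attention-weighted sum of values

# Self-attention over a toy 3-token sequence with model dimension 4:
# queries, keys, and values are all derived from the same input.
rng = np.random.default_rng(0)
x = rng.standard_normal((3, 4))
out = scaled_dot_product_attention(x, x, x)
```

Because every token attends to every other token in a single matrix multiplication, the computation parallelizes across the sequence, which is what allows the Transformer to drop the step-by-step recurrence of earlier sequence models.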