Ashish Vaswani

Born 1986 (age 38–39)
Known for Transformer (deep learning architecture)
Scientific career
Thesis Smaller, Faster, and Accurate Models for Statistical Machine Translation (2014)
Doctoral advisor
  • David Chiang
  • Liang Huang

Ashish Vaswani (born 1986) [1] is an Indian computer scientist. Vaswani conducted research at Google Brain and, earlier in his career, was affiliated with the Information Sciences Institute at the University of Southern California.

Vaswani is a co-author of the 2017 paper "Attention Is All You Need", which introduced the Transformer neural network architecture. [2] The Transformer has been used in the development of subsequent NLP models, including BERT and ChatGPT, and their successors.

Career

Vaswani completed his engineering degree in computer science at the Birla Institute of Technology, Mesra (BIT Mesra) in 2002. In 2004, he enrolled at the University of Southern California for graduate studies. [3] He earned his PhD in computer science there, supervised by David Chiang. [4] At Google, [5] Vaswani was part of the Google Brain team, where he conducted the work leading to the "Attention Is All You Need" paper. Prior to joining Google, he was affiliated with the Information Sciences Institute at the University of Southern California.

After Google, Vaswani co-founded Adept AI, a machine learning-focused startup that developed AI agents and tools for software automation. He has since left the company. [6] [7] He is currently co-founder and CEO of Essential AI.

Notable works

Vaswani's most notable paper, "Attention Is All You Need", was published in 2017. [8] The paper introduced the Transformer model, which uses self-attention mechanisms instead of recurrence for sequence-to-sequence tasks.
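
As a rough illustration of the self-attention idea described above, the following minimal sketch computes scaled dot-product attention, softmax(QK^T / sqrt(d_k)) V, in plain Python. It is not the paper's implementation: the function names are hypothetical, and the learned query, key, and value projections, multiple attention heads, and masking are all omitted for brevity.

```python
import math


def softmax(scores):
    """Turn raw scores into weights that sum to 1 (numerically stable)."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def scaled_dot_product_attention(queries, keys, values):
    """Illustrative helper (hypothetical name): one attention pass.

    queries, keys: lists of vectors of length d_k
    values: list of vectors, one per key
    Returns one output vector per query.
    """
    d_k = len(keys[0])
    outputs = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d_k).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k)
            for k in keys
        ]
        weights = softmax(scores)
        # Weighted average of the value vectors.
        out = [
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ]
        outputs.append(out)
    return outputs


# Self-attention: queries, keys, and values all come from the same sequence.
# Assumption: the token vectors are used directly, with no learned projections.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
attended = scaled_dot_product_attention(tokens, tokens, tokens)
```

Each position's output is computed from all positions at once, without stepping through the sequence in order, which is the contrast with recurrence that the paper draws.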


The Transformer architecture has become foundational to modern language models and NLP systems, including BERT (2018), [9] GPT-2 and GPT-3 (2019-2020), and more recent models such as ChatGPT, GPT-4, and GPT-5. The "Attention Is All You Need" paper is among the most cited papers in machine learning.

References

  1. Nichil, Geoffrey (November 16, 2024). "Who is Ashish Vaswani?". Synaptiks. Archived from the original on December 15, 2024.
  2. Vaswani, Ashish; Shazeer, Noam; Parmar, Niki; Uszkoreit, Jakob; Jones, Llion; Gomez, Aidan N.; Kaiser, Łukasz; Polosukhin, Illia (June 12, 2017). "Attention is All you Need" (PDF). Advances in Neural Information Processing Systems 30. arXiv:1706.03762. Wikidata Q30249683.
  3. Team, OfficeChai (February 4, 2023). "The Indian Researchers Whose Work Led To The Creation Of ChatGPT". OfficeChai.
  4. "Ashish Vaswani's webpage at ISI". www.isi.edu.
  5. "Transformer: A Novel Neural Network Architecture for Language Understanding". ai.googleblog.com. August 31, 2017.
  6. Rajesh, Ananya Mariam; Hu, Krystal (March 16, 2023). "AI startup Adept raises $350 mln in fresh funding". Reuters.
  7. Tong, Anna; Hu, Krystal (May 4, 2023). "Top ex-Google AI researchers raise funding from Thrive Capital". Reuters. Retrieved July 11, 2023.
  8. Dawson, Caitlin (March 9, 2023). "USC Alumni Paved Path for ChatGPT". USC Viterbi | School of Engineering.
  9. Devlin, Jacob; Chang, Ming-Wei; Lee, Kenton; Toutanova, Kristina (May 24, 2019). "BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding". arXiv:1810.04805 [cs.CL].