Ari Holtzman

Ari Holtzman is a professor of computer science at the University of Chicago and an expert in natural language processing and computational linguistics. [1] Previously, Holtzman was a PhD student at the University of Washington, where he was advised by Luke Zettlemoyer.

In 2017, he was a member of the winning team for the inaugural Alexa Prize, awarded for developing a conversational AI system for the Amazon Alexa device. [2] Holtzman has made multiple contributions to text generation and language modeling, including the introduction of nucleus sampling in 2019, work on AI safety and neural fake news detection, and the fine-tuning of quantized large language models. [3] [4]
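
Nucleus sampling, also known as top-p sampling, draws the next token from the smallest set of candidates whose cumulative probability exceeds a threshold p, cutting off the unreliable low-probability tail of the distribution. A minimal NumPy sketch of the idea follows; the function name and toy distribution are illustrative, not the paper's reference implementation:

    import numpy as np

    def nucleus_sample(probs, p=0.9, rng=None):
        """Sample a token index from the smallest set of tokens whose
        cumulative probability mass reaches p (the "nucleus")."""
        if rng is None:
            rng = np.random.default_rng()
        order = np.argsort(probs)[::-1]                    # tokens, most probable first
        cumulative = np.cumsum(probs[order])
        cutoff = int(np.searchsorted(cumulative, p)) + 1   # smallest prefix with mass >= p
        nucleus = order[:cutoff]
        renormalized = probs[nucleus] / probs[nucleus].sum()  # renormalize inside the nucleus
        return rng.choice(nucleus, p=renormalized)

    # Toy next-token distribution over a five-token vocabulary.
    token = nucleus_sample(np.array([0.5, 0.25, 0.15, 0.07, 0.03]), p=0.9)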

Related Research Articles

University of Washington: Public research university in Seattle, Washington, United States

The University of Washington is a public research university in Seattle, Washington. Founded on November 4, 1861, as the Territorial University of Washington, it is one of the oldest universities on the West Coast; it was established in Seattle approximately a decade after the city's founding.

Maxim Kontsevich: Russian and French mathematician (born 1964)

Maxim Lvovich Kontsevich is a Russian and French mathematician and mathematical physicist. He is a professor at the Institut des Hautes Études Scientifiques and a distinguished professor at the University of Miami. He received the Henri Poincaré Prize in 1997, the Fields Medal in 1998, the Crafoord Prize in 2008, the Shaw Prize and Breakthrough Prize in Fundamental Physics in 2012, and the Breakthrough Prize in Mathematics in 2015.

Chatbot: Program that simulates conversation

A chatbot is a software application or web interface that aims to mimic human conversation through text or voice interactions. Modern chatbots are typically online and use artificial intelligence (AI) systems capable of maintaining a conversation with a user in natural language and simulating the way a human would behave as a conversational partner. Such technologies often utilize aspects of deep learning and natural language processing, but simpler chatbots have existed for decades.

The Language Technologies Institute (LTI) is a research institute at Carnegie Mellon University in Pittsburgh, Pennsylvania, United States, focused on language technologies. The institute is home to 33 faculty members whose primary research spans machine translation, speech recognition, speech synthesis, information retrieval, parsing, information extraction, and multimodal machine learning. Established in 1986 as the Center for Machine Translation, it began awarding degrees in 1996 and was renamed the Language Technologies Institute. The institute was founded by Professor Jaime Carbonell, who served as director until his death in February 2020; he was followed by Jamie Callan and then Carolyn Rosé as interim directors. In August 2023, Mona Diab became the director of the institute.

Ashwin Ram is an Indian-American computer scientist. He was chief innovation officer at PARC from 2011 to 2016, has published books and scientific articles, and has helped start at least two companies.

Virtual assistant: Software agent

A virtual assistant (VA) is a software agent that can perform a range of tasks or services for a user based on user input such as commands or questions, including verbal ones. Such technologies often incorporate chatbot capabilities to simulate human conversation, such as via online chat, to facilitate interaction with their users. The interaction may occur via text, a graphical interface, or voice, as some virtual assistants are able to interpret human speech and respond with synthesized voices.

University of Wisconsin–Madison: Public university in Madison, Wisconsin, United States

The University of Wisconsin–Madison is a public land-grant research university in Madison, Wisconsin. UW–Madison serves as the official state university of Wisconsin and the flagship campus of the University of Wisconsin System, while also earning recognition as a "Public Ivy". Founded in 1848 when Wisconsin achieved statehood, UW–Madison was the first public university established in Wisconsin and remains the oldest and largest public university in the state. UW–Madison became a land-grant institution in 1866. The 933-acre (378 ha) main campus, located on the shores of Lake Mendota, includes four National Historic Landmarks. The university also owns and operates the 1,200-acre (486 ha) University of Wisconsin–Madison Arboretum, located 4 miles (6.4 km) south of the main campus, which is also a National Historic Landmark.

Jeff Dean: American computer scientist and software engineer

Jeffrey Adgate "Jeff" Dean is an American computer scientist and software engineer. Since 2018, he has been the lead of Google AI. He was appointed Google's chief scientist in 2023 after a reorganization of Google's AI-focused groups.

Semantic Scholar is an artificial intelligence–powered research tool for scientific literature developed at the Allen Institute for AI and publicly released in November 2015. It uses advances in natural language processing to provide summaries for scholarly papers. The Semantic Scholar team is actively researching the use of artificial intelligence in natural language processing, machine learning, human–computer interaction, and information retrieval.

OpenAI: Artificial intelligence research organization

OpenAI is an American artificial intelligence (AI) research laboratory consisting of the non-profit OpenAI, Inc. and its for-profit subsidiary corporation OpenAI, L.P. OpenAI conducts research on artificial intelligence with the declared intention of developing "safe and beneficial" artificial general intelligence, which it defines as "highly autonomous systems that outperform humans at most economically valuable work".

Amazon Alexa, also known simply as Alexa, is a virtual assistant technology largely based on Ivona, a Polish speech synthesizer acquired by Amazon in 2013. It was first used in the Amazon Echo smart speaker and the Echo Dot, Echo Studio, and Amazon Tap speakers developed by Amazon Lab126. It is capable of voice interaction, music playback, making to-do lists, setting alarms, streaming podcasts, playing audiobooks, and providing weather, traffic, sports, and other real-time information such as news. Alexa can also control several smart devices, acting as a home automation system. Users can extend Alexa's capabilities by installing "skills" such as weather programs and audio features. It performs these tasks using automatic speech recognition, natural language processing, and other forms of weak AI.

Amazon Lex is a service for building conversational interfaces into any application using voice and text. It powers the Amazon Alexa virtual assistant. In April 2017, the platform was released to the developer community, with Amazon suggesting it could be used for conversational interfaces on the web and in mobile apps, robots, toys, drones, and more. Amazon had already launched Alexa Voice Services, which developers can use to integrate Alexa into their own devices, such as smart speakers and alarm clocks; Lex, however, does not require that end users interact with the Alexa assistant per se, but rather with any type of assistant or interface. As of February 2018, users can define a response for Amazon Lex chatbots directly from the AWS management console.

Mari Ostendorf is a professor of electrical engineering in the area of speech and language technology and the vice provost for research at the University of Washington.

GPT-3: 2020 large language model

Generative Pre-trained Transformer 3 (GPT-3) is a large language model released by OpenAI in 2020. Like its predecessor GPT-2, it is a decoder-only transformer neural network that uses attention in place of earlier recurrence- and convolution-based architectures. Attention mechanisms allow the model to selectively focus on the segments of input text it predicts to be most relevant. It uses a 2048-token context window and a then-unprecedented 175 billion parameters, requiring 800 GB to store. The model demonstrated strong zero-shot and few-shot learning on many tasks.
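
The attention operation at the heart of such transformer models can be sketched in a few lines of NumPy. This is an illustrative single-head version with the causal mask used by decoder-only models (each position may only attend to earlier positions), not OpenAI's implementation:

    import numpy as np

    def causal_attention(Q, K, V):
        # Score how strongly each query position matches each key position.
        scores = Q @ K.T / np.sqrt(Q.shape[-1])
        # Causal mask: a position may not attend to later (future) tokens.
        T = scores.shape[0]
        scores[np.triu(np.ones((T, T), dtype=bool), k=1)] = -np.inf
        # Softmax over key positions, then mix the value vectors accordingly.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        return weights @ V

    # Toy example: a context of 4 tokens and an 8-dimensional attention head.
    rng = np.random.default_rng(0)
    Q = K = V = rng.standard_normal((4, 8))
    out = causal_attention(Q, K, V)  # shape (4, 8)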

GPT-2: 2019 text-generating language model

Generative Pre-trained Transformer 2 (GPT-2) is a large language model by OpenAI and the second in their foundational series of GPT models. While its predecessor GPT-1 was trained on BookCorpus, a dataset of over 7,000 unpublished fiction books from various genres, GPT-2 was pre-trained on WebText, a dataset of 8 million web pages. It was partially released in February 2019, followed by the full release of the 1.5-billion-parameter model on November 5, 2019.

Midjourney: Image-generating machine learning model

Midjourney is a generative artificial intelligence program and service created and hosted by the San Francisco-based independent research lab Midjourney, Inc. Midjourney generates images from natural language descriptions, called "prompts", similar to OpenAI's DALL-E and Stability AI's Stable Diffusion.

Hugging Face, Inc. is a French-American company that develops tools for building applications using machine learning. It is most notable for its transformers library built for natural language processing applications and its platform that allows users to share machine learning models and datasets.

ChatGPT: AI chatbot developed by OpenAI

ChatGPT, which stands for Chat Generative Pre-trained Transformer, is a large language model-based chatbot developed by OpenAI and launched on November 30, 2022, notable for enabling users to refine and steer a conversation towards a desired length, format, style, level of detail, and language. Successive prompts and replies are taken into account as context at each stage of the conversation; the practice of crafting such prompts to steer the model's output is known as prompt engineering.
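
The role of conversational context can be sketched as follows: each new prompt is appended to the running message history, and the model generates its reply conditioned on that entire history, which is what lets later turns refine earlier ones. In this illustrative sketch, generate_reply is a hypothetical stand-in for the underlying model, not an actual API:

    def chat_turn(history, user_prompt, generate_reply):
        # The new prompt joins the full history, so prior turns act as context.
        history.append({"role": "user", "content": user_prompt})
        reply = generate_reply(history)  # hypothetical model call over the whole history
        history.append({"role": "assistant", "content": reply})
        return reply

    history = []
    chat_turn(history, "Explain nucleus sampling.", generate_reply=lambda h: "...")
    # The second turn can simply say "shorter" because turn one is still in context.
    chat_turn(history, "Now make it shorter.", generate_reply=lambda h: "...")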

Large language model: Neural network with billions of weights

A large language model (LLM) is a language model characterized by its large size. That size is made possible by AI accelerators, which can process the vast amounts of text data, mostly scraped from the Internet, used for training. The resulting artificial neural networks can contain from tens of millions up to billions of weights and are (pre-)trained using self-supervised and semi-supervised learning. The transformer architecture contributed to faster training. Alternative architectures include the mixture of experts (MoE), proposed by Google in a line of work running from sparsely-gated MoE in 2017 through GShard in 2021 to GLaM in 2022.
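
The self-supervised objective behind such pre-training can be made concrete with a short sketch: the model is scored on predicting each token from the tokens before it, so raw text supplies its own labels and no human annotation is required. An illustrative NumPy version follows; the names and shapes are assumptions, not any particular system's code:

    import numpy as np

    def next_token_loss(logits, token_ids):
        # Predictions at position t are scored against the token at position t+1.
        preds, targets = logits[:-1], token_ids[1:]
        preds = preds - preds.max(axis=-1, keepdims=True)  # numerical stability
        log_probs = preds - np.log(np.exp(preds).sum(axis=-1, keepdims=True))
        return -log_probs[np.arange(len(targets)), targets].mean()

    # Toy example: a 5-token sequence over a 10-token vocabulary.
    rng = np.random.default_rng(0)
    loss = next_token_loss(rng.standard_normal((5, 10)), np.array([3, 1, 4, 1, 5]))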

Yang Liu is a Chinese and American computer scientist specializing in speech recognition, and a principal scientist for Amazon Alexa.

References

  1. "Ari Holtzman". UChicago Faculty. Retrieved 23 August 2023.
  2. Langston, Jennifer. "UW students win Amazon's inaugural Alexa Prize for most engaging socialbot". UW News. Retrieved 23 August 2023.
  3. Ray, Tiernan. "To Catch a Fake: Machine learning sniffs out its own machine-written propaganda". ZDNet. Retrieved 23 August 2023.
  4. Dettmers, Tim; Pagnoni, Artidoro; Holtzman, Ari; Zettlemoyer, Luke (2023). "QLoRA: Efficient Finetuning of Quantized LLMs". arXiv:2305.14314 [cs.LG].