Dan Roth

Alma mater Harvard University
Known for Joint Learning and Inference: ILP formulations of NLP tasks..., [1] Machine Learning for NLP, Probabilistic Reasoning
Awards ACM Fellow; IJCAI John McCarthy Award [2] [3]
Scientific career
Fields Computer Science, Machine Learning, Natural Language Processing, Automated reasoning, Information Extraction.
Institutions University of Illinois at Urbana-Champaign, University of Pennsylvania
Doctoral advisor Leslie Valiant
Website www.cis.upenn.edu/~danroth

Dan Roth is the Eduardo D. Glandt Distinguished Professor of Computer and Information Science at the University of Pennsylvania [4] and the Chief AI Scientist at Oracle. Until June 2024, Roth was a VP and Distinguished Scientist at AWS AI. During his final three years at AWS, he led the scientific effort behind the first-generation generative AI products from AWS, including the Titan models, Amazon Q, and Bedrock, from inception until they became generally available.

Roth received his B.A. summa cum laude in mathematics from the Technion, Israel, and his Ph.D. in computer science from Harvard University in 1995. [5] He taught at the University of Illinois at Urbana-Champaign from 1998 to 2017 before moving to the University of Pennsylvania. [6]

Professional career

Roth is a Fellow of the American Association for the Advancement of Science (AAAS), [7] the Association for Computing Machinery (ACM), [8] the Association for the Advancement of Artificial Intelligence (AAAI), [9] and the Association for Computational Linguistics (ACL). [10]

Roth’s research [11] focuses on the computational foundations of intelligent behavior. He develops theories and systems pertaining to intelligent behavior using a unified methodology, at the heart of which is the idea that learning has a central role in intelligence. His work centers on the study of machine learning and inference methods that facilitate natural language understanding. In doing so, he has pursued several interrelated lines of work that span multiple aspects of this problem, from fundamental questions in learning and inference and how they interact, [12] to the study of a range of natural language processing (NLP) problems and the development of advanced machine-learning-based tools for natural language applications. [13]

Roth has made seminal contributions to the fusion of learning and reasoning, [14] to machine learning with weak, incidental supervision, [15] and to machine learning and inference approaches to natural language understanding. He co-authored the first paper on zero-shot learning in natural language processing, a 2008 paper by Chang, Ratinov, Roth, and Srikumar published at AAAI’08, although the paper called the learning paradigm dataless classification. [16] Roth has worked on probabilistic reasoning (including its complexity [17] and probabilistic lifted inference [18] ), Constrained Conditional Models (ILP formulations of NLP problems) and constraints-driven learning, [19] [20] part-based (constellation) methods in object recognition, [21] and response-based learning. [22] He has developed NLP and information extraction tools that are used broadly by researchers and commercially, including tools for named-entity recognition (NER), coreference resolution, wikification, semantic role labeling (SRL), and ESL text correction. [13]

Roth is a co-founder of NexLP, Inc., a startup that applies natural language processing and machine learning in the legal and compliance domains. In 2020, NexLP was acquired by Reveal, Inc., an e-discovery software company. [23] He is currently on the scientific advisory board of the Allen Institute for AI. [24]

References

  1. Constrained Conditional Models
  2. "Welcome to IJCAI 2017!".
  3. "Roth honored with the IJCAI John McCarthy Award".
  4. "Penn Engineering - Research Directory Profile". www.seas.upenn.edu. Retrieved 2017-08-29.
  5. "Dan Roth's Webpage". Archived from the original on 2016-01-08. Retrieved 2016-01-09.
  6. "Dan Roth - Main Page". l2r.cs.uiuc.edu. Archived from the original on 2017-08-26. Retrieved 2017-08-29.
  7. AAAS List of Fellows Archived July 27, 2014, at the Wayback Machine
  8. "ACM Fellows". Archived from the original on 2016-12-01. Retrieved 2016-01-09.
  9. AAAI List of Fellows
  10. ACL Fellows
  11. Dan Roth's Publication Page
  12. R. Khardon and D. Roth, Learning to Reason, Journal of the ACM (1997)
  13. Cognitive Computation Group Demo Page
  14. D. Roth, Learning to Reason: The Approach (1996)
  15. D. Roth, Incidental Supervision, AAAI (2017)
  16. Chang, M.W. (2008). "Importance of Semantic Representation: Dataless Classification". AAAI.
  17. D. Roth, On the hardness of approximate reasoning, Artificial Intelligence (1996)
  18. R. de Salvo Braz, E. Amir and D. Roth, Lifted First-Order Probabilistic Inference, IJCAI (2005)
  19. M. Chang, L. Ratinov and D. Roth, Structured Learning with Constrained Conditional Models, Machine Learning (2012)
  20. D. Roth and W. Yih, A Linear Programming Formulation for Global Inference in Natural Language Tasks, CoNLL (2004)
  21. S. Agarwal, A. Awan and D. Roth, Learning to Detect Objects in Images via a Sparse, Part-Based Representation, IEEE Transactions on PAMI (2004)
  22. J. Clarke, D. Goldwasser, M. Chang and D. Roth, Driving Semantic Parsing from the World's Response, CoNLL (2010)
  23. "Reveal Acquires NexLP to become the leading AI-powered eDiscovery Solution" (Press release).
  24. "Scientific Advisory Board — Allen Institute for AI". allenai.org. Retrieved 2023-12-06.