Roberto Navigli

Alma mater: Sapienza University of Rome
Awards: ACL Fellow (2023), EurAI Fellow (2024)
Scientific career
Fields: Natural language processing, artificial intelligence
Institutions: Sapienza University of Rome
Thesis: Structural Semantic Interconnections: a Knowledge-Based WSD Algorithm, its Evaluation and Applications (2007)
Doctoral advisor: Paola Velardi
Website: www.diag.uniroma1.it/navigli

Roberto Navigli (born 1978) is an Italian computer scientist and Professor in the Department of Computer, Control and Management Engineering "Antonio Ruberti" at the Sapienza University of Rome, [3] where he is also the Director of the Sapienza NLP Group. [4] His research focuses on Artificial Intelligence, specifically on enabling computers to understand and represent meaning across hundreds of languages, and he has contributed to several areas of Natural Language Processing, including Word Sense Disambiguation, Entity Linking, Semantic Role Labeling and semantic parsing. [1] He created BabelNet, a multilingual knowledge graph that brings together knowledge from resources including WordNet, Wikipedia, Wiktionary and Wikidata. At the core of his research lies the goal of making semantic representations of words and sentences independent of the language in which they are written. More recently, he has focused on Large Language Models, leading the Minerva LLM project, [5] the first Italian effort to pretrain an LLM from scratch. [6]

Education

Navigli obtained his Master of Science degree in Computer Science in 2001 at Sapienza University of Rome, followed, in 2007, by a PhD from the same institution, under the supervision of Paola Velardi. [2] Navigli's doctoral thesis focused on devising and evaluating an innovative knowledge-based algorithm for Word Sense Disambiguation, named Structural Semantic Interconnections. [7]

Career and research

During and after his doctorate, Navigli was a Visiting Research Fellow and Visiting Professor [8] at the School of Informatics, University of Edinburgh, working with Prof. Mirella Lapata; the University of Sussex, invited by Dr. Diana McCarthy; the University of Wolverhampton, with Prof. Ruslan Mitkov; and the Center for Advanced Studies of Ludwig Maximilian University of Munich, invited by Prof. Hinrich Schütze. [9] He then held academic positions as a researcher, and later as Associate and Full Professor, at the Sapienza University of Rome, where he established the Sapienza NLP group. [10] Between 2017 and 2023, Navigli served as a member of the ERC Starting Grant panel for Computer Science and Informatics (PE6). [11]

Navigli was granted a European Research Council (ERC) Starting Grant [12] to fund his work on the creation of BabelNet and on multilingual Word Sense Disambiguation, most notably Babelfy. He subsequently received an ERC Consolidator Grant [13] to work on sentence-level, language-independent semantic representations, leading to the BabelNet Meaning Representation and its semantic parser, with the goal of creating 'the DNA of language'. [14] These two grants, totalling €2.9 million, have been highlighted among the 15 projects through which the ERC transformed science. [15]

In 2016, Navigli founded Babelscape, [16] a successful university spinoff company focused on multilingual neuro-symbolic Natural Language Understanding. [17]

Awards

Selected publications

Related Research Articles

Natural language processing (NLP) is a subfield of computer science and especially artificial intelligence. It is primarily concerned with providing computers with the ability to process data encoded in natural language and is thus closely related to information retrieval, knowledge representation and computational linguistics, a subfield of linguistics. Data is typically collected in text corpora and processed using rule-based, statistical or neural approaches from machine learning and deep learning.

WordNet: Computational lexicon of English

WordNet is a lexical database that links words through semantic relations, including synonyms, hyponyms, and meronyms. The synonyms are grouped into synsets with short definitions and usage examples. It can thus be seen as a combination and extension of a dictionary and thesaurus. While it is accessible to human users via a web browser, its primary use is in automatic text analysis and artificial intelligence applications. It was first created in the English language, and the English WordNet database and software tools have been released under a BSD-style license and are freely available for download from the WordNet website. There are now WordNets in more than 200 languages.

Word-sense disambiguation is the process of identifying which sense of a word is meant in a sentence or other segment of context. In human language processing and cognition, it is usually subconscious.

Semantic similarity is a metric defined over a set of documents or terms, where the idea of distance between items is based on the likeness of their meaning or semantic content as opposed to lexicographical similarity. These are mathematical tools used to estimate the strength of the semantic relationship between units of language, concepts or instances, through a numerical description obtained according to the comparison of information supporting their meaning or describing their nature. The term semantic similarity is often confused with semantic relatedness. Semantic relatedness includes any relation between two terms, while semantic similarity only includes "is a" relations. For example, "car" is similar to "bus", but is also related to "road" and "driving".

Eugene Charniak was a professor of computer science and cognitive science at Brown University. He held an A.B. in Physics from the University of Chicago and a Ph.D. in Computer Science from M.I.T. His research was in the area of language understanding and related technologies, such as knowledge representation, reasoning under uncertainty, and learning. From the early 1990s he was interested in statistical techniques for language understanding. His research in this area included work in the subareas of part-of-speech tagging, probabilistic context-free grammar induction, and, more recently, syntactic disambiguation through word statistics, efficient syntactic parsing, and lexical resource acquisition through statistical means.

Yorick Wilks: British computer scientist (1939–2023)

Yorick Alexander Wilks FBCS was a British computer scientist. He was an emeritus professor of artificial intelligence at the University of Sheffield, visiting professor of artificial intelligence at Gresham College, senior research fellow at the Oxford Internet Institute, senior scientist at the Florida Institute for Human and Machine Cognition, and a member of the Epiphany Philosophers.

SemEval is an ongoing series of evaluations of computational semantic analysis systems; it evolved from the Senseval word sense evaluation series. The evaluations are intended to explore the nature of meaning in language. While meaning is intuitive to humans, transferring those intuitions to computational analysis has proved elusive.

Dragomir R. Radev was an American computer scientist who was a professor at Yale University, working on natural language processing and information retrieval. He also served as a University of Michigan computer science professor and Columbia University computer science adjunct professor, as well as a Member of the Advisory Board of Lawyaw.

The Allen Institute for AI (AI2) is a 501(c)(3) non-profit research institute founded in 2014 by the late Microsoft co-founder and philanthropist Paul Allen. The institute seeks to conduct high-impact AI research and engineering in service of the common good. Oren Etzioni was appointed by Paul Allen in September 2013 to direct the research at the institute. After leading the organization for nine years, Etzioni stepped down from his role as CEO on September 30, 2022. He was replaced in an interim capacity by Peter Clark, the lead researcher of the institute's Aristo project, and the board formed a search committee for a new CEO. On June 20, 2023, AI2 announced Ali Farhadi as its next CEO, starting July 31, 2023. AI2 also has an active office in Tel Aviv, Israel.

BabelNet: Multilingual lexical-semantic knowledge graph and encyclopedic dictionary

BabelNet is a multilingual lexical-semantic knowledge graph, ontology and encyclopedic dictionary developed at the NLP group of the Sapienza University of Rome under the supervision of Roberto Navigli. BabelNet was automatically created by linking Wikipedia to the most popular computational lexicon of the English language, WordNet. The integration is done using an automatic mapping and by filling in lexical gaps in resource-poor languages by using statistical machine translation. The result is an encyclopedic dictionary that provides concepts and named entities lexicalized in many languages and connected with large amounts of semantic relations. Additional lexicalizations and definitions are added by linking to free-license wordnets, OmegaWiki, the English Wiktionary, Wikidata, FrameNet, VerbNet and others. Similarly to WordNet, BabelNet groups words in different languages into sets of synonyms, called Babel synsets. For each Babel synset, BabelNet provides short definitions in many languages harvested from both WordNet and Wikipedia.

In natural language processing, a word embedding is a representation of a word. The embedding is used in text analysis. Typically, the representation is a real-valued vector that encodes the meaning of the word in such a way that the words that are closer in the vector space are expected to be similar in meaning. Word embeddings can be obtained using language modeling and feature learning techniques, where words or phrases from the vocabulary are mapped to vectors of real numbers.

Dan Roth: Professor of Computer Science at University of Pennsylvania

Dan Roth is the Eduardo D. Glandt Distinguished Professor of Computer and Information Science at the University of Pennsylvania and the Chief AI Scientist at Oracle. Until June 2024, Roth was a VP/Distinguished Scientist at AWS AI, where for three years he led the scientific effort behind the first-generation Generative AI products from AWS, including Titan Models, Amazon Q efforts, and Bedrock, from inception until they became generally available.

In natural language processing, linguistics, and neighboring fields, Linguistic Linked Open Data (LLOD) describes a method and an interdisciplinary community concerned with creating, sharing, and (re-)using language resources in accordance with Linked Data principles. The Linguistic Linked Open Data Cloud was conceived and is being maintained by the Open Linguistics Working Group (OWLG) of the Open Knowledge Foundation, and has since been a focal point of activity for several W3C community groups, research projects, and infrastructure efforts.

Pascale Fung: Professor

Pascale Fung (馮雁) is a professor in the Department of Electronic & Computer Engineering and the Department of Computer Science & Engineering at the Hong Kong University of Science & Technology (HKUST). She is the director of the Centre for AI Research (CAiRE) at HKUST. She is an elected Fellow of the Institute of Electrical and Electronics Engineers (IEEE) for her “contributions to human-machine interactions”, an elected Fellow of the International Speech Communication Association for “fundamental contributions to the interdisciplinary area of spoken language human-machine interactions” and an elected Fellow of the Association for Computational Linguistics (ACL) for her “significant contributions toward statistical NLP, comparable corpora, and building intelligent systems that can understand and empathize with humans”.

Rada Mihalcea: American computer scientist

Rada Mihalcea is the Janice M. Jenkins Collegiate Professor of Computer Science and Engineering at the University of Michigan. She has made contributions to natural language processing, multimodal processing, and computational social science. With Paul Tarau, she is the co-inventor of the TextRank algorithm, which is widely used for text summarization.

Mirella Lapata is a computer scientist and Professor in the School of Informatics at the University of Edinburgh. Working on the general problem of extracting semantic information from large bodies of text, Lapata develops computer algorithms and models in the field of natural language processing (NLP).

Yejin Choi is Wissner-Slivka Chair of Computer Science at the University of Washington. Her research considers natural language processing and computer vision.

Mona Talat Diab is a computer science professor and director of Carnegie Mellon University's Language Technologies Institute. Previously, she was a professor at George Washington University and a research scientist with Facebook AI. Her research focuses on natural language processing, computational linguistics, cross lingual/multilingual processing, computational socio-pragmatics, Arabic language processing, and applied machine learning.

Ellen Riloff is an American computer scientist currently serving as a professor at the School of Computing at the University of Utah. Her research focuses on natural language processing and computational linguistics, specifically information extraction, sentiment analysis, semantic class induction, and bootstrapping methods that learn from unannotated texts.

Giuseppe De Giacomo is an Italian computer scientist. He is a Professor of Computer Science at the Department of Computer Science, University of Oxford, and Professor of Computer Engineering at the Department of Computer, Control and Management Engineering, Sapienza University of Rome. He is also a Senior Research Fellow at Green Templeton College.

References

  1. Roberto Navigli publications indexed by Google Scholar
  2. "Roberto Navigli's institutional page - Publications".
  3. "Official DIAG Sapienza Page" (in Italian). Retrieved 2024-11-03.
  4. "Sapienza NLP Page" (in Italian). Retrieved 2024-11-03.
  5. "Minerva". Retrieved 2024-11-03.
  6. "Ecco Minerva, la prima famiglia di LLM addestrati da zero in italiano" (in Italian). 23 April 2024. Retrieved 2024-11-03.
  7. Navigli, Roberto; Velardi, Paola (2005). "Structural Semantic Interconnections: A Knowledge-Based Approach to Word Sense Disambiguation". IEEE Transactions on Pattern Analysis and Machine Intelligence. 27 (7): 1075–1086. doi:10.1109/TPAMI.2005.149. PMID 16013755. Retrieved 2024-11-03.
  8. "Roberto Navigli's institutional page - CV".
  9. "Visiting fellow". Retrieved 2024-11-03.
  10. "Sapienza NLP page".
  11. "ERC Starting Grant Panels 2023" (PDF); "ERC Starting Grant Panels 2021" (PDF); "ERC Starting Grant Panels 2019" (PDF); "ERC Starting Grant Panels 2017" (PDF). All retrieved 2024-11-03.
  12. "MultiJEDI on CORDIS". CORDIS. Retrieved 2024-08-24.
  13. "MOUSSE". CORDIS. Retrieved 2024-08-24.
  14. "Project breaks new grounds in AI to create 'DNA of language'".
  15. "How the ERC transformed science". Retrieved 2024-08-24.
  16. "Babelscape - about".
  17. "Dalla ricerca arriva l'IA tutta made in Italy" (in Italian). 24 October 2024. Retrieved 2024-11-06.
  18. "Current EurAI Fellows".
  19. "EurAI Fellow motivation on X". Retrieved 2024-11-03.
  20. "ACL 2024 best paper awards".
  21. "Current ACL Fellows".
  22. "AIJ Awards: List of Current and Previous Winners".
  23. "ACL 2023 best paper awards".
  24. "Sapienza DIAG ACL 2022 best resource paper announcement".
  25. "NAACL 2021 best paper awards". 2 June 2021.
  26. "META Prize page". Archived from the original on 2023-03-06.
  27. "Marco Somalvico awards page".
  28. "Marco Cadoli awards page".