Preslav Nakov

Born: 26 January 1977, Veliko Turnovo, Bulgaria
Nationality: Bulgarian
Alma mater: University of California, Berkeley (PhD in Computer Science); Sofia University (MSc in Computer Science)
Known for: Natural language processing; detecting fake news online; sentiment analysis
Fields: Computer science
Institutions: Qatar Computing Research Institute; National University of Singapore; Sofia University; Bulgarian Academy of Sciences; University of California, Berkeley; University College London
Thesis: Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics (2007)
Doctoral advisor: Marti Hearst

Preslav Nakov (born 26 January 1977 in Veliko Turnovo, Bulgaria) is a computer scientist who works on natural language processing. He is particularly known for his research on fake news detection,[1] automatic detection of offensive language,[2] and biomedical text mining.[3] Nakov obtained a PhD in computer science from the University of California, Berkeley, under the supervision of Marti Hearst. He was the first recipient of the John Atanasov Presidential Award, given by the President of Bulgaria for achievements in the development of the information society.[4]

Education

Preslav Nakov grew up in Veliko Turnovo, Bulgaria, where he attended primary and secondary school, obtaining a Diploma in Mathematics from the Secondary School of Mathematics and Natural Sciences 'Vassil Drumev' in 1996. He then obtained an MSc degree in Informatics (Computer Science), with specialisations in Artificial Intelligence and in Information and Communication Technologies, from Sofia University in 2001. During his MSc studies, he worked as a teaching assistant at Sofia University and the Bulgarian Academy of Sciences, and was a guest lecturer at University College London during a visit in spring 1999. He subsequently enrolled in the PhD programme at the Department of Electrical Engineering and Computer Sciences, University of California, Berkeley, supported in part by a Fulbright Scholarship. Under the supervision of Marti Hearst, he wrote a thesis on mining text from the Web, and graduated with a PhD in Computer Science from UC Berkeley in 2007.[5]

Career

After graduating from the University of California, Berkeley, Nakov worked as a Research Fellow at the National University of Singapore. Since 2012, he has been a Senior Scientist at the Qatar Computing Research Institute (QCRI). He also holds a position as an honorary lecturer at Sofia University.

Research

Preslav Nakov works in the areas of natural language processing and text mining, and has published over 300 peer-reviewed research papers.[6] His early research was on lexical semantics and text mining; he published influential papers on biomedical text mining, most prominently on methods for identifying citation sentences in biomedical papers.[3] He is best known, however, for his research on fake news detection, such as his work on predicting the factuality and bias of news sources,[1] and for his research on the automatic detection of offensive language.[2] From 2015 to 2017, Nakov also led the organisation of a popular evaluation campaign on sentiment analysis systems as part of SemEval.[7] He currently coordinates the Tanbih News Aggregator project, a large project with partners at the Qatar Computing Research Institute and the MIT Computer Science and Artificial Intelligence Laboratory, which aims to uncover stance, bias and propaganda in news.[8]

Selected honors and distinctions

John Atanasov Presidential Award of the President of the Republic of Bulgaria (2013)[4]
RANLP-2011 Young Researcher Award (2011)[9]
Best paper award nominee at CIKM (2020)[10]

References

  1. Baly, Ramy; Karadzhov, Georgi; Alexandrov, Dimitar; Glass, James; Nakov, Preslav (2018-11-01). "Predicting Factuality of Reporting and Bias of News Media Sources". Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing. Brussels, Belgium: Association for Computational Linguistics. pp. 3528–3539.
  2. Zampieri, Marcos; Malmasi, Shervin; Nakov, Preslav; Rosenthal, Sara; Farra, Noura; Kumar, Ritesh (2019-06-01). "Predicting the Type and Target of Offensive Posts in Social Media". Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers). Minneapolis, Minnesota: Association for Computational Linguistics. pp. 1415–1420.
  3. Nakov, Preslav; Schwartz, Ariel; Hearst, Marti (2004-07-25). "Citances: Citation Sentences for Semantic Analysis of Bioscience Text". Proceedings of SIGIR (International ACM SIGIR Conference on Research and Development in Information Retrieval). ACM. pp. 81–88. CiteSeerX 10.1.1.59.2666.
  4. "John Atanasov Presidential Award -- Preslav Nakov (2013)". Administration of the President of the Republic of Bulgaria. 2013. Retrieved February 21, 2021.
  5. Nakov, Preslav (2007). Using the Web as an Implicit Training Set: Application to Noun Compound Syntax and Semantics (PhD thesis). University of California, Berkeley.
  6. "Preslav Nakov - Google Scholar Citations". scholar.google.com. Retrieved February 21, 2021.
  7. Rosenthal, Sara; Farra, Noura; Nakov, Preslav (2017-08-01). "SemEval-2017 Task 4: Sentiment Analysis in Twitter". Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017). Association for Computational Linguistics. pp. 502–518.
  8. "TANBIH News Aggregator". Retrieved February 21, 2021.
  9. "RANLP-2011 Young Researcher Award" (PDF). 2011. Retrieved February 21, 2021.
  10. "CIKM -- Best Papers Nominees". 2020. Retrieved February 21, 2021.