Diana Maynard is a British computer scientist and computational linguist who works as a senior research fellow in the Natural Language Processing Group of the Department of Computer Science at the University of Sheffield.
Maynard is originally from Chertsey. She was educated in Manchester, earning a bachelor's degree in 1995 at the University of Manchester Institute of Science and Technology, a master's degree from the University of Manchester in 1996, and a Ph.D. from Manchester Metropolitan University in 2000,[1] completed despite becoming nearly blind from complications of childhood diabetes during her graduate studies.[2]
Maynard has been a researcher associated with the General Architecture for Text Engineering (GATE) project at Sheffield since 2000.[1] Her research with the project includes the development of the Java Annotation Patterns Engine (JAPE) for using regular expressions to process annotations, as well as research on information extraction and sentiment analysis.
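The core idea behind JAPE, matching regular-expression-style patterns over annotation sequences rather than raw characters, can be illustrated with a minimal Python sketch. The annotation labels, example sentence, and the "Person" rule below are invented for illustration and do not reflect GATE's actual grammar syntax or API.

```python
import re

# Invented annotation stream: each item is (label, covered text).
# In a real pipeline these would be produced by earlier processing stages.
annotations = [
    ("Token", "Dr."),
    ("FirstName", "Diana"),
    ("Surname", "Maynard"),
    ("Token", "works"),
]

# Encode the label sequence as a string so an ordinary regular
# expression can match over annotations instead of characters.
label_stream = " ".join(label for label, _ in annotations)

# Hypothetical rule: a FirstName followed by a Surname forms a Person.
pattern = re.compile(r"FirstName Surname")
match = pattern.search(label_stream)
if match:
    print("Matched Person pattern:", match.group(0))
```

The key design point is the level of abstraction: because the pattern matches annotation labels rather than surface text, the same rule covers any first-name/surname pair the earlier stages have recognised.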
She is also associated with the Centre for Freedom of the Media, a research centre based at Sheffield, with which she has worked on tools for monitoring attacks on journalists.[3]
Maynard is a coauthor of the books Text Processing with GATE (University of Sheffield, 2011) and Natural Language Processing for the Semantic Web (Morgan & Claypool, 2017).
Natural language processing (NLP) is an interdisciplinary subfield of linguistics, computer science, and artificial intelligence concerned with the interactions between computers and human language, in particular how to program computers to process and analyze large amounts of natural language data. The goal is a computer capable of "understanding" the contents of documents, including the contextual nuances of the language within them. The technology can then accurately extract information and insights contained in the documents as well as categorize and organize the documents themselves.
In linguistics, a corpus or text corpus is a language resource consisting of a large and structured set of texts. In corpus linguistics, they are used to do statistical analysis and hypothesis testing, checking occurrences or validating linguistic rules within a specific language territory.
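The kind of occurrence-checking described above amounts to counting and comparing frequencies across the texts of a corpus. A minimal sketch with a toy two-sentence corpus (real corpora contain millions of words, but the counting step is the same in principle):

```python
from collections import Counter

# Toy corpus: a structured set of texts, here just two sentences.
corpus = [
    "the cat sat on the mat",
    "the dog sat on the log",
]

# Tokenize naively by whitespace and count occurrences.
tokens = [word for sentence in corpus for word in sentence.split()]
freq = Counter(tokens)

# Checking occurrences: how often "the" appears, and its relative frequency.
print(freq["the"], freq["the"] / len(tokens))  # 4 occurrences out of 12 tokens
```

Statistical hypothesis testing in corpus linguistics builds on exactly these counts, for example by comparing a word's relative frequency across two corpora.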
The Department of Computer Science at the University of Manchester is the longest established department of Computer Science in the United Kingdom and one of the largest. It is located in the Kilburn Building on the Oxford Road and currently has over 800 students taking a wide range of undergraduate and postgraduate courses and 60 full-time academic staff.
The Doctor of Engineering, or Engineering Doctorate, is a degree awarded on the basis of advanced study and research in engineering and applied science for solving problems in industry. According to the National Science Foundation in the United States, it is a terminal research doctorate. A DEng/EngD is equivalent to a PhD in engineering but differs in that it has a solid industrial base and an additional taught element. The degree is usually aimed at working professionals.
General Architecture for Text Engineering or GATE is a Java suite of tools originally developed at the University of Sheffield beginning in 1995 and now used worldwide by a wide community of scientists, companies, teachers and students for many natural language processing tasks, including information extraction in many languages.
Yorick Alexander Wilks FBCS was a British computer scientist. He was an emeritus professor of artificial intelligence at the University of Sheffield, visiting professor of artificial intelligence at Gresham College, senior research fellow at the Oxford Internet Institute, senior scientist at the Florida Institute for Human and Machine Cognition, and a member of the Epiphany Philosophers.
In natural language processing, semantic role labeling is the process that assigns labels to words or phrases in a sentence indicating their semantic role in the sentence, such as that of an agent, goal, or result.
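A hand-constructed example of the kind of output a semantic role labeler might produce for one sentence; the role inventory below follows the description above and is purely illustrative, not the schema of any particular toolkit.

```python
# One sentence with its predicate and role-labeled argument spans.
sentence = "Mary sent a letter to John"

roles = [
    {"span": "Mary", "role": "Agent"},      # who performed the action
    {"span": "sent", "role": "Predicate"},  # the action itself
    {"span": "a letter", "role": "Theme"},  # what was sent
    {"span": "to John", "role": "Goal"},    # recipient of the action
]

# Downstream code can then query roles directly, e.g. find the agent.
agents = [r["span"] for r in roles if r["role"] == "Agent"]
print(agents)
```

Note that the labels answer "who did what to whom" rather than describing syntax: "Mary" is the Agent whether the sentence is active or passivized ("A letter was sent to John by Mary").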
Jun'ichi Tsujii is a Japanese computer scientist specializing in natural language processing and text mining, particularly in the field of biology and bioinformatics.
Knowledge extraction is the creation of knowledge from structured and unstructured sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL, the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema. It requires either the reuse of existing formal knowledge or the generation of a schema based on the source data.
Dragomir R. Radev was an American computer scientist who was a professor at Yale University, working on natural language processing and information retrieval. He also served as a University of Michigan computer science professor and Columbia University computer science adjunct professor, as well as a Member of the Advisory Board of Lawyaw.
Apache cTAKES: clinical Text Analysis and Knowledge Extraction System is an open-source Natural Language Processing (NLP) system that extracts clinical information from electronic health record unstructured text. It processes clinical notes, identifying types of clinical named entities — drugs, diseases/disorders, signs/symptoms, anatomical sites and procedures. Each named entity has attributes for the text span, the ontology mapping code, context, and negated/not negated.
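The attributes listed above can be pictured as a record per named entity. The following Python sketch is a hypothetical mirror of that structure; the class and field names are invented for illustration and do not match cTAKES's actual type system, and the concept codes are given only as examples of ontology mappings.

```python
from dataclasses import dataclass

@dataclass
class ClinicalEntity:
    text: str            # the covered text span
    span: tuple          # (start, end) character offsets in the note
    ontology_code: str   # e.g. a concept identifier from a medical ontology
    entity_type: str     # drug, disease/disorder, sign/symptom, ...
    negated: bool        # whether the context negates the mention

note = "Patient denies chest pain. Prescribed aspirin."

# What an extraction pass over the note above might yield:
entities = [
    ClinicalEntity("chest pain", (15, 25), "C0008031", "sign/symptom", True),
    ClinicalEntity("aspirin", (38, 45), "C0004057", "drug", False),
]

# Negation matters clinically: "denies chest pain" must not be
# read as the patient having chest pain.
negated = [e.text for e in entities if e.negated]
print(negated)
```

The negation attribute is the reason simple keyword search is insufficient for clinical text: the same string can assert or deny a finding depending on context.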
Temporal annotation is the study of how to automatically add semantic information regarding time to natural language documents. It plays a role in natural language processing and computational linguistics.
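One step in temporal annotation is normalizing a date expression found in text to a machine-readable value. A minimal sketch of that step, assuming a single simple "day Month year" pattern (full systems use much richer annotation schemes such as TimeML/TIMEX3):

```python
import re
from datetime import date

text = "The workshop was held on 12 March 2021 in Sheffield."

months = {"January": 1, "February": 2, "March": 3, "April": 4,
          "May": 5, "June": 6, "July": 7, "August": 8,
          "September": 9, "October": 10, "November": 11, "December": 12}

# Find one "day Month year" expression and map it to a calendar date.
match = re.search(r"(\d{1,2}) (%s) (\d{4})" % "|".join(months), text)
if match:
    day, month_name, year = match.groups()
    normalized = date(int(year), months[month_name], int(day))
    print(normalized.isoformat())  # machine-readable ISO 8601 form
```

Real temporal annotation must also resolve relative expressions such as "last Tuesday" or "two weeks later", which require an anchor date and discourse context rather than a single regular expression.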
Rada Mihalcea is a professor of computer science and engineering at the University of Michigan. Her research focuses on natural language processing, multimodal processing, and computational social science.
Sophia Ananiadou is a British computer scientist and computational linguist. She led the development of and directs the National Centre for Text Mining (NaCTeM) in the United Kingdom. She is also a Professor in the Department of Computer Science at the University of Manchester.
Regina Barzilay is an Israeli-American computer scientist. She is a professor at the Massachusetts Institute of Technology and a faculty lead for artificial intelligence at the MIT Jameel Clinic. Her research interests are in natural language processing and applications of deep learning to chemistry and oncology.
Mirella Lapata FRSE is a computer scientist and Professor in the School of Informatics at the University of Edinburgh. Working on the general problem of extracting semantic information from large bodies of text, Lapata develops computer algorithms and models in the field of natural language processing (NLP).
Marie-Francine (Sien) Moens is a Belgian computer scientist known for her research in natural language processing, argument mining, sentiment analysis, and information retrieval. She is a professor of computer science at KU Leuven.
Ani Nenkova is Principal Scientist at Adobe Research, currently on leave from her position as an Associate Professor of Computer and Information Science at the University of Pennsylvania. Her research focuses on computational linguistics and artificial intelligence, with an emphasis on developing computational methods for analysis of text quality and style, discourse, affect recognition, and summarization.
Sunita Sarawagi is an Indian computer scientist known for her research in databases, data mining, and machine learning, including the use of natural language processing to extract structured data from text. She is Institute Chair Professor of Computer Science and Engineering at IIT Bombay.