WikiProfessional (Wiki for Professionals) was an attempt to create a web-based research environment for semantic searching, providing an intuitive tool for analyzing and relating concepts. [1] [2] [3] [4]
When data was entered, the system analyzed it semantically and recognized co-occurrences between different entities. The results were visualized through a "Knowlet," [5] a visual representation of the semantic distance between associated entities. The Knowlet was then used to notify people who had subscribed to these entities, enabling rapid data interchange between collaborators.
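The co-occurrence step described above can be illustrated with a minimal sketch. The source does not specify how the Knowlet computed semantic distance, so the Jaccard distance used here is a stand-in assumption, and the function names and sample entities are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_counts(documents):
    """Count, per entity and per unordered entity pair, how many documents mention them."""
    entity_counts = Counter()
    pair_counts = Counter()
    for entities in documents:
        unique = sorted(set(entities))  # sort so each pair has one canonical order
        entity_counts.update(unique)
        pair_counts.update(combinations(unique, 2))
    return entity_counts, pair_counts


def jaccard_distance(a, b, entity_counts, pair_counts):
    """Assumed distance: 0.0 if a and b always co-occur, 1.0 if they never do."""
    both = pair_counts[tuple(sorted((a, b)))]
    either = entity_counts[a] + entity_counts[b] - both
    return 1.0 if either == 0 else 1.0 - both / either


# Hypothetical input: each inner list holds the entities recognized in one document.
docs = [
    ["BRCA1", "breast cancer", "TP53"],
    ["BRCA1", "breast cancer"],
    ["TP53", "apoptosis"],
]
entity_counts, pair_counts = cooccurrence_counts(docs)
```

Under this assumption, entities that always appear together get distance 0, and entities that never share a document get distance 1. A visualization could then place closely associated entities nearer to each other.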
The major focus was proteins, using a portal named WikiProteins. [6] [7] It contained over a hundred million entries, "melding some of the key biomedical databases into a single information resource". Sources included:
The project never passed the open beta test phase. It was operated by Knewco and led by its initiator Barend Mons, a bioinformatician at the Erasmus MC and Leiden University Medical Centre. Knewco intended to profit from WikiProfessional by charging some users (such as drug firms) for "premium services", for example a private version of the system incorporating in-house data.
After the project disappeared, a group called the Concept Web Alliance [8] formed to try to rebuild a similarly linked database. This became ConceptWiki [9] [10] and the Nanopublication metadata format. [11]
Bioinformatics is an interdisciplinary field of science that develops methods and software tools for understanding biological data, especially when the data sets are large and complex. Bioinformatics uses biology, chemistry, physics, computer science, computer programming, information engineering, mathematics and statistics to analyze and interpret biological data. The subsequent process of analyzing and interpreting data is referred to as computational biology.
The Semantic Web, sometimes known as Web 3.0, is an extension of the World Wide Web through standards set by the World Wide Web Consortium (W3C). The goal of the Semantic Web is to make Internet data machine-readable.
In information science, an ontology encompasses a representation, formal naming, and definitions of the categories, properties, and relations between the concepts, data, or entities that pertain to one, many, or all domains of discourse. More simply, an ontology is a way of showing the properties of a subject area and how they are related, by defining a set of terms and relational expressions that represent the entities in that subject area. The field which studies ontologies so conceived is sometimes referred to as applied ontology.
The National Center for Biotechnology Information (NCBI) is part of the United States National Library of Medicine (NLM), a branch of the National Institutes of Health (NIH). It is approved and funded by the government of the United States. The NCBI is located in Bethesda, Maryland, and was founded in 1988 through legislation sponsored by US Congressman Claude Pepper.
Text mining, text data mining (TDM) or text analytics is the process of deriving high-quality information from text. It involves "the discovery by computer of new, previously unknown information, by automatically extracting information from different written resources." Written resources may include websites, books, emails, reviews, and articles. High-quality information is typically obtained by devising patterns and trends by means such as statistical pattern learning. According to Hotho et al. (2005) we can distinguish between three different perspectives of text mining: information extraction, data mining, and a knowledge discovery in databases (KDD) process. Text mining usually involves the process of structuring the input text, deriving patterns within the structured data, and finally evaluation and interpretation of the output. 'High quality' in text mining usually refers to some combination of relevance, novelty, and interest. Typical text mining tasks include text categorization, text clustering, concept/entity extraction, production of granular taxonomies, sentiment analysis, document summarization, and entity relation modeling.
Biological databases are libraries of biological sciences, collected from scientific experiments, published literature, high-throughput experiment technology, and computational analysis. They contain information from research areas including genomics, proteomics, metabolomics, microarray gene expression, and phylogenetics. Information contained in biological databases includes gene function, structure, localization, clinical effects of mutations as well as similarities of biological sequences and structures.
The SIB Swiss Institute of Bioinformatics is an academic not-for-profit foundation which federates bioinformatics activities throughout Switzerland.
Semantic similarity is a metric defined over a set of documents or terms, where the idea of distance between items is based on the likeness of their meaning or semantic content as opposed to lexicographical similarity. These are mathematical tools used to estimate the strength of the semantic relationship between units of language, concepts or instances, through a numerical description obtained according to the comparison of information supporting their meaning or describing their nature. The term semantic similarity is often confused with semantic relatedness. Semantic relatedness includes any relation between two terms, while semantic similarity only includes "is a" relations. For example, "car" is similar to "bus", but is also related to "road" and "driving".
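The distinction between similarity and relatedness can be sketched with a path-based measure over "is a" links. The tiny taxonomy below is invented for illustration, and path length is only one of many possible similarity measures.

```python
# Hypothetical "is a" taxonomy: each term maps to its single parent.
IS_A = {
    "car": "motor vehicle",
    "bus": "motor vehicle",
    "motor vehicle": "vehicle",
    "bicycle": "vehicle",
    "vehicle": "entity",
    "road": "infrastructure",
    "infrastructure": "entity",
}


def ancestors(term):
    """Return the chain from term up to the root, starting with term itself."""
    chain = [term]
    while chain[-1] in IS_A:
        chain.append(IS_A[chain[-1]])
    return chain


def path_similarity(a, b):
    """Return 1 / (1 + number of "is a" edges on the shortest path between a and b)."""
    up_a = ancestors(a)
    depth_in_b = {term: depth for depth, term in enumerate(ancestors(b))}
    for depth_a, term in enumerate(up_a):
        if term in depth_in_b:  # first hit is the lowest common ancestor
            return 1.0 / (1 + depth_a + depth_in_b[term])
    return 0.0  # no shared ancestor


car_bus = path_similarity("car", "bus")
car_road = path_similarity("car", "road")
```

In this sketch "car" and "bus" score higher than "car" and "road" because they share a close parent. The relatedness between "car" and "road" comes from a non-"is a" relation (cars drive on roads), which a purely taxonomic measure does not capture.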
UniProt is a freely accessible database of protein sequence and functional information, many entries being derived from genome sequencing projects. It contains a large amount of information about the biological function of proteins derived from the research literature. It is maintained by the UniProt consortium, which consists of several European bioinformatics organisations and a foundation from Washington, DC, United States.
Biomedical text mining refers to the methods and study of how text mining may be applied to texts and literature of the biomedical domain. As a field of research, biomedical text mining incorporates ideas from natural language processing, bioinformatics, medical informatics and computational linguistics. The strategies in this field have been applied to the biomedical literature available through services such as PubMed.
The Open Biological and Biomedical Ontologies (OBO) Foundry is a group of people dedicated to building and maintaining ontologies related to the life sciences. The OBO Foundry establishes a set of principles for ontology development, with the aim of creating a suite of interoperable reference ontologies in the biomedical domain. Currently, there are more than a hundred ontologies that follow the OBO Foundry principles.
Barend Mons is a molecular biologist by training and a leading FAIR data specialist. He spent the first decade of his scientific career on fundamental research on malaria parasites and later on translational research for malaria vaccines. In 2000 he switched to advanced data stewardship and (biological) systems analytics. He is currently a professor in Leiden and is best known for innovations in scholarly collaboration, especially nanopublications, knowledge-graph-based discovery and, most recently, the FAIR data initiative and GO FAIR. Since 2012 he has been a professor of biosemantics in the Department of Human Genetics at the Leiden University Medical Center (LUMC) in the Netherlands. In 2015 he was appointed chair of the High Level Expert Group on the European Open Science Cloud. Since 2017 he has headed the International Support and Coordination office of the GO FAIR initiative. He is also the elected president of CODATA, the standing committee on research data related issues of the International Science Council. He is a member of the Netherlands Academy of Technology and Innovation (ACTI) and the European representative on the Board on Research Data and Information (BRDI) of the National Academies of Sciences, Engineering, and Medicine in the USA. He is a frequent keynote speaker on FAIR and open science around the world, and participates in various national and international boards.
ChemSpider is a freely accessible online database of chemicals owned by the Royal Society of Chemistry. It contains information on more than 100 million molecules from over 270 data sources, each of them receiving a unique identifier called ChemSpider Identifier.
The Gene Wiki is a project within Wikipedia that aims to describe the relationships and functions of all human genes. It was established to transfer information from scientific resources to Wikipedia stub articles.
In computer science, information science and systems engineering, ontology engineering is a field which studies the methods and methodologies for building ontologies, which encompass a representation, formal naming and definition of the categories, properties and relations between the concepts, data and entities of a given domain of interest. In a broader sense, this field also includes knowledge construction of the domain using formal ontology representations such as OWL/RDF. A large-scale representation of abstract concepts such as actions, time, physical objects and beliefs would be an example of ontological engineering. Ontology engineering is one of the areas of applied ontology, and can be seen as an application of philosophical ontology. Core ideas and objectives of ontology engineering are also central in conceptual modeling.
Knowledge extraction is the creation of knowledge from structured and unstructured sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL, the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema. It requires either the reuse of existing formal knowledge or the generation of a schema based on the source data.
The term microattribution is defined as "giving database accessions the same citation conventions and indices that journal articles currently enjoy". In the sense that the purpose of precise attribution is to extend the scholarly convention of giving citation credit, the provenance of a piece of scholarship is recognized to give credit and priority to a preceding author. Microattribution is thus defined as "a scholarly contribution smaller than a journal article being ascribed to a particular author". Since data accessions can describe contributions that vastly exceed research articles in size and quality, quantum attribution or precise citation might be better terms.
Open PHACTS was a European public–private partnership between academia, publishers, enterprises, pharmaceutical companies and other organisations working to enable better, cheaper and faster drug discovery. It was funded by the Innovative Medicines Initiative, selected as part of three projects to "design methods for common standards and sharing of data for more efficient drug development and patient treatment in the future".
Biocuration is the field of life sciences dedicated to organizing biomedical data, information and knowledge into structured formats, such as spreadsheets, tables and knowledge graphs. The biocuration of biomedical knowledge is made possible by the cooperative work of biocurators, software developers and bioinformaticians and is at the base of the work of biological databases.