VIVO (software)

VIVO
Original author(s) Cornell University Library
Stable release 1.13.0 / 2 September 2022
Repository github.com/vivo-project/VIVO
Written in Java, Web Ontology Language
License Apache License
Website www.vivoweb.org

VIVO is a web-based, open-source suite of computer software for managing data about researchers, scientists, and faculty members. VIVO uses Semantic Web techniques to represent people and their work. As of 2020, it is used by dozens of universities and the United States Department of Agriculture. [1]

History

The Cornell University Library originally created VIVO in 2003 as a "virtual life sciences community". [2] In 2009, the National Institutes of Health awarded a $12.2 million grant to the University of Florida, Cornell University, Indiana University, Ponce School of Medicine, The Scripps Research Institute, Washington University in St. Louis, and Weill Cornell Medical College to expand the tool for use outside of Cornell. [3]

Data ingest

VIVO can harvest publication data from PubMed, CSV files, and relational databases, or via OAI-PMH. It then uses a semi-automated process to match publications to researchers. [4] It also harvests information about researchers from human resources systems and student information systems. [5]
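The matching step can be illustrated with a minimal sketch. This is not VIVO's actual harvester code: the `Researcher` class, the `name_key` and `match_authors` functions, and the sample names are all hypothetical, and matching on family name plus first initial is only one simple heuristic. The sketch shows why the process is semi-automated: a unique candidate can be accepted automatically, while several candidates call for human review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Researcher:
    researcher_id: str  # hypothetical local identifier, e.g. from an HR system
    family_name: str
    given_name: str


def name_key(family_name, given_name):
    """Reduce a name to (lowercased family name, lowercased first initial)."""
    return (family_name.strip().lower(), given_name.strip()[:1].lower())


def match_authors(authors, researchers):
    """Classify each harvested author as matched, needs_review, or unmatched.

    authors: list of (family_name, given_name) tuples from a publication record.
    Returns a dict mapping each author tuple to (status, candidate ids).
    """
    # Index the roster by name key; several researchers may share a key.
    index = {}
    for r in researchers:
        key = name_key(r.family_name, r.given_name)
        index.setdefault(key, []).append(r.researcher_id)

    results = {}
    for family, given in authors:
        candidates = index.get(name_key(family, given), [])
        if len(candidates) == 1:
            status = "matched"        # unambiguous: can be linked automatically
        elif candidates:
            status = "needs_review"   # ambiguous: a person has to decide
        else:
            status = "unmatched"      # no local researcher with this name key
        results[(family, given)] = (status, candidates)
    return results


# Invented sample data, for illustration only.
roster = [
    Researcher("r1", "Rivera", "Ana"),
    Researcher("r2", "Okafor", "Ngozi"),
    Researcher("r3", "Okafor", "Nnamdi"),
]
harvested_authors = [("Rivera", "A."), ("Okafor", "N"), ("Smith", "J")]

for author, (status, ids) in match_authors(harvested_authors, roster).items():
    print(author, status, ids)
```

In this sample, "Okafor, N" is flagged for review because two researchers share the family name and initial. A production matcher would likely also weigh affiliations, co-authors, or persistent identifiers.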

Ontology

The VIVO ontology incorporates elements of several established ontologies, including Dublin Core, Basic Formal Ontology, Bibliographic Ontology, FOAF, and SKOS. The ontology can be used to describe several roles of faculty members, including research, teaching, and service. [6]
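As a rough illustration of how reused vocabularies combine in one description, the sketch below serializes a few RDF statements about a researcher as N-Triples. The FOAF, SKOS, RDF, and RDFS terms are standard. The `http://vivoweb.org/ontology/core#` namespace and its `hasResearchArea` property are assumed here rather than taken from the text above, and the `example.org` individuals and helper functions are invented for the example.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
FOAF = "http://xmlns.com/foaf/0.1/"
SKOS = "http://www.w3.org/2004/02/skos/core#"
VIVO = "http://vivoweb.org/ontology/core#"   # assumed namespace for VIVO terms
EX = "http://example.org/individual/"        # hypothetical local namespace


def iri(value):
    """Wrap an IRI in angle brackets, as N-Triples requires."""
    return f"<{value}>"


def literal(text, lang=None):
    """Quote a string literal, escaping backslashes and double quotes."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"' + (f"@{lang}" if lang else "")


def to_ntriples(triples):
    """Serialize (subject, predicate, object) strings, one statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)


person = iri(EX + "n1")
topic = iri(EX + "c1")

triples = [
    (person, iri(RDF_TYPE), iri(FOAF + "Person")),           # FOAF class
    (person, iri(RDFS_LABEL), literal("Rivera, Ana")),       # display name
    (person, iri(VIVO + "hasResearchArea"), topic),          # assumed VIVO property
    (topic, iri(RDF_TYPE), iri(SKOS + "Concept")),           # SKOS class
    (topic, iri(SKOS + "prefLabel"), literal("Soil science", "en")),
]

print(to_ntriples(triples))
```

Each line of the output is one subject-predicate-object statement, so terms from different ontologies can describe the same individual without conflict.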

Data Archiving and Networked Services (DANS), a Dutch institute, and Indiana University worked to develop the ontology to enable bilingual modeling of researchers in English and Dutch. [7]
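One common way to carry text in more than one language in RDF data is to attach a language tag to each literal. Whether the cited project modeled its labels exactly this way is an assumption; the sketch below, with a hypothetical `pick_label` helper and invented sample labels, only shows how a display layer might choose between an English and a Dutch label.

```python
def pick_label(labels, preferred, fallback="en"):
    """Return the label in the preferred language, else the fallback, else None.

    labels: list of (text, language_tag) pairs, mirroring language-tagged literals.
    """
    by_lang = {lang: text for text, lang in labels}
    return by_lang.get(preferred, by_lang.get(fallback))


# Invented sample: one position title labeled in English and in Dutch.
position_labels = [("Professor", "en"), ("Hoogleraar", "nl")]

print(pick_label(position_labels, "nl"))  # Dutch label is available
print(pick_label(position_labels, "de"))  # no German label, falls back to English
```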

References

  1. "VIVO". vivoweb.org. Retrieved 21 January 2020.
  2. Devare, Medha; Corson-Rikert, Jon; Caruso, Brian; Lowe, Brian; Chiang, Kathy; McCue, Janet (2007). "VIVO: Connecting People, Creating a Virtual Life Sciences Community". D-Lib Magazine. 13 (7/8). doi:10.1045/july2007-devare. Retrieved 21 January 2020.
  3. García-Milian, Rolando; Norton, Hannah F.; Auten, Beth; Davis, Valrie I.; Holmes, Kristi L.; Johnson, Margeaux; Tennant, Michele R. (April 2013). "Librarians as Part of Cross-Disciplinary, Multi-institutional Team Projects: Experiences from the VIVO Collaboration". Science & Technology Libraries. 32 (2): 160–175. doi:10.1080/0194262X.2013.791183. PMC 3700548. PMID 23833333.
  4. Barnes, Chris; Williams, Stephen; Sposato, Vincent; Skaggs, Nicholas; Raum, Narayan; Corson-Rikert, Jon; Caruso, Brian; Blake, Jim (2012). "Extending VIVO". In Börner, Katy; Conlon, Michael; Corson-Rikert, Jon; Ding, Ying (eds.). VIVO: a semantic approach to scholarly networking and discovery. San Rafael, Calif.: Morgan & Claypool. ISBN 9781608459933.
  5. "About VIVO". vivoweb.org. Retrieved 21 January 2020.
  6. Ding, Ying; Mitchell, Stella; Corson-Rikert, Jon; Lowe, Brian; He, Bing (2011). The VIVO Ontology: Enabling Networking of Scientists (PDF). Retrieved 21 January 2020.
  7. Chambers, Tamy; Shah, Sahil; Urankar, Ashish; Kalyan, Venkat; Scharnhorst, Andrea; Reijnhoudt, Linda; Rideour, Laura; Guéret, Christophe; Ding, Ying (2013). "Bilingual researcher profiles: Modeling Dutch researchers in both English and Dutch using the VIVO ontology". Proceedings of the American Society for Information Science and Technology. 50 (1): 1–4. doi:10.1002/meet.14505001137.