Semantic wiki

A semantic wiki is a wiki that has an underlying model of the knowledge described in its pages. Regular, or syntactic, wikis have structured text and untyped hyperlinks. Semantic wikis, on the other hand, let users capture or identify information about the data within pages, and the relationships between pages, in ways that can be queried through semantic queries or exported like a database. [1] [2]

Semantic wikis were first proposed in the early 2000s, and began to be implemented seriously around 2005. [3] [4] As of 2021, well-known semantic wiki engines are Semantic MediaWiki and Wikibase. [5]

Key characteristics

Formal notation

The knowledge model found in a semantic wiki is typically available in a formal language, so that machines can process it into an entity-relationship model or relational database.

The formal notation may be included in the pages themselves by users, as in Semantic MediaWiki, or it may be derived from the pages, the page names, or the means of linking.

For example, using a specific alternative page name might indicate that a specific type of link was intended.

Providing information through a formal notation allows machines to calculate new facts, as relations between pages, from the facts represented in the knowledge model.
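
As a minimal sketch of how such new facts might be calculated, the following Python example derives additional "kind of" relations by transitivity. The relation name `kind_of`, the page names, and the function `infer_kinds` are illustrative assumptions and are not taken from any particular wiki engine.

```python
def infer_kinds(facts):
    """Return the given facts plus every "kind_of" fact implied by transitivity.

    Facts are (subject, relation, object) tuples. The relation name
    "kind_of" is a hypothetical label chosen for this sketch.
    """
    inferred = set(facts)
    changed = True
    while changed:  # repeat until no new fact can be derived
        changed = False
        for (a, p1, b) in list(inferred):
            for (b2, p2, c) in list(inferred):
                new_fact = (a, "kind_of", c)
                if p1 == p2 == "kind_of" and b == b2 and new_fact not in inferred:
                    inferred.add(new_fact)
                    changed = True
    return inferred


# Facts as they might be stated explicitly on three wiki pages.
facts = {
    ("Granny Smith", "kind_of", "Apple"),
    ("Apple", "kind_of", "Fruit"),
    ("Fruit", "kind_of", "Food"),
}

# Facts that no page states directly but that follow from the model.
derived = infer_kinds(facts) - facts
for fact in sorted(derived):
    print(fact)
```

Here three relations are derived that no page states directly, for example that a Granny Smith is a kind of fruit. Reasoning engines used with real semantic wikis support far richer rules than this single transitivity rule.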

Semantic Web compatibility

The technologies developed by the Semantic Web community provide one basis for formal reasoning about a knowledge model built by importing a wiki's data. However, there is also a wide array of technologies that work on relational data.

Example

Imagine a semantic wiki devoted to food. The page for an apple would contain, in addition to standard text information, some machine-readable or at least machine-intuitable semantic data. The most basic kind of data would be that an apple is a kind of fruit—what's known as an inheritance relationship. The wiki would thus be able to automatically generate a list of fruits, simply by listing all pages that are tagged as being of type "fruit." Further semantic tags in the "apple" page could indicate other data about apples, including their possible colors and sizes, nutritional information and serving suggestions, and so on.

If the wiki exports all this data in RDF or a similar format, it can then be queried in a similar way to a database—so that an external user or site could, for instance, request a list of all fruits that are red and can also be baked in a pie.
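
The query described above can be sketched in Python, with a plain dictionary standing in for the exported data. The page names, the property names (`type`, `color`, `used_in`), and the helper `find_pages` are all hypothetical; a real export would more likely be RDF queried with a language such as SPARQL.

```python
# Hypothetical exported data: each page maps property names to sets of values.
pages = {
    "Apple": {"type": {"fruit"}, "color": {"red", "green", "yellow"}, "used_in": {"pie", "sauce"}},
    "Cherry": {"type": {"fruit"}, "color": {"red"}, "used_in": {"pie"}},
    "Banana": {"type": {"fruit"}, "color": {"yellow"}, "used_in": {"bread"}},
    "Tomato soup": {"type": {"dish"}, "color": {"red"}, "used_in": set()},
}


def find_pages(pages, **required):
    """Return the sorted names of pages whose properties include every required value."""
    return sorted(
        name
        for name, props in pages.items()
        if all(value in props.get(prop, set()) for prop, value in required.items())
    )


# An automatically generated list of fruits.
print(find_pages(pages, type="fruit"))  # ['Apple', 'Banana', 'Cherry']

# All fruits that are red and can also be baked in a pie.
red_pie_fruits = find_pages(pages, type="fruit", color="red", used_in="pie")
print(red_pie_fruits)  # ['Apple', 'Cherry']
```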

History

In the 1980s, before the Web began, there were several technologies to process typed links between collectively maintained hypertext pages, such as NoteCards, KMS, and gIBIS. Extensive research was published on these tools by the collaboration software, computer-mediated communication, hypertext, and computer supported cooperative work communities.

The first known usage of the term "Semantic Wiki" was a Usenet posting by Andy Dingley in January 2001. [6] Its first known appearance in a technical paper was in a 2003 paper by Austrian researcher Leo Sauermann. [7]

Many of the existing semantic wiki applications were started in the mid-2000s, including ArtificialMemory [8] (2004), Semantic MediaWiki (2005), Freebase (2005), and OntoWiki (2006).

June 2006 saw the first meeting dedicated to semantic wikis, the "SemWiki" workshop, co-located with the European Semantic Web Conference in Montenegro. [9] This workshop ran annually until 2010. [10]

The site DBpedia, launched in 2007, though not a semantic wiki, publishes structured data from Wikipedia in RDF form, which enables semantic querying of Wikipedia's data.

In March 2008, Wikia, the world's largest wiki farm, made Semantic MediaWiki available for all of its wikis on request, thus allowing every wiki it hosted to function as a semantic wiki. [11] However, since upgrading to version 1.19 of MediaWiki in 2013, Wikia has stopped supporting Semantic MediaWiki for new requests, citing performance problems. [12]

In July 2010, Google purchased Metaweb, the company behind Freebase. [13]

In April 2012, work began on Wikidata, a collaborative, multi-language store of data, whose data could then be used within Wikipedia articles, as well as by the outside world.

Semantic wiki software

There are a number of wiki applications that provide semantic functionality. Some standalone semantic wiki applications exist, including OntoWiki. Other semantic wiki software is structured as extensions or plugins to standard wiki software. The best-known of these is Semantic MediaWiki, an extension to MediaWiki. Another example is the SemanticXWiki [14] extension for XWiki.

Some standard wiki engines also include the ability to add typed, semantic links to pages, including PhpWiki and Tiki Wiki CMS Groupware.

Freebase, though not billed as a wiki engine, is a web database with semantic-wiki-like properties.

Common features

Semantic wikis vary in their degree of formalization. Semantics may be either included in, or placed separately from, the wiki markup. Users may be supported when adding this content, using forms or autocompletion, or more complex proposal generation or consistency checks. The representation language may be wiki syntax, a standard language like RDF or OWL, or some database directly populated by the tool that extracts the semantics from the raw data. Separate versioning support or correction editing for the formalized content may also be provided. Provenance support for the formalized content, that is, tagging the author of the data separately from the data itself, varies.

What data can get formalized also varies. One may be able to specify types for pages, categories, or paragraphs or sentences (the latter features were more common in pre-web systems). Links are usually also typed. The source, property, and target may be determined by some defaults, e.g. in Semantic MediaWiki the source is always the current page.
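
To illustrate the default just mentioned, the sketch below extracts typed links from wikitext written in Semantic MediaWiki's in-text annotation style, `[[Property::Value]]`, and uses the current page as the source of every triple. The regular expression covers only a simplified subset of that syntax, and the property names and the helper `extract_triples` are assumptions made for this example, not part of Semantic MediaWiki itself.

```python
import re

# Matches [[Property::Value]] and [[Property::Value|display text]].
# Plain links such as [[Pear]] and category links such as
# [[Category:Fruit]] do not match, because they lack the "::" separator.
ANNOTATION = re.compile(r"\[\[([^:\]|]+)::([^\]|]+)(?:\|[^\]]*)?\]\]")


def extract_triples(page_title, wikitext):
    """Return (source, property, target) triples, with the current page as source."""
    return [
        (page_title, prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]


text = (
    "The apple is a [[Is a::Fruit]] that is often "
    "[[Has color::red|red-skinned]]. See also [[Pear]]."
)
triples = extract_triples("Apple", text)
print(triples)  # [('Apple', 'Is a', 'Fruit'), ('Apple', 'Has color', 'red')]
```

The untyped link to "Pear" yields no triple, which reflects the distinction between ordinary hyperlinks and typed links.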

Reflexivity also varies. More reflexive user interfaces provide strong ontology support from within the wiki, and allow it to be loaded, saved, created, and changed.

Some wikis inherit their ontology entirely from a pre-existing strong ontology like Cyc or SKOS, while, on the other extreme, in other semantic wikis the entire ontology is generated by users.

Conventional, non-semantic wikis typically still have ways for users to express data and metadata, typically by tagging, categorizing, and using namespaces. In semantic wikis, these features typically still exist, but they are integrated with other semantic declarations, and their use is sometimes restricted.

Some semantic wikis provide reasoning support, using a variety of engines. Such reasoning may require that all instance data comply with the ontologies.

Most semantic wikis have simple querying support (such as searching for all triples with a certain subject, predicate, or object), but the degree of advanced query support varies; some semantic wikis provide querying in standard languages like SPARQL, while others instead provide a custom language. User interface support for constructing these queries also varies. Visualization, especially of the links, may also be supported.
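
The simple kind of querying described here can be sketched as pattern matching over a list of triples, where any position left unspecified acts as a wildcard. The data and the function `match` are hypothetical and do not correspond to the query interface of any specific engine.

```python
def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that agree with every position that is not None."""
    pattern = (subject, predicate, obj)
    return [
        triple
        for triple in triples
        if all(want is None or want == got for want, got in zip(pattern, triple))
    ]


# Hypothetical triples, as a semantic wiki might store them.
triples = [
    ("Apple", "is_a", "Fruit"),
    ("Apple", "has_color", "red"),
    ("Cherry", "is_a", "Fruit"),
    ("Cherry", "has_color", "red"),
    ("Banana", "has_color", "yellow"),
]

# Everything recorded about one subject.
print(match(triples, subject="Apple"))

# Every page that has the color red.
print(match(triples, predicate="has_color", obj="red"))
```

Standard query languages such as SPARQL generalize this idea by joining several such patterns through shared variables.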

Many semantic wikis can display the relationships between pages, or other data such as dates, geographical coordinates, and number values, in various formats, such as graphs, tables, charts, calendars, and maps.

References

  1. Semantic Wikis and Disaster Relief Operations, Soenke Ziesche, xml.com, December 13, 2006
  2. Semantic Wikis: A Comprehensible Introduction with Examples from the Health Sciences, Maged N. Kamel Boulos, Journal of Emerging Technologies in Web Intelligence, Vol. 1, No. 1, August 2009
  3. A semantic wiki for collaborative knowledge formation, Sebastian Schaffert, Andreas Gruber, and Rupert Westenthaler, Research Report, Knowledge-based Information Systems Group, Salzburg Research, Austria, November 23, 2005
  4. IkeWiki: A semantic wiki for collaborative knowledge management, Sebastian Schaffert, Proceedings of the 15th IEEE International Workshops on Enabling Technologies: Infrastructure for Collaborative Enterprises (WETICE'06), June 6, 2006
  5. Comparison of Semantic MediaWiki and Wikibase
  6. Andy Dingley (21 January 2001). "Wikiwiki (was Theory: "opportunistic hypertext")". Newsgroup: comp.infosystems.www.authoring.site-design.
  7. Leo Sauermann (2003). "The Gnowsis-Using Semantic Web Technologies to build a Semantic Desktop" (PDF). Technical University of Vienna. Retrieved 2007-06-20.
  8. Lars Ludwig (2013). "Extended Artificial Memory. Toward an integral cognitive theory of memory and technology" (PDF). Technical University of Kaiserslautern. Retrieved 2017-02-07.
  9. Call for Papers: SemWiki 2006
  10. SemWiki.org
  11. Wikia offers Semantic MediaWiki hosting, semantic-mediawiki.org, March 12, 2008
  12. Semantic Mediawiki gone from Wikia forever?
  13. Deeper understanding with Metaweb, Official Google Blog, July 16, 2010
  14. Semantic XWikiExtension, ObjectSecurity Ltd, November 16, 2012