Semantic MediaWiki

Developer(s) various
Stable release 4.1.3 [1] / 17 February 2024
Written in PHP
Type MediaWiki extension
License GPL-2.0-or-later
Website www.semantic-mediawiki.org

Semantic MediaWiki (SMW) is an extension to MediaWiki that allows for annotating semantic data within wiki pages, thus turning a wiki that incorporates the extension into a semantic wiki. Data that has been encoded can be used in semantic searches, used for aggregation of pages, displayed in formats like maps, calendars and graphs, and exported to the outside world via formats like RDF and CSV.

Authors

Semantic MediaWiki was initially created by Markus Krötzsch, Denny Vrandečić and Max Völkel, and was first released in 2005. Its development was initially funded by the EU FP6 project SEKT, and was later supported in part by Institute AIFB of the University of Karlsruhe (later renamed the Karlsruhe Institute of Technology). As of 2017, James Hong Kong is the lead developer, and Jeroen De Dauw is the other core developer.

Basic syntax

Every semantic annotation within SMW is a "property" connecting the page on which it resides to some other piece of data, either another page or a data value of some type, using triples of the form "subject, predicate, object".

As an example, a page about Germany could have, encoded within it, the fact that its capital city is Berlin. On the page "Germany", the syntax would be:

... the capital city is [[Has capital::Berlin]] ... 

which is semantically equivalent to the statement "Germany" "Has capital" "Berlin". In this example the "Germany" page is the subject, "Has capital" is the predicate, and "Berlin" is the object that the semantic link is pointing to.
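
The mapping from an inline annotation to a triple can be illustrated with a minimal Python sketch. This is not how SMW itself (which is written in PHP) parses pages; the regular expression and the function name `extract_triples` are assumptions made for illustration, and the sketch ignores variants such as annotations with alternate display text.

```python
import re

# Matches simple inline annotations of the form [[Property::Value]].
# Assumption: no pipe-separated display text and no nested brackets.
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)\]\]")


def extract_triples(page_title, wikitext):
    """Return (subject, predicate, object) triples found in the page text."""
    return [
        (page_title, prop.strip(), value.strip())
        for prop, value in ANNOTATION.findall(wikitext)
    ]


text = "... the capital city is [[Has capital::Berlin]] ..."
print(extract_triples("Germany", text))
# → [('Germany', 'Has capital', 'Berlin')]
```

The page title supplies the subject, so the annotation itself only needs to name the predicate and the object.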

However, the much more common way of storing data within Semantic MediaWiki is via MediaWiki templates which themselves contain the necessary SMW markup. For this example, the "Germany" page could contain a call to a template called "Country" that looks like this:

{{Country ... | Capital = Berlin ... }} 

The "Country" template would handle storing whatever the value of the parameter "Capital" is, using the property "Has capital". The template would also handle the display of the data. Semantic MediaWiki developers have estimated that 99% of SMW data is stored in this way. [2]
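
As a rough model of what such a template does, the following Python sketch maps template parameters to properties and emits one triple per recognized parameter. In a real wiki this logic lives in the template's own wikitext, not in Python; the mapping `COUNTRY_TEMPLATE`, the function `store_template_call`, and the property name "Has population" are hypothetical names introduced only for this illustration.

```python
# Hypothetical mapping from "Country" template parameters to SMW properties.
# Only "Capital" -> "Has capital" comes from the example above; the
# "Population" entry is an assumption added for illustration.
COUNTRY_TEMPLATE = {
    "Capital": "Has capital",
    "Population": "Has population",
}


def store_template_call(page_title, params, mapping=COUNTRY_TEMPLATE):
    """Turn template parameters into (subject, predicate, object) triples.

    Parameters that the template does not map to a property are skipped.
    """
    return [
        (page_title, mapping[name], value)
        for name, value in params.items()
        if name in mapping
    ]


print(store_template_call("Germany", {"Capital": "Berlin"}))
# → [('Germany', 'Has capital', 'Berlin')]
```

Keeping the parameter-to-property mapping in one place is what lets page editors fill in plain values such as "Berlin" without writing any annotation syntax themselves.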

Semantic MediaWiki also has its own inline querying tools. For instance, if pages about countries stored additional information like population data, a query could be added to a page that displays a list of all countries with a population greater than 50 million, along with their capital city; and Germany would appear in such a list, with Berlin alongside it. [3]
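
The logic of such a query can be sketched in Python over a small list of triples. This is only a model of the selection described above, not SMW's query syntax or implementation; the function name `countries_over`, the property name "Has population", and the rounded population figures are assumptions used for illustration.

```python
# Illustrative triples; the population values are rough, rounded figures.
triples = [
    ("Germany", "Has capital", "Berlin"),
    ("Germany", "Has population", 83_000_000),
    ("Iceland", "Has capital", "Reykjavík"),
    ("Iceland", "Has population", 380_000),
]


def countries_over(triples, threshold):
    """Return {country: capital} for subjects whose population exceeds threshold."""
    population = {s: o for s, p, o in triples if p == "Has population"}
    capital = {s: o for s, p, o in triples if p == "Has capital"}
    return {
        country: capital.get(country)
        for country, count in population.items()
        if count > threshold
    }


print(countries_over(triples, 50_000_000))
# → {'Germany': 'Berlin'}
```

The condition (population above a threshold) selects the subjects, and the capital is then looked up as an additional property to display for each match.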

Usage

Semantic MediaWiki is in use on over 1,600 active public wikis around the world, in addition to an unknown number of private wikis. [4] [5] Notable public wikis that use SMW include the Metacafe wiki, Web Platform, SNPedia, SKYbrary, Metavid, Familypedia, OpenEI, [6] the Libreplanet wiki, the Free Software Directory [7] and translatewiki.net. [8] Organizations that use SMW internally include Pfizer, [9] Harvard Pilgrim Health Care, [10] Johnson & Johnson Pharmaceutical Research and Development, [11] the Pacific Northwest National Laboratory, [12] the Metropolitan Museum of Art, [13] NATO, [14] the U.S. Department of Defense, [15] and the International Atomic Energy Agency.

SMW has notably gained traction in the health care domain for collaboratively creating biomedical terminologies and ontologies. [16] Examples are LexWiki, [17] which is jointly run by the Mayo Clinic, National Cancer Institute, World Health Organization and Stanford University; and the Neuroscience Information Framework's NeuroLex.

Semantic MediaWiki used to be supported by default on the now-defunct wiki farm Referata. [18] [19] Wikia previously activated Semantic MediaWiki on user request, but stopped doing so after upgrading to version 1.19 of MediaWiki; Wikia sites that had already started using it, such as Familypedia, are able to continue.

Semantic MediaWiki and Wikidata

Some members of the academic community have urged the use of SMW on Wikipedia since it was first proposed. [20] In a 2006 paper, Max Völkel et al. wrote that in spite of Wikipedia's utility, "its contents are barely machine-interpretable. Structural knowledge, e.g. about how concepts are interrelated, can neither be formally stated nor automatically processed. Also the wealth of numerical data is only available as plain text and thus can not be processed by its actual meaning." [21]

The Wikimedia community began adding semantic microformat markup to Wikipedia [22] in 2007. In 2010, Wikimedia Foundation Deputy Director Erik Möller stated that Wikimedia was interested in adding semantic capabilities to Wikipedia, but that they were unsure whether Semantic MediaWiki was the right solution, since it was unclear whether it could be used without negatively affecting Wikipedia's performance. [23]

In April 2012, the Wikimedia Foundation project Wikidata began. It provides a massive shared database for use in articles of every language edition of Wikipedia and in other Wikimedia projects, and its content is also freely available to anyone else. [5] Wikidata supplants the potential use of Semantic MediaWiki on Wikipedia; it runs on the Wikibase software. [24]

Spinoff extensions

Figure: A form to edit a page, using the Semantic Forms extension.

A variety of open-source MediaWiki extensions exist that use the data structure provided by Semantic MediaWiki. [25]

Community

The official gathering for Semantic MediaWiki developers and users is SMWCon, which has been held twice a year since 2010, in various cities in the United States and Europe. [26] The largest such event, in October 2013 in Berlin, had around 90 attendees. [27] The first virtual SMWCon, held in 2020, attracted 234 attendees. [28]

References

  1. "Release 4.1.3". 17 February 2024. Retrieved 20 February 2024.
  2. "semantic templates help". Semantic MediaWiki. Retrieved 2011-07-21.
  3. "inline queries help". Semantic MediaWiki. 2011-06-22. Retrieved 2011-07-21.
  4. "Semantic MediaWiki website count". WikiApiary. Retrieved 2019-10-12.
  5. "Semantic MediaWiki Frequently-Asked Questions". Semantic-mediawiki.org. 2011-06-09. Retrieved 2011-07-21.
  6. DOE Launches New Website to Bring Energy Technology Information to the Public, press release, December 9, 2009. Archived 2010-11-22 at the Wayback Machine.
  7. "Semantic MediaWiki testimonials page". Semantic-mediawiki.org. 2011-06-30. Retrieved 2011-07-21.
  8. Bry, Francois; Schaffert, Sebastian; Vrandecic, Denny; Weiand, Klara (2012). "Semantic Wikis: Approaches, Applications, and Perspectives". Reasoning Web. Semantic Technologies for Advanced Query Answering. Lecture Notes in Computer Science. Vol. 7487. pp. 329–369. doi:10.1007/978-3-642-33158-9_9. ISBN 978-3-642-33157-2. ISSN 0302-9743. In the article, it's provided as example of "Novel Semantic Wiki Applications"; according to the authors, «Semantic wikis could be used to contribute to the semi-automatisation of the translation process by making explicit the multi-lingual correspondences between texts».
  9. "Bio-IT World 2009, Track 3". Bio-itworldexpoeurope.com. Archived from the original on 2011-07-20. Retrieved 2011-07-21.
  10. Wikify Your Metadata! Integrating Business Semantics, Metadata Discovery, and Knowledge Management, March 16, 2010, EnterpriseDataWorld Conference Schedule. Archived 2012-05-13 at the Wayback Machine.
  11. knowIT, a semantic informatics knowledge management system, WikiSym 2009, Laurent Alquier, Keith McCormick and Ed Jaeger
  12. "Semantic MediaWiki Projects at the Pacific Northwest National Laboratory". Wiki.ontoprise.com. Retrieved 2011-07-21. [permanent dead link]
  13. Bringing the Semantic Web to Museums, Paul Miller, January 27, 2009
  14. Use of SMW for Enterprise Architecture, SMWCon Spring 2014
  15. Flexible, purposive SMW use, SMWCon Spring 2010, Clarence Dillon
  16. Semantic Wikis: A Comprehensible Introduction with Examples from the Health Sciences. Maged N. Kamel Boulos. Journal of Emerging Technologies in Web Intelligence, Vol. 1, No. 1, August 2009.
  17. "LexWiki and LexWiki Publisher". National Cancer Institute. 2015-04-30. Retrieved 2017-07-29.
  18. "Referata". Referata. 2011-06-28. Archived from the original on 2008-08-27. Retrieved 2011-07-21.
  19. Get Your MediaWiki Hosting Here, Jennifer Zaino, SemanticWeb.com, December 1, 2008. Archived 2010-03-16 at the Wayback Machine.
  20. Markus Krötzsch; Denny Vrandecic; Max Völkel (2005), Wikipedia and the Semantic Web – The Missing Links, Proceedings of Wikimania 2005
  21. M Völkel; M Krötzsch; D Vrandecic (2006), Semantic MediaWiki, Proceedings of the 15th international conference on World Wide Web, p. 585, doi:10.1145/1135777.1135863, ISBN 1-59593-323-9, S2CID 45934347
  22. Heilman, Chris (2009-01-19). "Retrieving and displaying data from Wikipedia with YQL". Yahoo Developer Network. Yahoo. Archived from the original on 2011-01-27. Retrieved 2009-01-19.
  23. Wikipedia to Add Meaning to Its Pages, Tom Simonite, Technology Review, July 7, 2010. Archived 2010-09-13 at the Wayback Machine.
  24. "Wikibase — Home".
  25. List of Semantic MediaWiki extensions.
  26. "SMWCon homepage". semantic-mediawiki.org. Retrieved 2011-09-25.
  27. "SMWCon Fall 2013 a big success". semantic-mediawiki.org. Retrieved 2013-11-06.
  28. @SemanticMW (November 26, 2020). "The largest ever #SMWCon is over! Thank you to 234 (!) people attending and first and formost our incredible speake…" (Tweet) via Twitter.
