DSSim

DSSim [1] [2] is an ontology mapping system conceived to achieve a certain level of the machine intelligence envisioned for the Semantic Web. The main driving factor behind its development was to provide an alternative to existing heuristic or machine-learning-based approaches, in the form of a multi-agent approach that makes use of uncertain reasoning. The system offers one possible way to establish machine understanding of Semantic Web data through multi-agent beliefs and conflict resolution.

Theoretical background

The DSSim framework for ontology mapping was introduced in 2005 [3] by Miklos Nagy and Maria Vargas-Vera at the Open University (OU). DSSim addresses three challenges of the Semantic Web:

DSSim uses novel 3D visualisation techniques for both mapping and reasoning results. The reasoning states are stored so that they can be visualised later for end users. The objective is to show end users why the system selected a particular mapping candidate from two different ontologies.

Evaluation of the system

The system was evaluated in the Ontology Alignment Evaluation Initiative (OAEI). DSSim participated in 2006, [4] 2007, [5] 2008 [6] and 2009, [7] achieving gradually improving results. The following sections present the results of two of the eight tracks from OAEI 2008.

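| Year | Available tracks | Participated tracks | Number of participating systems | DSSim track coverage |
|------|------------------|---------------------|---------------------------------|----------------------|
| 2006 | 5 | 1 | 10 | 20% |
| 2007 | 7 | 6 | 17 | 85% |
| 2008 | 8 | 8 | 13 | 100% |
| 2009 | 8 | 7 | 16 | 87% |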
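The coverage column follows from dividing the number of tracks entered by the number of tracks available. The following is a minimal sketch of that arithmetic using the figures from the table above; the names `track_coverage` and `PARTICIPATION` are illustrative and are not part of DSSim or OAEI. The quotients for 2007 and 2009 are not whole numbers, so the table's 85% and 87% appear to be truncated rather than rounded values.

```python
# (year, available tracks, participated tracks), copied from the table above.
PARTICIPATION = [
    (2006, 5, 1),
    (2007, 7, 6),
    (2008, 8, 8),
    (2009, 8, 7),
]


def track_coverage(participated: int, available: int) -> float:
    """Return the share of available tracks that were entered, in percent."""
    if available <= 0:
        raise ValueError("available must be positive")
    return 100.0 * participated / available


if __name__ == "__main__":
    for year, available, participated in PARTICIPATION:
        # One decimal place makes the truncation in the table visible.
        print(f"{year}: {track_coverage(participated, available):.1f}%")
```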

Library track at OAEI 2008

According to the original task definition [8] provided by the organizers of OAEI 2008, the library track involved the alignment of two Dutch thesauri used to index books from two collections held by the National Library of the Netherlands (KB). KB maintains two large collections: the Deposit Collection, containing all Dutch printed publications (one million items), and the Scientific Collection, with about 1.4 million books, mainly about the history, language and culture of the Netherlands. Each collection is described according to its own indexing system and conceptual vocabulary. The Scientific Collection was described using the GTT, a large vocabulary containing 35,000 general concepts ranging from Wolkenkrabbers (skyscrapers) to Verzorging (care). The books in the Deposit Collection are mainly indexed against the Brinkman thesaurus, which contains a large set of headings (more than 5,000) that were expected to serve as global subjects of books. For each concept, the thesauri provided the usual lexical and semantic information: preferred labels, synonyms and notes, broader and related concepts, etc. The language of both thesauri was Dutch, but a substantial part of the Brinkman concepts (around 60%) come with English labels.

The library track was difficult partly because of its relatively large size and partly because of its multilingual representation. Nevertheless, DSSim performed best of the three participating systems in this track. Because these ontologies contain related and broader terms, however, the mapping can be carried out without consulting multilingual background knowledge.

[Figure: Dssim-oaei2008-library.png]

Directory track at OAEI 2008

As stated in the original task definition [9] provided by the organizers of OAEI 2008, this track was designed to evaluate mapping quality in a real-world taxonomy integration scenario. The main objective is to measure whether ontology alignment tools can be applied effectively to the integration of "shallow ontologies". The evaluation dataset was extracted from the Google, Yahoo! and Looksmart web directories. The ontology pairs were created by relying on a reference interpretation for nodes, constructed by looking at their use. The assumption was that the semantics of nodes could be derived from their pragmatics, namely from analysing which documents were classified under which nodes. The basic idea was therefore to compute the relationship hypotheses based on the co-occurrence of documents. The specific characteristics of the dataset were:

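| System | Precision | Recall | F-measure |
|--------|-----------|--------|-----------|
| DSSim | 0.60 | 0.41 | 0.49 |
| CIDER | 0.60 | 0.38 | 0.47 |
| Lily | 0.59 | 0.37 | 0.46 |
| TaxoMap [10] | 0.59 | 0.39 | 0.43 |
| MapPSO [11] | 0.57 | 0.31 | 0.40 |
| RiMOM [12] | 0.55 | 0.17 | 0.26 |
| ASMOV | 0.64 | 0.12 | 0.20 |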

Seven systems participated in the directory track in 2008, as listed in the table above. In terms of F-measure, DSSim performed best, although its margin over the CIDER and Lily systems was small.
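The F-measure combines precision and recall into a single score. The sketch below assumes the balanced F-measure (the harmonic mean of precision and recall); the function name `f_measure` and the `RESULTS` dictionary are illustrative only. Because the inputs are the rounded values from the table, recomputed scores can differ slightly from the published column.

```python
def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure: the harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0  # Avoid division by zero when both scores are zero.
    return 2 * precision * recall / (precision + recall)


# Rounded (precision, recall) pairs copied from the table above.
RESULTS = {
    "DSSim": (0.60, 0.41),
    "CIDER": (0.60, 0.38),
    "Lily": (0.59, 0.37),
}

if __name__ == "__main__":
    for system, (precision, recall) in RESULTS.items():
        print(f"{system}: {f_measure(precision, recall):.2f}")
```

A harmonic mean rewards balance: a system with high precision but low recall, such as ASMOV in the table, ends up with a low F-measure.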

References

  1. Miklos Nagy and Maria Vargas-Vera. Towards an Automatic Semantic Data Integration: Multi-Agent Framework Approach. Semantic Web. Gang Wu (ed), Chapter 7, pp. 107-134; In-Tech Education and Publishing KG; 2010, ISBN 978-953-7619-54-1.
  2. Nagy, Miklos; Vargas-Vera, Maria (2011). "Multiagent Ontology Mapping Framework for the Semantic Web". IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans. Institute of Electrical and Electronics Engineers (IEEE). 41 (4): 693–704. doi:10.1109/tsmca.2011.2132704. ISSN 1083-4427. S2CID 206792175.
  3. Miklos Nagy, Maria Vargas-Vera and Enrico Motta. Multi-agent Ontology Mapping Framework in the AQUA Question Answering System. the Fourth International Mexican Conference on Artificial Intelligence (MICAI-2005), Lecture Notes in Artificial Intelligence LNAI 3789, Gelbukh, A de Albornoz and H. Terashima (Eds), pp. 70-79, Monterrey Mexico, November 14–18, 2005.
  4. Jérôme Euzenat, Malgorzata Mochol, Pavel Shvaiko, Heiner Stuckenschmidt, Ondrej Sváb, Vojtech Svátek, Willem Robert van Hage, Mikalai Yatskevich: Results of the Ontology Alignment Evaluation Initiative 2006. In Proceedings of the 1st International Workshop on Ontology Matching, ISWC 2006
  5. Jérôme Euzenat, Antoine Isaac, Christian Meilicke, Pavel Shvaiko, Heiner Stuckenschmidt, Ondřej Šváb, Vojtěch Svátek, Willem Robert van Hage, Mikalai Yatskevich (2007). "Results of the Ontology Alignment Evaluation Initiative 2007" (PDF). Proceedings of the 2nd International Workshop on Ontology Matching, ISWC 2007.
  6. Caterina Caracciolo, Jérôme Euzenat, Laura Hollink, Ryutaro Ichise, Antoine Isaac, Véronique Malaisé, Christian Meilicke, Juan Pane, Pavel Shvaiko, Heiner Stuckenschmidt, Ondřej Šváb-Zamazal, and Vojtěch Svátek: Results of the Ontology Alignment Evaluation Initiative 2008. In Proceedings of the 3rd International Workshop on Ontology Matching, ISWC 2008
  7. Jérôme Euzenat, Alfio Ferrara, Laura Hollink, Antoine Isaac, Cliff Joslyn, Véronique Malaisé, Christian Meilicke, Andriy Nikolov, Juan Pane, Marta Sabou, François Scharffe, Pavel Shvaiko, Vassilis Spiliopoulos, Heiner Stuckenschmidt, Ondřej Šváb-Zamazal, Vojtěch Svátek, Cássia Trojahn dos Santos, George Vouros and Shenghui Wang: Results of the Ontology Alignment Evaluation Initiative 2009. In Proceedings of the 4th International Workshop on Ontology Matching, ISWC 2009
  8. Ontology Alignment Evaluation Initiative::Library Track
  9. Ontology Alignment Evaluation Initiative::Track name
  10. http://sunsite.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-431/oaei08_paper12.pdf
  11. "MapPSO". 30 April 2013.
  12. Juanzi Li; Jie Tang; Yi Li; Qiong Luo (2019-04-16). "RiMOM: A Dynamic Multistrategy Ontology Alignment Framework". IEEE Transactions on Knowledge and Data Engineering. 21 (8): 1218–1232. CiteSeerX 10.1.1.649.4619. doi:10.1109/TKDE.2008.202. S2CID 4332951.