Open Semantic Framework

Developer(s): Structured Dynamics
Initial release: June 2009
Stable release: OSF v 3.4 / March 2016
Operating system: Platform independent
License: Apache 2
Website: opensemanticframework.org

The Open Semantic Framework (OSF) is an integrated software stack using semantic technologies for knowledge management. [1] It has a layered architecture that combines existing open source software with additional open source components developed specifically to provide a complete Web application framework. OSF is made available under the Apache 2 license.

OSF is a platform-independent Web services framework for accessing and exposing structured data, semi-structured data, and unstructured data using ontologies to reconcile semantic heterogeneities within the contributing data and schema. Internal to OSF, all data is converted to RDF to provide a common data model. The OWL 2 ontology language is used to describe the data schema overlaying all of the constituent data sources.

The architecture of OSF is built around a central layer of RESTful web services, designed so that most constituent modules within the software stack can be substituted without major adverse impacts on the rest of the stack. A central organizing perspective of OSF is that of the dataset. Datasets contain the records in any given OSF instance. One or more domain ontologies are used by a given OSF instance to define the structural relationships amongst the data and their attributes and concepts.
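The substitutability described above can be sketched in a few lines. This is a minimal illustration, not OSF code: the `SearchEngine` contract, the two engine classes, and the `SearchService` wrapper are all hypothetical names invented for the example. The point is only that a service layer written against a narrow interface does not change when the engine behind it is swapped.

```python
from typing import Dict, List, Protocol, Set


class SearchEngine(Protocol):
    """Hypothetical contract the service layer relies on."""

    def index(self, record_id: str, text: str) -> None: ...

    def search(self, term: str) -> List[str]: ...


class SubstringEngine:
    """Engine A: matches a term anywhere inside the stored text."""

    def __init__(self) -> None:
        self._docs: Dict[str, str] = {}

    def index(self, record_id: str, text: str) -> None:
        self._docs[record_id] = text.lower()

    def search(self, term: str) -> List[str]:
        return sorted(r for r, t in self._docs.items() if term.lower() in t)


class TokenEngine:
    """Engine B: matches whole whitespace-separated tokens only."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = {}

    def index(self, record_id: str, text: str) -> None:
        for token in text.lower().split():
            self._postings.setdefault(token, set()).add(record_id)

    def search(self, term: str) -> List[str]:
        return sorted(self._postings.get(term.lower(), set()))


class SearchService:
    """Service layer: depends on the contract, not on a concrete engine."""

    def __init__(self, engine: SearchEngine) -> None:
        self._engine = engine

    def put(self, record_id: str, text: str) -> dict:
        self._engine.index(record_id, text)
        return {"status": 200}

    def search(self, term: str) -> dict:
        return {"status": 200, "results": self._engine.search(term)}


# The calling code is identical for both engines.
for engine in (SubstringEngine(), TokenEngine()):
    service = SearchService(engine)
    service.put("r1", "Water quality report")
    print(type(engine).__name__, service.search("water"))
```

Both engines answer the whole-word query the same way; they differ only on partial terms, which is the kind of behavioural difference a swap can introduce even when the interface is unchanged.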

Applications of OSF include local government, [2] health information systems, [3] community indicator systems, [4] eLearning, [5] citizen engagement, [6] and any other domain that may be modeled by ontologies.

Documentation and training videos are provided with the open-source OSF application.

History

Early components of OSF were provided under the names structWSF and conStruct starting in June 2009. [7] The first version 1.x of OSF was announced in August 2010. The first automated OSF installer was released in March 2012. [8] OSF was expanded with an ontology manager, structOntology, in August 2012. [9] The version 2.x developments of OSF were carried out for enterprise sponsors from early 2012 until the end of 2013. None of these interim 2.x versions was released to the public. At the conclusion of this period, Structured Dynamics, the main developer of OSF, refactored these enterprise-specific developments to leapfrog to a new version 3.0 of OSF, announced in early 2014. [10] These public releases were last updated to OSF version 3.4.0 in March 2016. [11]

Architecture and technologies

Figure: OSF simple stack architecture

The Open Semantic Framework has a basic three-layer architecture. User interactions and content management are provided by an external content management system (CMS); this is currently Drupal, although OSF does not depend on it. This layer accesses the pivotal OSF Web Services, of which there are now more than 20, providing OSF's distributed computing functionality. Full CRUD access, user permissions, and security are provided for all digital objects in the stack. This middleware layer in turn provides access to the third layer, the engines and indexers that drive the entire stack. Both the top CMS layer and the engines layer are provided by existing off-the-shelf software. What makes OSF a complete stack are the connecting scripts and the intermediate Web services layer.

The OSF stack is premised on the RDF data model. RDF provides the means for integrating existing structured data assets in any format with semi-structured data like XML and HTML, and with unstructured documents or text. OSF is made operational via ontologies that capture the domain or knowledge space, matched with internal ontologies that guide OSF operations and data display. This design approach is known as ODapps, for ontology-driven applications. [1]

Content management layer

OSF delegates all direct user interactions and standard content management to an external CMS. With Drupal the integration is tighter: [12] OSF supplies connectors and modules that can replace standard Drupal storage and databases with OSF triplestores. [13]

Web services layer

The intermediate OSF Web Services layer may be accessed directly via its API, from the command line, or with utilities like cURL, which makes it suitable for interfacing with standard content management systems (CMSs). It may also be accessed via a dedicated suite of connectors and modules that leverage the open source Drupal CMS. These connectors and modules, also part of the standard OSF stack and called OSF for Drupal, let Drupal's thousands of existing modules and its ecosystem of developers access OSF using familiar Drupal methods. [12]

The OSF middleware framework is generally RESTful in design and is based on HTTP, Web protocols, and W3C open standards. The initial OSF framework comes packaged with a baseline set of more than 20 Web services covering CRUD, browse, search, tagging, ontology management, and export and import. All Web services are exposed via APIs and SPARQL endpoints. Each request to an individual Web service returns an HTTP status and, optionally, a document of resultsets. Each results document can be serialized in several ways, and may be expressed as RDF, pure XML, JSON, or other formats. [citation needed]
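The request pattern described above (an HTTP call, a status code, and an optional results document in a chosen serialization) can be sketched as follows. This is only an illustration under stated assumptions: the base URL, the `/search/` path, and the `query` and `datasets` parameter names are placeholders chosen for the example rather than documented OSF endpoints, and the request is built but never sent.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Hypothetical values: base URL, path, and parameter names are illustrative only.
BASE_URL = "http://localhost/ws"


def build_search_request(query: str, dataset: str,
                         mime: str = "application/json") -> Request:
    """Build a GET request whose Accept header selects the result serialization."""
    params = urlencode({"query": query, "datasets": dataset})
    url = f"{BASE_URL}/search/?{params}"
    return Request(url, headers={"Accept": mime}, method="GET")


def describe(status: int, body: bytes) -> str:
    """Summarize a response: a status always, a resultset document only sometimes."""
    if 200 <= status < 300:
        return f"ok ({len(body)} bytes of results)" if body else "ok (no resultset)"
    return f"error {status}"


request = build_search_request("water quality", "http://example.org/datasets/demo/")
print(request.get_method(), request.full_url)
print("Accept:", request.get_header("Accept"))
```

Passing a different `mime` value, such as `application/rdf+xml`, is how a client would ask the same service for another serialization, assuming the service honours content negotiation through the `Accept` header.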

Engines layer

The engines layer represents the major workflow requirements and the data management and indexing of the system. The premise of the Open Semantic Framework is based on the RDF data model. Using a common data model means that all Web services and actions against the data need to be programmed against only a single, canonical form. Simple converters translate external, native data formats to the RDF form at ingest time; similar converters can translate the internal RDF form back into native forms for export (or for use by external applications). This use of a canonical form leads to a simpler design at the core of the stack and a uniform basis against which tools or other work activities can be written. [original research?]
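The converter idea above can be made concrete with a small sketch. It is a simplified illustration, not OSF's converter code: the `http://example.org/` namespace, the function names, and the flat dictionary used as the "native" format are all assumptions made for the example, and every value is treated as a plain string literal.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]  # (subject IRI, predicate IRI, literal value)

# Hypothetical namespace used only for this illustration.
EX = "http://example.org/"


def record_to_triples(record_id: str, record: Dict[str, str]) -> List[Triple]:
    """Ingest converter: turn a flat native record into RDF-style triples."""
    subject = f"{EX}record/{record_id}"
    return [(subject, f"{EX}prop/{key}", value)
            for key, value in sorted(record.items())]


def triples_to_record(triples: List[Triple]) -> Dict[str, str]:
    """Export converter: rebuild the flat record from triples about one subject."""
    return {predicate.rsplit("/", 1)[1]: obj for _, predicate, obj in triples}


def to_ntriples(triples: List[Triple]) -> str:
    """Serialize the triples as N-Triples lines with plain string literals."""
    def escape(text: str) -> str:
        return (text.replace("\\", "\\\\")
                    .replace('"', '\\"')
                    .replace("\n", "\\n"))
    return "\n".join(f'<{s}> <{p}> "{escape(o)}" .' for s, p, o in triples)


native = {"name": "Riverbend", "population": "4200"}
triples = record_to_triples("n1", native)
print(to_ntriples(triples))
print(triples_to_record(triples) == native)  # the round trip preserves the record
```

Because every format is converted to and from the same triple form, adding a new native format means writing one pair of converters rather than touching each service.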

The OSF engines are all open source and work to support this premise. The OSF engines layer governs the indexing and management of all OSF content. Documents are indexed by the Solr [14] engine for full-text search, while information about their structural characteristics and metadata is stored in an RDF triplestore provided by OpenLink's Virtuoso software. [15] The schema aspects of the information (the "ontologies") are separately managed and manipulated with the OWL API, [16] a Java API for the W3C-standard OWL language. At ingest time, the system automatically routes and indexes the content into its appropriate stores. Another engine, GATE (General Architecture for Text Engineering), [17] provides semi-automatic assistance in tagging input information and in other natural language processing (NLP) tasks.
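The ingest-time routing described above can be sketched with in-memory stand-ins. Nothing here calls Solr or Virtuoso: `FullTextIndex`, `TripleStore`, and `ingest` are hypothetical names for this example, and the toy classes only mimic the division of labour, with document text going to a full-text index and metadata going to a triple store.

```python
from collections import defaultdict
from typing import Dict, List, Set, Tuple

Triple = Tuple[str, str, str]


class FullTextIndex:
    """In-memory stand-in for a full-text engine such as Solr."""

    def __init__(self) -> None:
        self._postings: Dict[str, Set[str]] = defaultdict(set)

    def add(self, doc_id: str, text: str) -> None:
        for token in text.lower().split():
            self._postings[token].add(doc_id)

    def search(self, term: str) -> Set[str]:
        return set(self._postings.get(term.lower(), set()))


class TripleStore:
    """In-memory stand-in for an RDF triplestore such as Virtuoso."""

    def __init__(self) -> None:
        self.triples: List[Triple] = []

    def add(self, triple: Triple) -> None:
        self.triples.append(triple)

    def objects(self, subject: str, predicate: str) -> List[str]:
        return [o for s, p, o in self.triples if s == subject and p == predicate]


def ingest(doc_id: str, text: str, metadata: Dict[str, str],
           index: FullTextIndex, store: TripleStore) -> None:
    """Route one document: its text to the index, its metadata to the store."""
    index.add(doc_id, text)
    for predicate, value in metadata.items():
        store.add((doc_id, predicate, value))


index, store = FullTextIndex(), TripleStore()
ingest("doc:1", "Neighbourhood water quality report",
       {"dc:creator": "City", "dc:type": "Report"}, index, store)
print(index.search("water"))
print(store.objects("doc:1", "dc:creator"))
```

A query can then combine both stores, for example by using the index to find matching document identifiers and the triple store to fetch their metadata.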

Alternatives

OSF is sometimes referred to as a linked data application. [18] Alternative applications in this space include:

The Open Semantic Framework also has alternatives in the semantic publishing and semantic computing arenas.

References

  1. Michael K. Bergman (13 March 2014). OSF: An ontology-driven semantic platform for enterprises (PDF). 2014 Ontology Summit, Track B "Tools, Services, Techniques".
  2. "New website profiles neighbourhoods of Winnipeg". Winnipeg Free Press. 23 May 2013. Retrieved 30 September 2014.
  3. "HealthDirect Australia". Retrieved 30 September 2014.
  4. United Way of Winnipeg (2012). "PEG". Retrieved 30 September 2014.
  5. Richard Huber; Kirsten Hantelmann; Alexandru Todor; Sebastian Krebs; Ralf Heese; Adrian Paschke (2010). "Use of semantic technologies for the development of a dynamic trajectories generator in a semantic chemistry eLearning platform". arXiv:1012.1646 [cs.AI].
  6. Steven Ardire (27–28 October 2010). Using an open source semantic framework to create meaningful, interoperable information for better citizen engagement. The Government Open Source Conference, GOSCON 2010. Portland, Oregon: Oregon State University.
  7. "Structured data and web services framework for Drupal unveiled". Structured Dynamics. 16 June 2009. Retrieved 30 September 2014.
  8. Angela Guess (5 March 2012). "Open Semantic Framework installer released". SemanticWeb.com. Retrieved 30 September 2014.
  9. Angela Guess (3 August 2012). "Inside UMBEL: structOntology". SemanticWeb.com. Retrieved 30 September 2014.
  10. Angela Guess (21 January 2014). "SD unveils enterprise-ready version of the Open Semantic Framework". SemanticWeb.com. Retrieved 30 September 2014.
  11. Frédérick Giasson (4 March 2016). "OSF 3.4 Released: now easily deployable in CentOS 6 and 7". fgiasson.com. Retrieved 4 March 2016.
  12. "OSF for Drupal". 21 October 2013. Retrieved 30 September 2014.
  13. Frédérick Giasson (10 June 2013). "structFieldStorage: A new field storage system for Drupal". fgiasson.com. Retrieved 30 September 2014.
  14. David Smiley & Eric Pugh (20 November 2011). Apache Solr 3 enterprise search server (1st ed.). Packt Publishing. p. 418. ISBN 978-1-84951-606-8.
  15. OpenLink Software (11 April 2006). "Open source edition of OpenLink Virtuoso, unleashed!". Retrieved 3 February 2010.
  16. Matthew Horridge & Sean Bechhofer (2011). "The OWL API: A Java API for OWL ontologies". Semantic Web. Vol. 2, no. 1. pp. 11–21.
  17. H. Cunningham; D. Maynard; K. Bontcheva; V. Tablan (2002). GATE: A framework and graphical development environment for robust NLP tools and applications (PDF). Proceedings of the 40th Anniversary Meeting of the Association for Computational Linguistics, 2002.
  18. Vagner Nascimento & Daniel Schwabe (7–10 December 2003). Sören Auer; Oscar Diaz & George A. Papadopoulos (eds.). Semantic data driven interfaces for web applications. Web Engineering: 11th International Conference, ICWE 2011. Paphos, Cyprus: Springer Berlin Heidelberg. pp. 121–136. doi:10.1007/978-3-642-39200-9_5.
