Vocabulary-based transformation

Last updated

In metadata, a vocabulary-based transformation (VBT) is a transformation aided by the use of a semantic equivalence statements within a controlled vocabulary.

Contents

Many organizations today require communication between two or more computers. Although many standards exist to exchange data between computers such as HTML or email, there is still much structured information that needs to be exchanged between computers that is not standardized. The process of mapping one source of data into another is often a slow and labor-intensive process.

VBT is a possible way to avoid much of the time and cost of manual data mapping using traditional extract, transform, load technologies.

History

The term vocabulary-based transformation was first defined by Roy Shulte of the Gartner Group around May 2003 and appeared in annual "hype-cycle" for integration.

Application

VBT allows computer systems integrators to more automatically "look up" the definitions of data elements in a centralized data dictionary and use that definition and the equivalent mappings to transform that data element into a foreign namespace.

The Web Ontology Language (OWL) language also support three semantic equivalence statements.

Companies or products

See also

Related Research Articles

<span class="mw-page-title-main">Computer-aided software engineering</span>

Computer-aided software engineering (CASE) is the domain of software tools used to design and implement applications. CASE tools are similar to and were partly inspired by Computer-Aided Design (CAD) tools used for designing hardware products. CASE tools were used for developing high-quality, defect-free, and maintainable software. CASE software is often associated with methods for the development of information systems together with automated tools that can be used in the software development process.

<span class="mw-page-title-main">Metamodeling</span> Concept of software engineering

A metamodel or surrogate model is a model of a model, and metamodeling is the process of generating such metamodels. Thus metamodeling or meta-modeling is the analysis, construction and development of the frames, rules, constraints, models and theories applicable and useful for modeling a predefined class of problems. As its name implies, this concept applies the notions of meta- and modeling in software engineering and systems engineering. Metamodels are of many types and have diverse applications.

Controlled vocabularies provide a way to organize knowledge for subsequent retrieval. They are used in subject indexing schemes, subject headings, thesauri, taxonomies and other knowledge organization systems. Controlled vocabulary schemes mandate the use of predefined, preferred terms that have been preselected by the designers of the schemes, in contrast to natural language vocabularies, which have no such restriction.

In computing and data management, data mapping is the process of creating data element mappings between two distinct data models. Data mapping is used as a first step for a wide variety of data integration tasks, including:

A metadata registry is a central location in an organization where metadata definitions are stored and maintained in a controlled method.

The semantic spectrum is a series of increasingly precise or rather semantically expressive definitions for data elements in knowledge representations, especially for machine use.

Semantic translation is the process of using semantic information to aid in the translation of data in one representation or data model to another representation or data model. Semantic translation takes advantage of semantics that associate meaning with individual data elements in one dictionary to create an equivalent meaning in a second system.

In computer metadata, semantic equivalence is a declaration that two data elements from different vocabularies contain data that has similar meaning. There are three types of semantic equivalence statements:

Metadata publishing is the process of making metadata data elements available to external users, both people and machines using a formal review process and a commitment to change control processes.

A semantic mapper is tool or service that aids in the transformation of data elements from one namespace into another namespace. A semantic mapper is an essential component of a semantic broker and one tool that is enabled by the Semantic Web technologies.

Semantic integration is the process of interrelating information from diverse sources, for example calendars and to do lists, email archives, presence information, documents of all sorts, contacts, search results, and advertising and marketing relevance derived from them. In this regard, semantics focuses on the organization of and action upon information by acting as an intermediary between heterogeneous data sources, which may conflict not only by structure but also context or value.

Simple Knowledge Organization System (SKOS) is a W3C recommendation designed for representation of thesauri, classification schemes, taxonomies, subject-heading systems, or any other type of structured controlled vocabulary. SKOS is part of the Semantic Web family of standards built upon RDF and RDFS, and its main objective is to enable easy publication and use of such vocabularies as linked data.

Data exchange is the process of taking data structured under a source schema and transforming it into a target schema, so that the target data is an accurate representation of the source data. Data exchange allows data to be shared between different computer programs.

OneSource is an evolving data analysis tool used internally by the Air Combat Command (ACC) Vocabulary Services Team, and made available to general data management community. It is used by the greater US Department of Defense (DoD) and NATO community for controlled vocabulary management and exploration. It provides its users with a consistent view of syntactical, lexical, and semantic data vocabularies through a community-driven web environment. It was created with the intention of directly supporting the DoD Net-centric Data Strategy of visible, understandable, and accessible data assets.

The Governance Interoperability Framework (GIF) is an open, standards-based specification and set of technologies that describes and promotes interoperability among components of a service-oriented architecture (SOA). GIF integrates SOA ecosystem technologies to achieve heterogeneous service lifecycle governance and is supported by Hewlett-Packard Company and by GIF partners.

indecs was a project partly funded by the European Community Info 2000 initiative and by several organisations representing the music, rights, text publishing, authors, library and other sectors in 1998-2000, which has since been used in a number of metadata activities. A final report and related documents were published; the indecs Metadata Framework document is a concise summary.

Business semantics management (BSM) encompasses the technology, methodology, organization, and culture that brings business stakeholders together to collaboratively realize the reconciliation of their heterogeneous metadata; and consequently the application of the derived business semantics patterns to establish semantic alignment between the underlying data structures.

Model Driven Interoperability (MDI) is a methodological framework, which provides a conceptual and technical support to make interoperable enterprises using ontologies and semantic annotations, following model driven development (MDD) principles.

DataOps is a set of practices, processes and technologies that combines an integrated and process-oriented perspective on data with automation and methods from agile software engineering to improve quality, speed, and collaboration and promote a culture of continuous improvement in the area of data analytics. While DataOps began as a set of best practices, it has now matured to become a new and independent approach to data analytics. DataOps applies to the entire data lifecycle from data preparation to reporting, and recognizes the interconnected nature of the data analytics team and information technology operations.