Canonical schema pattern


In software engineering, Canonical Schema is a design pattern, applied within the service-orientation design paradigm, that aims to reduce the need for data model [1] transformation when services [2] exchange messages that reference the same data model. [3]


Rationale

Interaction between services often requires exchanging business documents. For a service consumer to send data related to a particular business entity (e.g. a purchase order), it needs to know the structure of that data, i.e. the data model. To this end, the service provider publishes the structure of the data that it expects in incoming messages from the service consumer. When services are implemented as web services, [4] this is the XML schema document. Once the service consumer knows the required data model, it can structure the data accordingly. However, the service consumer may already possess the required data for a particular business document in a form that does not conform to the data model specified by the service provider. This disparity between the data models means the message must be transformed into the structure dictated by the service provider. Furthermore, after processing the received business document, the service provider may send the processed document back to the service consumer, which must then perform a second data model transformation to convert the document back into the data model that it uses within its own logic.
This runtime data model transformation adds processing overhead and complicates the design of service compositions. [5] In order to avoid the need for data model transformation, the Canonical Schema pattern dictates the use of standardized data models for those business documents that are commonly processed by the services in a service inventory. [6] [7]
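
The round trip described above can be illustrated with a minimal Python sketch. The class names, fields, and the flat fee are hypothetical and chosen only for illustration; the point is that, without a shared model, each exchange needs one mapping step on the way out and another on the way back.

```python
from dataclasses import dataclass


@dataclass
class ConsumerOrder:
    """Hypothetical purchase order as the service consumer models it."""
    order_no: str
    total_cents: int


@dataclass
class ProviderOrder:
    """Hypothetical purchase order as the provider's contract expects it."""
    id: str
    amount: float  # currency units rather than cents


def to_provider(order: ConsumerOrder) -> ProviderOrder:
    # Transformation required before the request can be sent.
    return ProviderOrder(id=order.order_no, amount=order.total_cents / 100)


def to_consumer(order: ProviderOrder) -> ConsumerOrder:
    # Transformation required again when the processed document returns.
    return ConsumerOrder(order_no=order.id, total_cents=round(order.amount * 100))


def provider_process(order: ProviderOrder) -> ProviderOrder:
    # Stand-in for the provider's processing: adds a flat 5.00 fee.
    return ProviderOrder(id=order.id, amount=order.amount + 5.0)


request = ConsumerOrder(order_no="PO-1", total_cents=1250)
response = to_consumer(provider_process(to_provider(request)))
print(response)
```

Both mapping functions exist only because the two sides disagree on field names and units; they carry no business logic of their own.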

Usage

Diagram A: Service A uses a different data model from Service B for the same business document. When messages are exchanged, runtime data model transformation needs to be performed.
Diagram B: Both services use the same data model to represent a particular business document. As a result, no data model transformation is required when messages are exchanged.

This design pattern is supported by the application of the Standardized Service Contract design principle, which advocates that service contracts be based on standardized data models. This is achieved by analyzing the service inventory blueprint [8] to identify the business documents that are commonly exchanged between services. These business documents are then modeled in a standardized manner. For example, in the case of web services, the business documents are modeled as XML schemas. Once a standardized data representation layer exists in a service inventory, different service contracts can use the same data models whenever they need to exchange the same business documents. This removes the need for data model transformation between those services, along with the processing overhead associated with it. It also increases the reusability potential of a service, as the service can now be consumed without any custom data model transformation logic. In this way, the application of the Canonical Schema pattern reduces the need to apply the Data Model Transformation [9] design pattern.
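
As a minimal sketch of the idea, assume a hypothetical canonical `PurchaseOrder` model that two hypothetical service contracts both reference. None of these names come from the pattern itself; they only show that a document can pass from one service to the other with no mapping step in between.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PurchaseOrder:
    """Hypothetical canonical model, defined once for the service inventory."""
    order_id: str
    total_cents: int


def order_service_submit(order: PurchaseOrder) -> PurchaseOrder:
    # Service A's contract accepts and returns the canonical model.
    return order


def billing_service_add_fee(order: PurchaseOrder, fee_cents: int) -> PurchaseOrder:
    # Service B's contract uses the same model, so no mapping is needed.
    return PurchaseOrder(order_id=order.order_id,
                         total_cents=order.total_cents + fee_cents)


submitted = order_service_submit(PurchaseOrder(order_id="PO-1", total_cents=1250))
billed = billing_service_add_fee(submitted, fee_cents=500)
print(billed)
```

The output of one service is passed directly as the input of the other, which corresponds to the situation shown in Diagram B.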

Considerations

The application of this design pattern requires design standards [10] that make the use of standardized data models mandatory, as the mere creation of data models does not guarantee their use. [11] Although simple in principle, this is difficult to enforce, because it needs commitment from different project teams, each of which may have to spend extra effort designing solutions that accommodate the standardized data models.
On some occasions, either because of the sheer size of the organization or because of resistance from different segments of the enterprise, the Canonical Schema design pattern may need to be applied within a particular domain inventory, created by the application of the Domain Inventory design pattern. [7]
The schemas need to be designed separately from the service contracts so that there is no dependency between them. [11]
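
One way to picture these considerations is a simple check over a registry of contracts. The following is only a sketch under stated assumptions: the schema names, the contract names, and the `non_canonical_usage` helper are all hypothetical. It keeps the canonical schemas in their own registry, independent of any contract, and reports contracts that reference message types outside that registry.

```python
# Canonical schemas are registered on their own, independent of any contract.
CANONICAL_SCHEMAS = {"PurchaseOrder", "Invoice"}

# Contracts only name the message types they use (all names hypothetical).
SERVICE_CONTRACTS = {
    "OrderService": {"PurchaseOrder"},
    "BillingService": {"PurchaseOrder", "Invoice"},
    "LegacyShipping": {"ShipmentV1"},
}


def non_canonical_usage(contracts, canonical):
    """Return, per contract, the message types that are not canonical."""
    return {
        name: sorted(types - canonical)
        for name, types in contracts.items()
        if types - canonical
    }


violations = non_canonical_usage(SERVICE_CONTRACTS, CANONICAL_SCHEMAS)
print(violations)
```

A check of this kind could run as part of a design review, although the pattern itself does not prescribe any particular enforcement mechanism.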

References

  1. The structure of the data. For example, in a database, the structure of the data contained in a table is represented by the table schema. For XML-based documents, the corresponding XML schema document describes the structure of the XML document.
  2. "Services". Archived from the original on 2012-05-01. Retrieved 2010-03-17.
  3. Mauro et al. Service Oriented Device Integration - An Analysis of SOA Design Patterns. 43rd Hawaii International Conference on System Sciences, 2010, pp. 1-10. Archived 2010-03-28. Retrieved 2010-04-30.
  4. A service can be implemented using any technology as long as it conforms to the service-orientation guidelines.
  5. "Service Compositions". Archived from the original on 2010-03-11. Retrieved 2010-03-17.
  6. "Service Inventory". Archived from the original on 2010-03-13. Retrieved 2010-03-17.
  7. Thomas Erl, Herbjörn Wilhelmsen. Canonical Schema Design Pattern. Retrieved 2010-04-08.
  8. "Service Inventory Blueprint". Archived from the original on 2010-05-11. Retrieved 2010-03-17.
  9. "Data Model Transformation". Archived from the original on 2010-02-13. Retrieved 2010-03-17.
  10. "Design Standards". Archived from the original on 2010-03-17. Retrieved 2010-03-17.
  11. Eben Hewitt. Java SOA Cookbook. p. 50. Retrieved 2010-04-25.