Reference data

Reference data is data used to classify or categorize other data. [1] Typically, it is static or changes only slowly over time.

Examples of reference data include:

Reference data sets are sometimes called "controlled vocabularies" [2] or "lookup" data. [3]

Reference data differs from master data. Both provide context for business transactions, but reference data is concerned with classification and categorisation, whereas master data is concerned with business entities. [4] A further difference is that a change to reference data values may require an associated change in business processes, whereas a change in master data is always managed as part of existing business processes. For example, adding a new customer or sales product is part of the standard business process. However, adding a new product classification (e.g. "restricted sales item") or a new customer type (e.g. "gold level customer") requires modifying the business processes that manage those items.
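
The distinction can be illustrated with a minimal Python sketch. All names here (CUSTOMER_TYPES, DISCOUNT_BY_TYPE, Customer, add_customer, discount_for) and the discount values are hypothetical and chosen only for illustration; they do not come from any particular system.

```python
from dataclasses import dataclass

# Reference data (hypothetical): a small, slowly changing code list
# used to classify other data.
CUSTOMER_TYPES = {
    "STD": "Standard customer",
    "GOLD": "Gold level customer",
}

# Business process rule keyed by the reference data codes (hypothetical values).
DISCOUNT_BY_TYPE = {"STD": 0.0, "GOLD": 0.10}


@dataclass(frozen=True)
class Customer:
    """Master data: a business entity classified by a reference data code."""
    customer_id: int
    name: str
    customer_type: str  # expected to be a key of CUSTOMER_TYPES


def add_customer(customers, customer):
    """Routine master data change: register a customer with a known type code."""
    if customer.customer_type not in CUSTOMER_TYPES:
        raise ValueError(f"unknown customer type: {customer.customer_type!r}")
    customers[customer.customer_id] = customer


def discount_for(customer):
    """Process step that depends on the customer's classification."""
    return DISCOUNT_BY_TYPE[customer.customer_type]


customers = {}
add_customer(customers, Customer(1, "Acme Ltd", "GOLD"))
add_customer(customers, Customer(2, "Bolt Inc", "STD"))
```

In this sketch, adding another customer only calls add_customer, whereas introducing a new customer type would mean extending both CUSTOMER_TYPES and the rule in DISCOUNT_BY_TYPE, which mirrors the process change described above.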

Externally-defined reference data

In most organisations, most or all reference data is defined and managed internally. Some reference data, however, is defined and managed externally, for example by standards organisations. [5] One example of externally-defined reference data is the set of country codes defined in ISO 3166-1. [6] [7]
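
As a rough sketch of how an application might consume externally-defined reference data, the Python snippet below looks up a country by its ISO 3166-1 alpha-2 code. The table is a deliberately tiny, hand-picked subset and the names ISO_3166_1_ALPHA2_SUBSET and country_name are hypothetical; a real system would load the complete, current code list from an authoritative source rather than hard-coding it.

```python
# Illustrative subset of ISO 3166-1 alpha-2 codes (not the full standard).
ISO_3166_1_ALPHA2_SUBSET = {
    "DE": "Germany",
    "FR": "France",
    "JP": "Japan",
    "US": "United States of America",
}


def country_name(code):
    """Return the country name for an alpha-2 code, ignoring case and padding."""
    key = code.strip().upper()
    if key not in ISO_3166_1_ALPHA2_SUBSET:
        raise KeyError(f"code not in the illustrative subset: {code!r}")
    return ISO_3166_1_ALPHA2_SUBSET[key]
```

Because the codes are owned by the standards body, the application only reads the table and never invents new entries of its own.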

Reference data management

Curating and managing reference data is key to ensuring its quality and thus its fitness for purpose. Every part of an organisation, operational and analytical, depends heavily on the quality of its reference data. Without consistency across business processes or applications, for example, similar things may be described in quite different ways. Reference data gains in value when it is widely reused and widely referenced.

Examples of good practice in reference data management include:

  1. Formalize reference data management
  2. Use external reference data as much as possible
  3. Govern the reference data specific to your enterprise
  4. Manage reference data at enterprise level
  5. Version control your reference data [8]
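
The last practice, version control, can be sketched in Python as follows. This is one possible approach under stated assumptions, not a method taken from the cited source: each published version of a code list is kept as an immutable record, and two versions are compared to see what changed. The names CodeListVersion and diff_versions, and the sample product classifications, are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CodeListVersion:
    """One published version of a reference data set (hypothetical structure)."""
    name: str
    version: int
    codes: dict  # code -> description


def diff_versions(old, new):
    """Report which codes were added, removed, or re-described between versions."""
    old_keys, new_keys = old.codes.keys(), new.codes.keys()
    return {
        "added": sorted(new_keys - old_keys),
        "removed": sorted(old_keys - new_keys),
        "changed": sorted(
            k for k in old_keys & new_keys if old.codes[k] != new.codes[k]
        ),
    }


v1 = CodeListVersion("product_classification", 1, {"GEN": "General sales item"})
v2 = CodeListVersion(
    "product_classification",
    2,
    {"GEN": "General sales item", "RST": "Restricted sales item"},
)
changes = diff_versions(v1, v2)
```

Keeping earlier versions available lets data classified under an older code list still be interpreted, and the diff gives reviewers a concrete list of changes to approve.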

References

  1. DAMA-DMBOK: Data Management Body of Knowledge (2nd ed.). Data Management Association. 2017. ISBN 978-1634622349.
  2. "Multilingual reference data". EU Open Data Portal. European Commission. Retrieved 2020-06-07.
  3. "Using reference data for lookups in Stream Analytics". Microsoft. Retrieved 2020-06-07.
  4. DAMA-DMBOK: Data Management Body of Knowledge (2nd ed.). Data Management Association. 2017. ISBN 978-1634622349.
  5. Chisholm, Malcolm. "The Foundations of Successful Reference Data Management" (PDF). TopQuadrant. Retrieved 2020-06-07.
  6. "IBM Redbooks | Reference Data Management". www.redbooks.ibm.com. 2013-05-16. Retrieved 2015-12-09.
  7. "Reference Data Management and Master Data: Are they Related? (Oracle Master Data Management)". blogs.oracle.com. Archived from the original on 2015-10-11. Retrieved 2015-12-09.
  8. "5 best practices for managing reference data - LightsOnData". LightsOnData. 2018-07-25. Retrieved 2018-08-17.
