Master data management

Master data management (MDM) is a technology-enabled discipline in which business and information technology work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets.[1][2]

Drivers for master data management

Organisations, or groups of organisations, may establish the need for master data management when they hold more than one copy of data about a business entity. Holding more than one copy of this master data inherently means that there is an inefficiency in maintaining a "single version of the truth" across all copies. Unless people, processes and technology are in place to ensure that the data values are kept aligned across all copies, it is almost inevitable that different versions of information about a business entity will be held. This causes inefficiencies in operational data use, and hinders the ability of organisations to report and analyze. At a basic level, master data management seeks to ensure that an organization does not use multiple (potentially inconsistent) versions of the same master data in different parts of its operations, which can occur in large organizations.

Other problems include issues with data quality, consistent classification and identification of data, and data reconciliation. Master data management of disparate data systems requires data transformations: data extracted from each source system is transformed and loaded into the master data management hub and, to keep the sources synchronized, managed master data extracted from the hub is transformed and loaded back into each source system as the master data is updated. As with other extract, transform, load (ETL) based data movement, these processes are expensive and inefficient to develop and maintain, which greatly reduces the return on investment for the master data management product.
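This round-trip transformation can be made concrete with a minimal sketch (Python is used here; the field names and mapping rules are hypothetical, not those of any particular MDM product):

```python
# Hub-and-spoke transformations between a source system and an MDM hub.
# All schemas and rules below are illustrative assumptions.

def to_hub(source_record: dict) -> dict:
    """Extract-transform step: map a source record into the hub schema."""
    return {
        "customer_id": source_record["cust_no"],
        "full_name": f'{source_record["first"]} {source_record["last"]}'.strip(),
        "country": source_record.get("ctry", "").upper(),  # normalise country code
    }

def from_hub(hub_record: dict) -> dict:
    """Reverse step: push managed master data back into the source schema."""
    first, _, last = hub_record["full_name"].partition(" ")
    return {
        "cust_no": hub_record["customer_id"],
        "first": first,
        "last": last,
        "ctry": hub_record["country"],
    }

source = {"cust_no": "C042", "first": "Ada", "last": "Lovelace", "ctry": "gb"}
hub = to_hub(source)                   # load into the MDM hub
assert from_hub(hub)["ctry"] == "GB"   # synchronised back, normalised
```

Every source system needs its own pair of such mappings, which is precisely the development and maintenance cost noted above.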

There are a number of root causes for master data issues in organisations. These include:

  1. Business unit and product line segmentation
  2. Mergers and acquisitions

Business unit and product line segmentation

As a result of business unit and product line segmentation, the same business entity (such as a customer, supplier or product) will be serviced by different product lines, and redundant data about the entity will be entered in order to process transactions. The redundancy of business entity data is compounded in the front- to back-office life cycle, where an authoritative single source for party, account and product data is needed but the data is often once again redundantly entered or augmented.

A typical example is the scenario of a bank at which a customer has taken out a mortgage and the bank begins to send mortgage solicitations to that customer, ignoring the fact that the person already has a mortgage account relationship with the bank. This happens because the customer information used by the marketing section within the bank lacks integration with the customer information used by the customer services section of the bank. Thus the two groups remain unaware that an existing customer is also considered a sales lead. The process of record linkage is used to associate different records that correspond to the same entity, in this case the same person.
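Record linkage can be sketched minimally with heuristic string similarity from the Python standard library (production systems typically use probabilistic or machine-learning matching; the fields compared and the 0.85 threshold are illustrative assumptions):

```python
from difflib import SequenceMatcher

def normalise(name: str) -> str:
    """Lower-case and collapse whitespace before comparison."""
    return " ".join(name.lower().split())

def likely_same_person(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Heuristic match: identical postcode and highly similar names."""
    if a["postcode"] != b["postcode"]:
        return False
    ratio = SequenceMatcher(None, normalise(a["name"]), normalise(b["name"])).ratio()
    return ratio >= threshold

crm_record = {"name": "Jane  Q. Smith", "postcode": "SW1A 1AA"}
mortgage_record = {"name": "jane q smith", "postcode": "SW1A 1AA"}
print(likely_same_person(crm_record, mortgage_record))  # True
```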

Mergers and acquisitions

One of the most common reasons some large corporations experience massive issues with master data management is growth through mergers or acquisitions. Any organizations which merge will typically create an entity with duplicate master data (since each likely had at least one master database of its own prior to the merger). Ideally, database administrators resolve this problem through deduplication of the master data as part of the merger. In practice, however, reconciling several master data systems can present difficulties because of the dependencies that existing applications have on the master databases. As a result, more often than not the two systems do not fully merge, but remain separate, with a special reconciliation process defined that ensures consistency between the data stored in the two systems. Over time, however, as further mergers and acquisitions occur, the problem multiplies, more and more master databases appear, and data-reconciliation processes become extremely complex, and consequently unmanageable and unreliable. Because of this trend, one can find organizations with 10, 15, or even as many as 100 separate, poorly integrated master databases, which can cause serious operational problems in the areas of customer satisfaction, operational efficiency, decision support, and regulatory compliance.

Another problem concerns determining the proper degree of detail and normalization to include in the master data schema. For example, in a federated HR environment, the enterprise may focus on storing people data as a current status, adding a few fields to identify date of hire, date of last promotion, and so on. However, this simplification can introduce business-impacting errors into dependent systems for planning and forecasting. The stakeholders of such systems may be forced to build a parallel network of new interfaces to track the onboarding of new hires, planned retirements and divestments, which works against one of the aims of master data management.
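The trade-off can be illustrated by contrasting the two schema choices in a short sketch (the entity and field names are hypothetical):

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class EmployeeCurrent:
    """'Current status' master record: simple, but history is lost."""
    employee_id: str
    grade: str
    date_of_hire: date

@dataclass
class EmployeeHistoryRow:
    """Effective-dated master record: each change adds a new row."""
    employee_id: str
    grade: str
    valid_from: date
    valid_to: Optional[date]  # None means the row is still current

history = [
    EmployeeHistoryRow("E1", "analyst", date(2019, 3, 1), date(2022, 6, 30)),
    EmployeeHistoryRow("E1", "manager", date(2022, 7, 1), None),
]
# A forecasting system can now answer "what was E1's grade on 2021-01-01?",
# a question the current-status schema cannot answer.
```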

People, process and technology

Master data management is enabled by technology, but it is more than the technologies that enable it. An organisation's master data management capability also includes people and processes in its definition.

People

Several roles should be staffed within MDM, most prominently the Data Owner and the Data Steward. Several people are likely to be allocated to each role, each person responsible for a subset of master data (e.g. one data owner for employee master data, another for customer master data).

The Data Owner is responsible for the requirements for data quality, data security and so on, as well as for compliance with data governance and data management procedures. The Data Owner should also fund improvement projects in case of deviations from the requirements.

The Data Steward runs master data management on behalf of the Data Owner, and will probably also act as an advisor to the Data Owner.

Process

Master data management can be viewed as a "discipline for specialized quality improvement"[3] defined by the policies and procedures put in place by a data governance organization. Its objective is to provide processes for collecting, aggregating, matching, consolidating, quality-assuring, persisting and distributing master data throughout an organization, to ensure a common understanding, consistency, accuracy and control [4] in the ongoing maintenance and application use of that data.

Processes commonly seen in master data management include source identification, data collection, data transformation, normalization, rule administration, error detection and correction, data consolidation, data storage, data distribution, data classification, taxonomy services, item master creation, schema mapping, product codification, data enrichment, hierarchy management, business semantics management and data governance.
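The normalization and rule-administration steps can be sketched as a configurable pipeline of cleansing rules applied to each record (the three rules shown are illustrative, not a standard rule set):

```python
import re

# Named, ordered cleansing rules; administering rules means editing this list.
RULES = [
    ("trim whitespace",   lambda r: {**r, "name": r["name"].strip()}),
    ("title-case name",   lambda r: {**r, "name": r["name"].title()}),
    ("digits-only phone", lambda r: {**r, "phone": re.sub(r"\D", "", r["phone"])}),
]

def normalise(record: dict) -> dict:
    """Apply every rule in order and return the cleansed record."""
    for _name, rule in RULES:
        record = rule(record)
    return record

print(normalise({"name": "  ada lovelace ", "phone": "+44 (0)20 7946 0000"}))
# {'name': 'Ada Lovelace', 'phone': '4402079460000'}
```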

Technology

A master data management tool can be used to support master data management by removing duplicates, standardizing data (mass maintaining),[5] and incorporating rules to prevent incorrect data from entering the system, in order to create an authoritative source of master data. Master data are the products, accounts and parties for which business transactions are completed.

Where the technology approach produces a "golden record" or relies on a "source of record" or "system of record", it is common to talk of where the data is "mastered". This is accepted terminology in the information technology industry, but care should be taken, both with specialists and with the wider stakeholder community, to avoid confusing the concept of "master data" with that of "mastering data".
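Producing a "golden record" is commonly implemented with survivorship rules: when duplicates are merged, each attribute survives from the most trusted source that supplies a value. A minimal sketch, with an assumed source ranking:

```python
SOURCE_PRIORITY = ["crm", "billing", "web_signup"]  # most trusted first (assumed)

def golden_record(duplicates: list) -> dict:
    """Merge duplicates attribute by attribute, preferring trusted sources."""
    ranked = sorted(duplicates, key=lambda r: SOURCE_PRIORITY.index(r["source"]))
    merged = {}
    for record in ranked:
        for field, value in record.items():
            if field != "source" and field not in merged and value:
                merged[field] = value
    return merged

dupes = [
    {"source": "web_signup", "name": "J. Smith", "email": "j@example.com", "phone": ""},
    {"source": "crm", "name": "Jane Smith", "email": "", "phone": "020 7946 0000"},
]
print(golden_record(dupes))
# {'name': 'Jane Smith', 'phone': '020 7946 0000', 'email': 'j@example.com'}
```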

Implementation models

There are a number of models for implementing a technology solution for master data management, depending on an organisation's core business, its corporate structure and its goals. These include:

  1. Source of record
  2. Registry
  3. Consolidation
  4. Coexistence
  5. Transaction/centralized

Source of record

This model identifies a single application, database or simpler source (e.g. a spreadsheet) as being the "source of record" (or "system of record" where solely application databases are relied on). The benefit of this model is its conceptual simplicity, but it may not fit with the realities of complex master data distribution in large organisations.

The source of record can be federated, for example by groups of attribute (so that different attributes of a master data entity may have different sources of record) or geographically (so that different parts of an organisation may have different master sources). Federation is only applicable in certain use cases, where there is clear delineation of which subsets of records will be found in which sources.
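Federation by attribute group can be as simple as a routing table that maps each group of attributes to the system that masters it, as in this sketch (the systems and groupings named are hypothetical):

```python
SOURCE_OF_RECORD = {
    "identity": "hr_system",       # name, date of birth
    "contact":  "crm",             # email, phone, address
    "payroll":  "finance_system",  # bank details, tax code
}

def source_for(attribute_group: str) -> str:
    """Route a read or update to the system mastering this attribute group."""
    if attribute_group not in SOURCE_OF_RECORD:
        raise ValueError(f"no source of record defined for {attribute_group!r}")
    return SOURCE_OF_RECORD[attribute_group]

print(source_for("contact"))  # crm
```

The clear delineation that federation requires corresponds to this table being unambiguous and complete.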

The source of record model can be applied more widely than simply to master data, for example to reference data.

Transmission of master data

There are several ways in which master data may be collated and distributed to other systems.[6] These include the following (a brief sketch of the consolidation pattern appears after the list):

  1. Data consolidation – The process of capturing master data from multiple sources and integrating it into a single hub (operational data store) for replication to other destination systems.
  2. Data federation – The process of providing a single virtual view of master data from one or more sources to one or more destination systems.
  3. Data propagation – The process of copying master data from one system to another, typically through point-to-point interfaces in legacy systems.
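The sketch below reduces the consolidation style to its essential data flow: records captured from several sources are merged into a single hub keyed by the master identifier (the sources and fields are illustrative):

```python
def consolidate(sources: list) -> dict:
    """Data consolidation: merge records from many sources into one hub,
    keyed by master identifier, ready for replication to destinations."""
    hub = {}
    for source in sources:
        for record in source:
            hub.setdefault(record["id"], {}).update(
                {k: v for k, v in record.items() if v}  # ignore empty values
            )
    return hub

erp = [{"id": "P1", "name": "Widget", "weight_kg": 1.2}]
shop = [{"id": "P1", "name": "", "list_price": 9.99}]
print(consolidate([erp, shop]))
# {'P1': {'id': 'P1', 'name': 'Widget', 'weight_kg': 1.2, 'list_price': 9.99}}
```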

Change management in implementation

Master data management can suffer in its adoption within a large organization if the "single version of the truth" concept is not affirmed by stakeholders, who believe that their local definition of the master data is necessary. For example, the product hierarchy used to manage inventory may be entirely different from the product hierarchies used to support marketing efforts or pay sales reps. It is above all necessary to identify whether different master data is genuinely required. If it is, then the solution implemented (technology and process) must allow multiple versions of the truth to exist while providing simple, transparent ways to reconcile the necessary differences. If it is not, processes must be adjusted. Without this active management, users who need the alternate versions will simply "go around" the official processes, reducing the effectiveness of the company's overall master data management program.
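Where multiple versions are genuinely required, one transparent arrangement is to hold the sanctioned hierarchies side by side, reconciled through the shared product identifier, as in this illustrative sketch:

```python
# Two sanctioned product hierarchies keyed by the same master identifier.
# The hierarchy names and paths are hypothetical.
HIERARCHIES = {
    "inventory": {"P1": ["Hardware", "Fasteners", "Bolts"]},
    "marketing": {"P1": ["DIY", "Workshop Essentials"]},
}

def classify(product_id: str, view: str) -> list:
    """Return the product's classification path in the given functional view."""
    return HIERARCHIES[view][product_id]

print(classify("P1", "inventory"))  # ['Hardware', 'Fasteners', 'Bolts']
print(classify("P1", "marketing"))  # ['DIY', 'Workshop Essentials']
```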

Related Research Articles

<span class="mw-page-title-main">Data warehouse</span> Centralized storage of knowledge

In computing, a data warehouse, also known as an enterprise data warehouse (EDW), is a system used for reporting and data analysis and is considered a core component of business intelligence. Data warehouses are central repositories of integrated data from one or more disparate sources. They store current and historical data in a single place and are used for creating analytical reports for workers throughout the enterprise, enabling companies to interrogate their data, draw insights from it and make decisions.

A management information system (MIS) is an information system used for decision-making, and for the coordination, control, analysis, and visualization of information in an organization. The study of the management information systems involves people, processes and technology in an organizational context.

A business process, business method or business function is a collection of related, structured activities or tasks by people or equipment in which a specific sequence produces a service or product for a particular customer or customers. Business processes occur at all organizational levels and may or may not be visible to the customers. A business process may often be visualized (modeled) as a flowchart of a sequence of activities with interleaving decision points or as a process matrix of a sequence of activities with relevance rules based on data in the process. The benefits of using business processes include improved customer satisfaction and improved agility for reacting to rapid market change. Process-oriented organizations break down the barriers of structural departments and try to avoid functional silos.

<span class="mw-page-title-main">Information management</span> Organisational activity concerning information lifecycle

Information management (IM) is the appropriate and optimized capture, storage, retrieval, and use of information. It may be personal information management or organizational. IM for organizations concerns a cycle of organizational activity: the acquisition of information from one or more sources, the custodianship and the distribution of that information to those who need it, and its ultimate disposal through archiving or deletion.

Record linkage is the task of finding records in a data set that refer to the same entity across different data sources. Record linkage is necessary when joining different data sets based on entities that may or may not share a common identifier, which may be due to differences in record shape, storage location, or curator style or preference. A data set that has undergone RL-oriented reconciliation may be referred to as being cross-linked.

Data migration is the process of selecting, preparing, extracting, and transforming data and permanently transferring it from one computer storage system to another. Additionally, the validation of migrated data for completeness and the decommissioning of legacy data storage are considered part of the entire data migration process. Data migration is a key consideration for any system implementation, upgrade, or consolidation, and it is typically performed in such a way as to be as automated as possible, freeing up human resources from tedious tasks. Data migration occurs for a variety of reasons, including server or storage equipment replacements, maintenance or upgrades, application migration, website consolidation, disaster recovery, and data center relocation.

Enterprise content management (ECM) extends the concept of content management by adding a timeline for each content item and, possibly, enforcing processes for its creation, approval and distribution. Systems using ECM generally provide a secure repository for managed items, analog or digital. They also include one or more methods for importing content to bring new items under management, and several presentation methods to make items available for use. Although ECM content may be protected by digital rights management (DRM), DRM is not required. ECM is distinguished from general content management by its cognizance of the processes and procedures of the enterprise for which it is created.

Enterprise software, also known as enterprise application software (EAS), is computer software used to satisfy the needs of an organization rather than individual users. Such organizations include businesses, schools, interest-based user groups, clubs, charities, and governments. Enterprise software is an integral part of a (computer-based) information system; a collection of such software is called an enterprise system. These systems handle a number of operations in an organization to enhance the business and management reporting tasks. The systems must process the information at a relatively high speed and can be deployed across a variety of networks.

Product information management (PIM) is the process of managing all the information required to market and sell products through distribution channels. This product data is created by an internal organization to support a multichannel marketing strategy. A central hub of product data can be used to distribute information to sales channels such as e-commerce websites, print catalogues, marketplaces such as Amazon and Google Shopping, social media platforms like Instagram and electronic data feeds to trading partners. PIM also plays a significant role in reducing abandonment rates by providing better product information.

Data architecture consist of models, policies, rules, and standards that govern which data is collected and how it is stored, arranged, integrated, and put to use in data systems and in organizations. Data is usually one of several architecture domains that form the pillars of an enterprise architecture or solution architecture.

Data integration involves combining data residing in different sources and providing users with a unified view of them. This process becomes significant in a variety of situations, both commercial and scientific. Data integration appears with increasing frequency as the volume of data and the need to share existing data explode. It has become the focus of extensive theoretical work, and numerous open problems remain unsolved. Data integration encourages collaboration between internal as well as external users. The data being integrated must be received from heterogeneous database systems and transformed into a single coherent data store that provides synchronous data across a network of files for clients. A common use of data integration is in data mining, when analyzing and extracting information from existing databases that can be useful for business information.

Real-time business intelligence (RTBI) is a concept describing the process of delivering business intelligence (BI) or information about business operations as they occur. Real time means near to zero latency and access to information whenever it is required.

In information science and information technology, single source of truth (SSOT) architecture, or single point of truth (SPOT) architecture, for information systems is the practice of structuring information models and associated data schemas such that every data element is mastered in only one place, providing data normalization to a canonical form. Any possible linkages to this data element are by reference only. Because all other locations of the data just refer back to the primary "source of truth" location, updates to the data element in the primary location propagate to the entire system, providing multiple advantages simultaneously: greater efficiency/productivity, easy prevention of mistaken inconsistencies, and greatly simplified version control. Without SSOT architecture, rampant forking impairs clarity and productivity, imposing laborious maintenance needs.

Master data represents "data about the business entities that provide context for business transactions". The most commonly found categories of master data are parties, products, financial structures and locational concepts.

An information server is an integrated software platform consisting of a set of core functional modules that enables organizations to integrate data from disparate sources and deliver trusted and complete information, at the time it is required and in the format it is needed. Similar to how an application server is a software engine that delivers applications to client computers, an information server delivers consistent information to consuming applications, business processes and portals.

Microsoft SQL Server Master Data Services (MDS) is a Master Data Management (MDM) product from Microsoft that ships as a part of the Microsoft SQL Server relational database management system. Master data management (MDM) allows an organization to discover and define non-transactional lists of data, and compile maintainable, reliable master lists. Master Data Services first shipped with Microsoft SQL Server 2008 R2. Microsoft SQL Server 2016 introduced enhancements to Master Data Services, such as improved performance and security, and the ability to clear transaction logs, create custom indexes, share entity data between different models, and support for many-to-many relationships.

Reference data is data used to classify or categorize other data. Typically, they are static or slowly changing over time.

ISO 8000 is the global standard for Data Quality and Enterprise Master Data. It describes the features and defines the requirements for standard exchange of Master Data among business partners. It establishes the concept of Portability as a requirement for Enterprise Master Data, and the concept that true Enterprise Master Data is unique to each organization.

Data virtualization is an approach to data management that allows an application to retrieve and manipulate data without requiring technical details about the data, such as how it is formatted at source, or where it is physically located, and can provide a single customer view of the overall data.

A metadata repository is a database created to store metadata. Metadata is information about the structures that contain the actual data. Metadata is often said to be "data about data", but this is misleading. Data profiles are an example of actual "data about data". Metadata adds one layer of abstraction to this definition: it is data about the structures that contain data. Metadata may describe the structure of any data, of any subject, stored in any format.

References

  1. "Gartner Glossary: Master Data Management". Gartner. Retrieved 6 June 2020.
  2. Rouse, Margaret (2018-04-09). "Definition from WhatIs.com". SearchDataManagement. Retrieved 2018-04-09.
  3. DAMA-DMBOK Guide, 2010 DAMA International
  4. "Learn how to create a MDM change request – LightsOnData". LightsOnData. 2018-05-09. Retrieved 2018-08-17.
  5. Jürgensen, Knut (2016-05-16). "Master Data Management (MDM): Help or Hindrance?". Simple Talk. Retrieved 2018-04-09.
  6. "Creating the Golden Record: Better Data Through Chemistry", DAMA, slide 26, Donald J. Soulsby, 22 October 2009