Data steward

A data steward is an oversight or data governance role within an organization, responsible for ensuring the quality and fitness for purpose of the organization's data assets, including the metadata for those data assets. A data steward may share some responsibilities with a data custodian, such as the awareness, accessibility, release, appropriate use, security and management of data. [1] A data steward also participates in the development and implementation of data assets. A data steward may seek to improve the quality and fitness for purpose of other data assets that their organization depends upon but is not responsible for.

Data stewards have a specialist role that uses an organization's data governance processes, policies, guidelines and responsibilities to administer all of the organization's data in compliance with policy and/or regulatory obligations. The overall objective of a data steward is the quality of the data assets, datasets, data records and data elements. [1] [2] This includes documenting metainformation for the data, such as definitions, related rules/governance, physical manifestation, and related data models (most of these properties being specific to an attribute/concept relationship); identifying the various responsibilities of owners and custodians; relations insight [definition needed] pertaining to attribute quality; aiding with project requirement data facilitation; and documenting capture rules.

Data stewards begin the stewarding process by identifying the data assets and elements that they will steward, with the ultimate result being standards, controls and data entry. [citation needed] The steward works closely with business glossary standards analysts (for standards), data architects/modelers (for standards), data quality (DQ) analysts (for controls) and operations team members (for entering good-quality data in accordance with business rules).

Data stewardship roles are common when organizations attempt to exchange data precisely and consistently between computer systems and to reuse data-related resources. [citation needed] Master data management often [quantify] refers to the need for data stewardship for its implementation to succeed. Data stewardship must have a precise purpose: ensuring that data is fit for purpose.

Data steward responsibilities

A data steward ensures that each assigned data element:

  1. Has a clear and unambiguous data element definition
  2. Does not conflict with other data elements in the metadata registry (removing duplicates, overlaps, etc.)
  3. Has clear enumerated value definitions if it is of type Code
  4. Is still being used (removing unused data elements)
  5. Is being used consistently in various computer systems
  6. Is being used in a way that is fit for purpose (data fitness)
  7. Has adequate documentation on appropriate usage and notes
  8. Has documented origin and sources of authority
  9. Is protected against unauthorised access or change
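
Several of the checks in the list above lend themselves to automation. The following is a minimal sketch in Python, not a description of any particular metadata registry product: the `DataElement` fields, the `review` function and the sample registry entries are all hypothetical and chosen only for illustration. It flags missing definitions, duplicate names, Code elements without enumerated values, unused elements, undocumented sources of authority and elements with no assigned steward.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class DataElement:
    # All field names are hypothetical, chosen for illustration only.
    name: str
    definition: str = ""
    data_type: str = "Text"  # "Code" marks an enumerated element
    allowed_values: dict = field(default_factory=dict)  # code -> meaning
    used_by_systems: list = field(default_factory=list)
    source_of_authority: str = ""
    steward: str = ""


def review(elements):
    """Return (element name, issue) pairs for a steward to follow up."""
    issues = []
    # Names are compared case-insensitively to catch near-duplicates.
    name_counts = Counter(e.name.strip().lower() for e in elements)
    for e in elements:
        if not e.definition.strip():
            issues.append((e.name, "missing definition"))
        if name_counts[e.name.strip().lower()] > 1:
            issues.append((e.name, "duplicate name in registry"))
        if e.data_type == "Code" and not e.allowed_values:
            issues.append((e.name, "code element without enumerated values"))
        if not e.used_by_systems:
            issues.append((e.name, "not used by any system"))
        if not e.source_of_authority:
            issues.append((e.name, "origin or source of authority undocumented"))
        if not e.steward:
            issues.append((e.name, "no steward assigned"))
    return issues


# Hypothetical sample registry: one well-documented element and one
# poorly documented near-duplicate.
registry = [
    DataElement("CountryCode", "ISO country of residence", "Code",
                {"NL": "Netherlands"}, ["crm"], "ISO 3166-1", "a.steward"),
    DataElement("countrycode", "", "Code", {}, [], "", ""),
]

issues = review(registry)
for name, issue in issues:
    print(f"{name}: {issue}")
```

Checks such as consistent use across systems, fitness for purpose and protection against unauthorised change depend on organizational context and are left out of this sketch.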

Responsibilities of data stewards vary between organisations and institutions. For example, at Delft University of Technology, data stewards are regarded as the first point of contact for any questions related to research data. They also have a subject-specific background, which allows them to connect easily with researchers and to contextualise data management problems so that disciplinary practices are taken into account. [3]

Types of data stewards

Depending on the set of data stewardship responsibilities assigned to an individual, there are four types (or dimensions of responsibility) of data stewards typically found within an organization:

  1. Data object data steward - responsible for managing reference data and attributes of one business data entity
  2. Business data steward - responsible for managing critical data, both reference and transactional, created or used by one business function. The data steward may also serve as a liaison between the organization's data users and technical teams, helping to bridge the gap between business needs and technical requirements. They may also play a role in educating others within the organization about best practices for data management, and advocating for data-driven decision-making.
  3. Process data steward - responsible for managing data across one business process
  4. System data steward - responsible for managing data for at least one IT system [4]
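
The four types above can be read as four different scopes of responsibility for the same data. As a rough sketch under stated assumptions, the Python below models them as an enumeration and looks up the stewards relevant to a given data item. The `StewardType` enum, the `ASSIGNMENTS` table, the `find_stewards` function and all names in it are hypothetical.

```python
from enum import Enum


class StewardType(Enum):
    # Values summarize the scope of each type as described in the list above.
    DATA_OBJECT = "one business data entity"
    BUSINESS = "one business function"
    PROCESS = "one business process"
    SYSTEM = "at least one IT system"


# Hypothetical assignments: (steward type, scope) -> contact person.
ASSIGNMENTS = {
    (StewardType.DATA_OBJECT, "customer"): "alice",
    (StewardType.BUSINESS, "finance"): "bob",
    (StewardType.PROCESS, "order-to-cash"): "carol",
    (StewardType.SYSTEM, "crm"): "dave",
}


def find_stewards(entity=None, function=None, process=None, system=None):
    """Return {StewardType: contact} for the scopes that have a steward."""
    scopes = {
        StewardType.DATA_OBJECT: entity,
        StewardType.BUSINESS: function,
        StewardType.PROCESS: process,
        StewardType.SYSTEM: system,
    }
    found = {}
    for kind, scope in scopes.items():
        contact = ASSIGNMENTS.get((kind, scope))
        if contact is not None:
            found[kind] = contact
    return found


# Customer data held in the CRM system falls under two stewards.
contacts = find_stewards(entity="customer", system="crm")
for kind, person in contacts.items():
    print(f"{kind.name}: {person}")
```

Keying the table on both type and scope reflects that one data item can fall under several stewards at once, for example a data object steward for the entity and a system steward for the application that stores it.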

Benefits of data stewardship

Systematic data stewardship can foster:

  1. Faster analysis
  2. Consistent use of data management resources
  3. Easy mapping of data between computer systems and exchange documents
  4. Lower costs associated with migration to (for example) service-oriented architecture (SOA)
  5. Mitigation of data risk
  6. Better control of risks associated with privacy, legal obligations, errors, etc.

Assigning each data element to a person can seem unimportant, but many groups [which?] have found that users place greater trust in, and make more use of, systems in which they can contact a person with questions about each data element.

Examples

Delft University of Technology (TU Delft) offers an example of data stewardship implementation at a research institution. In 2017 the Data Stewardship Project was initiated at TU Delft to address research data management needs in a disciplinary manner across the whole campus. [5] Dedicated data stewards with subject-specific backgrounds were appointed at every TU Delft faculty to support researchers with data management questions and to act as a link to the other institutional support services. The project is coordinated centrally by TU Delft Library, and it has its own website, [6] blog [7] and YouTube channel. [8]

The EPA metadata registry furnishes an example of data stewardship. Note that each data element therein has a "POC" (point of contact).

Data stewardship applications

A new market for data governance applications is emerging, one in which both technical and business staff (stewards) manage policies. These new applications, like previous generations, deliver a strong business glossary capability, but they do not stop there. Vendors are introducing additional features that address the concerns of business stewards in addition to those of technical stewards. [9]

Information stewardship applications are business solutions used by business users acting in the role of information steward (interpreting and enforcing information governance policy, for example). These developing solutions represent, for the most part, an amalgam of a number of disparate, previously IT-centric tools already on the market, but are organized and presented in such a way that information stewards (a business role) can support the work of information policy enforcement as part of their normal, business-centric, day-to-day work in a range of use cases.

The initial push for the formation of this new category of packaged software came from operational use cases — that is, use of business data in and between transactional and operational business applications. This is where most of the master data management efforts are undertaken in organizations. However, there is also now a faster-growing interest in the new data lake arena for more analytical use cases. [10]

Some metadata management vendors, such as Alation, have started highlighting the importance of data stewards to employees interested in using data to make business decisions. [11]

See also

Related Research Articles

Dublin Core: Standardized set of metadata elements

The Dublin Core, also known as the Dublin Core Metadata Element Set (DCMES), is a set of fifteen main metadata items for describing digital or physical resources. It was the first metadata standard for describing web content. The Dublin Core Metadata Initiative (DCMI) is responsible for formulating the Dublin Core; DCMI is a project of the Association for Information Science and Technology (ASIS&T), a non-profit organization.

A document management system (DMS) is usually a computerized system used to store, share, track and manage files or documents. Some systems include history tracking where a log of the various versions created and modified by different users is recorded. The term has some overlap with the concepts of content management systems. It is often viewed as a component of enterprise content management (ECM) systems and related to digital asset management, document imaging, workflow systems and records management systems.

Business intelligence (BI) consists of strategies and technologies used by enterprises for the data analysis and management of business information. Common functions of BI technologies include reporting, online analytical processing, analytics, dashboard development, data mining, process mining, complex event processing, business performance management, benchmarking, text mining, predictive analytics, and prescriptive analytics.

Data management: Disciplines related to managing data as a resource

Data management comprises all disciplines related to handling data as a valuable resource; it is the practice of managing an organization's data so that it can be analyzed for decision making.

Information technology (IT) governance is a subset discipline of corporate governance, focused on information technology (IT) and its performance and risk management. The interest in IT governance is due to the ongoing need within organizations to focus value creation efforts on an organization's strategic objectives and to better manage the performance of those responsible for creating this value in the best interest of all stakeholders. It has evolved from The Principles of Scientific Management, Total Quality Management and the ISO 9001 quality management system.

Data quality refers to the state of qualitative or quantitative pieces of information. There are many definitions of data quality, but data is generally considered high quality if it is "fit for [its] intended uses in operations, decision making and planning". Moreover, data is deemed of high quality if it correctly represents the real-world construct to which it refers. Furthermore, apart from these definitions, as the number of data sources increases, the question of internal data consistency becomes significant, regardless of fitness for use for any particular external purpose. People's views on data quality can often be in disagreement, even when discussing the same set of data used for the same purpose. When this is the case, data governance is used to form agreed upon definitions and standards for data quality. In such cases, data cleansing, including standardization, may be required in order to ensure data quality.

A metadata registry is a central location in an organization where metadata definitions are stored and maintained in a controlled method.

Enterprise data management (EDM) is the ability of an organization to precisely define, easily integrate and effectively retrieve data for both internal applications and external communication. EDM focuses on the creation of accurate, consistent, and transparent content. EDM emphasizes data precision, granularity, and meaning and is concerned with how the content is integrated into business applications as well as how it is passed along from one business process to another.

Information lifecycle management (ILM) refers to strategies for administering storage systems on computing devices.

Data governance is a term used on both a macro and a micro level. The former is a political concept and forms part of international relations and Internet governance; the latter is a data management concept and forms part of corporate data governance.

Master data represents "data about the business entities that provide context for business transactions". The most commonly found categories of master data are parties, products, financial structures and locational concepts.

SOA Governance is a set of processes used for activities related to exercising control over services in a service-oriented architecture (SOA). One viewpoint, from IBM and others, is that SOA governance is an extension (subset) of IT governance which itself is an extension of corporate governance. The implicit assumption in this view is that services created using SOA are just one more type of IT asset in need of governance, with the corollary that SOA governance does not apply to IT assets that are "not SOA". A contrasting viewpoint, expressed by blogger Dave Oliver and others, is that service orientation provides a broad organising principle for all aspects of IT in an organisation — including IT governance. Hence SOA governance is nothing but IT governance informed by SOA principles.

Internal control, as defined by accounting and auditing, is a process for assuring the achievement of an organization's objectives in operational effectiveness and efficiency, reliable financial reporting, and compliance with laws, regulations and policies. A broad concept, internal control involves everything that controls risks to an organization.

Master data management (MDM) is a discipline in which business and information technology work together to ensure the uniformity, accuracy, stewardship, semantic consistency and accountability of the enterprise's official shared master data assets.

Metadata: Data about data

Metadata is "data that provides information about other data", but not the content of the data itself, such as the text of a message or the image itself. There are many distinct types of metadata.

A Business Intelligence Competency Center (BICC) is a cross-functional organizational team that has defined tasks, roles, responsibilities and processes for supporting and promoting the effective use of Business Intelligence (BI) across an organization.

Information governance, or IG, is the overall strategy for information at an organization. Information governance balances the risk that information presents with the value that information provides. Information governance helps with legal compliance, operational transparency, and reducing expenditures associated with legal discovery. An organization can establish a consistent and logical framework for employees to handle data through their information governance policies and procedures. These policies guide proper behavior regarding how organizations and their employees handle information whether it is physically or electronically created (ESI).

The web content lifecycle is the multi-disciplinary and often complex process that web content undergoes as it is managed through various publishing stages.

A metadata repository is a database created to store metadata. Metadata is information about the structures that contain the actual data. Metadata is often said to be "data about data", but this is misleading. Data profiles are an example of actual "data about data". Metadata adds one layer of abstraction to this definition: it is data about the structures that contain data. Metadata may describe the structure of any data, of any subject, stored in any format.

References

  1. Cramer, Jonathan James (March 5, 2019). "6 Key Responsibility of the Invaluable Data Steward". DNB. Archived from the original on March 28, 2019. Retrieved November 11, 2022.
  2. "What is Data Stewardship? Its Importance, Benefits, Programs and more". Simplilearn. November 30, 2021. Archived from the original on January 21, 2022. Retrieved November 11, 2022.
  3. NewMedia Centre (2018-05-16), 1 Data Stewardship at the TU Delft V2, archived from the original on 2021-12-19, retrieved 2018-06-12
  4. "Understanding the different types of a data steward - LightsOnData". LightsOnData. 2018-06-13. Retrieved 2018-06-20.
  5. Teperek, Marta; Cruz, Maria J.; Verbakel, Ellen; Böhmer, Jasmin K.; Dunning, Alastair (2018-01-22). "Data Stewardship – addressing disciplinary data management needs". Open Science Framework. doi:10.17605/OSF.IO/MJK9T. S2CID 59344239.
  6. "Data Stewardship". TU Delft. Retrieved 2018-06-12.
  7. "Data Stewardship". Open Working. 2018-02-13. Retrieved 2018-06-12.
  8. "Data Stewardship TU Delft". YouTube. Retrieved 2018-06-12.
  9. "The Forrester Wave™: Data Governance Stewardship Applications, Q1 2016". www.forrester.com. Retrieved 2016-12-20.
  10. De Simoni, Guido (15 April 2016). "Market Guide for Information Stewardship Applications". Gartner.
  11. "Magic Quadrant for Metadata Management Solutions". Gartner. 9 August 2018.

Further reading