Climate and Forecast Metadata Conventions

The Climate and Forecast (CF) metadata conventions are conventions for the description of Earth sciences data, intended to promote the processing and sharing of data files. The metadata defined by the CF conventions are generally included in the same file as the data, thus making the file "self-describing". The conventions provide a definitive description of what the data values found in each netCDF variable represent, and of the spatial and temporal properties of the data, including information about grids, such as grid cell bounds and cell averaging methods. This enables users of files from different sources to decide which variables are comparable, and is a basis for building software applications with powerful data extraction, grid remapping, data analysis, and data visualization capabilities.

History and evolution

The CF conventions were introduced in 2003, after several years of development by a collaboration that included staff from U.S. and European climate and weather laboratories. [1] The conventions contained generalizations and extensions to the earlier Cooperative Ocean/Atmosphere Research Data Service (COARDS) conventions [2] and the Gregory/Drach/Tett (GDT) conventions. [3] As the scope of the CF conventions grew along with its user base, the CF community adopted an open governance model. [4] In December 2008, the trio of standards netCDF+CF+OPeNDAP was adopted by the Integrated Ocean Observing System (IOOS) as a recommended standard (number 08-012) for the representation and transport of gridded data. The CF conventions are being considered by the NASA Standards Process Group (SPG) and others as more broadly applicable standards. [5] [6]

Applications and user base

The CF conventions have been adopted by a wide variety of national and international programs and activities in the Earth sciences. [7] For example, they were required for the climate model output data collected for the Coupled Model Intercomparison Project (CMIP), which is widely used for the Intergovernmental Panel on Climate Change assessment reports. [8] They are promoted as an important element of scientific community coordination by the World Climate Research Programme. [9] [10] They are also used as a technical foundation for a number of software packages and data systems, including the Climate Model Output Rewriter (CMOR), which is post-processing software for climate model data, and the Earth System Grid, which distributes climate and other data. [11] [12] [13] The CF conventions have also been used to describe the physical fields transferred between individual Earth system model software components, such as atmosphere and ocean components, as the model runs. [14]

Supported data types

CF is intended for use with state estimation and forecasting data, in the atmosphere, ocean, and other physical domains. It was designed primarily to address gridded data types such as numerical weather prediction model outputs and climatology data in which data binning is used to impose a regular structure. [13] [15] However, the CF conventions are also applicable to many classes of observational data and have been adopted by a number of groups for such applications.

Supported data formats

CF originated as a standard for data written in netCDF, but its structure is general and it has been adapted for use with other data formats. For example, using the CF conventions with Hierarchical Data Format data has been explored. [16]

Design principles

Several principles guide the development of CF conventions:

Specific CF metadata descriptors use values of attributes to represent

A central element of the CF conventions is the CF Standard Name Table. The table uniquely associates a standard name with each geophysical parameter in a data set, and each name provides a precise description of the physical quantity being represented. Note that the standard name is the string value of a variable's standard_name attribute, not the name of the variable itself. The CF Standard Name Table identifies over 1,000 physical quantities, each with a precise description and associated canonical units. Guidelines for construction of CF standard names are documented on the conventions website.
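The distinction between a variable's name and its standard_name attribute can be illustrated with a minimal Python sketch. It does not read real netCDF files: the dictionaries `file_a` and `file_b`, the variable names in them, and the helper functions are hypothetical stand-ins for attributes that would normally be read from two data files. The standard names and units used (air_temperature, precipitation_flux, sea_surface_temperature) are entries from the CF Standard Name Table.

```python
# Hypothetical in-memory stand-ins for variables read from two netCDF files.
# Keys are the variable names chosen by each data producer; values are the
# attributes attached to each variable.
file_a = {
    "tas": {"standard_name": "air_temperature", "units": "K"},
    "pr": {"standard_name": "precipitation_flux", "units": "kg m-2 s-1"},
}
file_b = {
    "T2M": {"standard_name": "air_temperature", "units": "K"},
    "sst": {"standard_name": "sea_surface_temperature", "units": "K"},
}


def index_by_standard_name(variables):
    """Map each standard_name attribute value to the variable names carrying it."""
    index = {}
    for var_name, attrs in variables.items():
        standard_name = attrs.get("standard_name")
        if standard_name is not None:
            index.setdefault(standard_name, []).append(var_name)
    return index


def comparable_variables(vars_a, vars_b):
    """Return {standard_name: (names_in_a, names_in_b)} for shared standard names."""
    index_a = index_by_standard_name(vars_a)
    index_b = index_by_standard_name(vars_b)
    shared = sorted(set(index_a) & set(index_b))
    return {name: (index_a[name], index_b[name]) for name in shared}


matches = comparable_variables(file_a, file_b)
print(matches)
```

In this sketch the variables named "tas" and "T2M" are matched because they share a standard name, even though their producers chose different variable names. A real comparison would also need to check units and grid information.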

As an example of the information provided by CF standard names, the entry for sea-level atmospheric pressure includes:

Software

References

  1. Gregory, Jonathan (2003). "The CF metadata standard" (PDF).
  2. "Conventions for the Standardization of NetCDF files". May 1995. Archived from the original on 2010-05-27. Retrieved 2010-06-23.
  3. "GDT netCDF conventions for climate data, version 1.3". 1999-03-14. Archived from the original on 2010-06-10. Retrieved 2010-06-23.
  4. Lawrence, B.N. (2003). "Maintaining and Advancing the CF Standard for Earth System Science Community Data". Archived from the original on 2010-05-27.
  5. Rew, Russ (2010). "CF Metadata Conventions" (PDF). Archived from the original (PDF) on 2014-04-04.
  6. Domenico, Ben (2009). "Web Coverage Service (WCS) 1.1 extension for CF-netCDF 3.0 encoding" (PDF).
  7. "Projects and Groups Adopting or Encouraging the CF-Conventions as a Standard". Archived from the original on 2010-08-23.
  8. "Report of the 12th Session of the JSC/CLIVAR Working Group on Coupled Modelling (WGCM)" (PDF). 2009.
  9. Kinter, James L. III (2005). "Data Issues for WCRP Weather and Climate Modeling" (PDF).
  10. Taylor, Karl (2005). "Issues Related to the Dissemination of Climate Model Output" (PDF).
  11. Doutriaux, Charles (2010). "Climate Model Output Rewriter" (PDF).
  12. Drach, Bob (2002). "Metadata Standards for Gridded Climate Data in the Earth System Grid". Archived from the original on 2004-09-12.
  13. "NetCDF in ArcGIS 9.2". 2006. Archived from the original on 2010-06-14. Retrieved 2010-06-23.
  14. Dunlap, R.; L. Mark; S. Rugaber; V. Balaji; J. Chastang; L. Cinquini; C. DeLuca; D. Middleton; S. Murphy (2008). "Earth system curator: metadata infrastructure for climate modeling". Earth Science Informatics. 1 (3–4): 131–149. doi:10.1007/s12145-008-0016-1.
  15. V. Balaji (2008). "Metadata Standards for Gridded Climate Data in the Earth System Grid" (PDF). Archived from the original (PDF) on 2010-05-28.
  16. Yang, M.; R. Duerr; C. Lee (2009). "Investigation of using HDF5 archival information packages (AIP) to store NASA ECS data". 25th Conference on International Interactive Information and Processing Systems (IIPS) for Meteorology, Oceanography, and Hydrology, 89th American Meteorological Society Meeting.
  17. "NetCDF Importing and Processing". originlab.com. Retrieved 2021-05-11.
  18. "Metadata - Iris Documentation". Retrieved 2023-02-10.