NetCDF

Network Common Data Form

Filename extension: .nc
Internet media type: application/netcdf, application/x-netcdf
Magic number: CDF\001, \211HDF\r\n\032\n
Developed by: UCAR
Type of format: scientific binary data
Extended from: CDF, HDF
Website: www.unidata.ucar.edu/software/netcdf/

NetCDF (Network Common Data Form) is a set of software libraries and self-describing, machine-independent data formats that support the creation, access, and sharing of array-oriented scientific data. The project homepage [1] is hosted by the Unidata program at the University Corporation for Atmospheric Research (UCAR), which is also the chief source of netCDF software, standards development, and updates. The format is an open standard; the netCDF Classic and 64-bit Offset formats are an international standard of the Open Geospatial Consortium. [2]

History

The project started in 1988 and is still actively supported by UCAR. The original netCDF binary format (released in 1990, now known as "netCDF classic format") is still widely used across the world and continues to be fully supported in all netCDF releases. Version 4.0 (released in 2008) allowed the use of the HDF5 data file format. Version 4.1 (2010) added support for C and Fortran client access to specified subsets of remote data via OPeNDAP. Version 4.3.0 (2012) added a CMake build system for Windows builds. Version 4.7.0 (2019) added support for reading Amazon S3 objects. Further releases are planned to improve performance, add features, and fix bugs.

The format was originally based on the conceptual model of the Common Data Format developed by NASA, but has since diverged and is not compatible with it. [3] [4]

Format description

The netCDF libraries support several binary formats for netCDF files, including the original classic format, the 64-bit offset variant, and the HDF5-based netCDF-4 format.

All formats are "self-describing". This means that there is a header which describes the layout of the rest of the file, in particular the data arrays, as well as arbitrary file metadata in the form of name/value attributes. The format is platform independent, with issues such as endianness being addressed in the software libraries. The data are stored in a fashion that allows efficient subsetting.
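
A minimal sketch in C (the file and variable names, such as "example.nc" and "temperature", are illustrative rather than prescribed) shows how the self-describing header is built: dimensions, variables, and name/value attributes are declared in "define mode" before any data are written.

    /* Minimal sketch: write a small classic-format netCDF file.
     * Compile with something like: cc example.c -lnetcdf */
    #include <stdio.h>
    #include <stdlib.h>
    #include <string.h>
    #include <netcdf.h>

    /* Abort with a readable message if a netCDF call fails. */
    #define CHECK(e) do { int s = (e); if (s != NC_NOERR) { \
        fprintf(stderr, "netCDF error: %s\n", nc_strerror(s)); exit(1); } } while (0)

    int main(void)
    {
        int ncid, dimid, varid;
        double data[4] = {280.0, 281.5, 283.2, 284.7};

        CHECK(nc_create("example.nc", NC_CLOBBER, &ncid));   /* classic format */

        /* Header: one dimension, one variable, attributes on the variable
         * and on the file itself (NC_GLOBAL). */
        CHECK(nc_def_dim(ncid, "time", 4, &dimid));
        CHECK(nc_def_var(ncid, "temperature", NC_DOUBLE, 1, &dimid, &varid));
        CHECK(nc_put_att_text(ncid, varid, "units", strlen("kelvin"), "kelvin"));
        CHECK(nc_put_att_text(ncid, NC_GLOBAL, "title", strlen("Example file"), "Example file"));

        CHECK(nc_enddef(ncid));                        /* leave define mode */
        CHECK(nc_put_var_double(ncid, varid, data));   /* write the data array */
        CHECK(nc_close(ncid));
        return 0;
    }

Running ncdump -h on the resulting file prints just this header (the dimensions, variables, and attributes) as text, without the data values.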

Starting with version 4.0, the netCDF API [5] allows the use of the HDF5 data format. NetCDF users can create HDF5-based files with features not available in the classic netCDF format, such as much larger files and multiple unlimited dimensions.
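
The following short C sketch (illustrative names, error checking omitted for brevity) creates an HDF5-based netCDF-4 file with two unlimited dimensions, something the classic format does not allow:

    #include <netcdf.h>

    int main(void)
    {
        int ncid, time_dim, station_dim;

        /* NC_NETCDF4 selects the HDF5-based storage format. */
        nc_create("example_nc4.nc", NC_NETCDF4 | NC_CLOBBER, &ncid);
        nc_def_dim(ncid, "time",    NC_UNLIMITED, &time_dim);     /* first unlimited dimension */
        nc_def_dim(ncid, "station", NC_UNLIMITED, &station_dim);  /* second unlimited dimension */
        nc_close(ncid);
        return 0;
    }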

Full backward compatibility in accessing old netCDF files and using previous versions of the C and Fortran APIs is supported.

Software

Access libraries

The software libraries supplied by UCAR provide read-write access to netCDF files, encoding and decoding the necessary arrays and metadata. The core library is written in C and provides APIs for C and C++, as well as two APIs for Fortran applications, one for Fortran 77 and one for Fortran 90. An independent implementation, also developed and maintained by Unidata, is written entirely in Java; it extends the core data model and adds additional functionality. Interfaces to netCDF based on the C library are also available in other languages, including R (the ncdf, [6] ncvar and RNetCDF [7] packages), Perl, Python, Ruby, Haskell, [8] Mathematica, MATLAB, IDL, Julia and Octave. The specification of the API calls is very similar across the different languages, apart from inevitable differences of syntax. The API calls in version 2 were rather different from those in version 3, but are still supported by versions 3 and 4 for backward compatibility. Application programmers using supported languages need not normally be concerned with the file structure itself, even though the formats are openly documented.
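
As a sketch of read access through the C API (error checking omitted; the file and variable names assume a file laid out like the earlier example and are not prescribed by the library):

    #include <stdio.h>
    #include <stdlib.h>
    #include <netcdf.h>

    int main(void)
    {
        int ncid, varid, dimid;
        size_t len;

        nc_open("example.nc", NC_NOWRITE, &ncid);      /* open read-only */
        nc_inq_dimid(ncid, "time", &dimid);
        nc_inq_dimlen(ncid, dimid, &len);              /* number of values */
        nc_inq_varid(ncid, "temperature", &varid);

        double *vals = malloc(len * sizeof *vals);
        nc_get_var_double(ncid, varid, vals);          /* read the whole array */

        for (size_t i = 0; i < len; i++)
            printf("%zu: %g\n", i, vals[i]);

        free(vals);
        nc_close(ncid);
        return 0;
    }

The Fortran, C++, Python and other interfaces expose essentially the same sequence of calls under language-appropriate names.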

Applications

A wide range of application software has been written that makes use of netCDF files, ranging from command line utilities to graphical visualization packages. Examples include ncBrowse, [10] ncview, [11] Panoply, [12] PyNIO, [13] the Python netCDF4 [14] and xarray [15] packages, GrADS, [16] IDL, [17] [18] ArcGIS, [19] Origin, [20] GDAL, [21] the R packages ncdf4 [22] and RNetCDF, [23] and OpenChrom; [24] a longer list [9] is maintained on the UCAR website.

Common uses

It is commonly used in climatology, meteorology and oceanography applications (e.g., weather forecasting, climate change) and GIS applications.

It is an input/output format for many GIS applications, and for general scientific data exchange. To quote from the Unidata site: [25]

"NetCDF (network Common Data Form) is a set of interfaces for array-oriented data access and a freely-distributed collection of data access libraries for C, Fortran, C++, Java, and other languages. The netCDF libraries support a machine-independent format for representing scientific data. Together, the interfaces, libraries, and format support the creation, access, and sharing of scientific data."

Conventions

The Climate and Forecast (CF) conventions are metadata conventions for earth science data, intended to promote the processing and sharing of files created with the NetCDF Application Programmer Interface (API). The conventions define metadata that are included in the same file as the data (thus making the file "self-describing"), that provide a definitive description of what the data in each variable represents, and of the spatial and temporal properties of the data (including information about grids, such as grid cell bounds and cell averaging methods). This enables users of data from different sources to decide which data are comparable, and allows building applications with powerful extraction, regridding, and display capabilities.
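
In practice the CF metadata are ordinary netCDF attributes. A brief C sketch of attaching CF-style metadata to a variable follows (the attribute values and the convention version string are illustrative; error checking omitted):

    #include <string.h>
    #include <netcdf.h>

    int main(void)
    {
        int ncid, dimid, varid;

        nc_create("cf_example.nc", NC_CLOBBER, &ncid);
        nc_def_dim(ncid, "time", NC_UNLIMITED, &dimid);
        nc_def_var(ncid, "tas", NC_FLOAT, 1, &dimid, &varid);

        /* Global attribute naming the convention in use. */
        nc_put_att_text(ncid, NC_GLOBAL, "Conventions", strlen("CF-1.8"), "CF-1.8");

        /* CF attributes describing what the variable represents. */
        nc_put_att_text(ncid, varid, "standard_name",
                        strlen("air_temperature"), "air_temperature");
        nc_put_att_text(ncid, varid, "long_name",
                        strlen("Near-surface air temperature"),
                        "Near-surface air temperature");
        nc_put_att_text(ncid, varid, "units", strlen("K"), "K");

        nc_close(ncid);
        return 0;
    }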

Parallel-NetCDF

An extension of netCDF for parallel computing called Parallel-NetCDF (or PnetCDF) has been developed by Argonne National Laboratory and Northwestern University. [26] This is built upon MPI-IO, the I/O extension to MPI communications. Using the high-level netCDF data structures, the Parallel-NetCDF libraries can make use of optimizations to efficiently distribute file read and write operations among multiple processors. The Parallel-NetCDF package can read/write only classic and 64-bit offset formats; it cannot read or write the HDF5-based format available with netCDF-4.0. The Parallel-NetCDF package uses different, but similar, APIs in Fortran and C.
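
A sketch of collective writing with the PnetCDF (ncmpi_*) C API follows; the file name, dimension sizes, and per-rank decomposition are illustrative, and error checking is omitted. Each MPI rank writes a disjoint slice of a shared variable in a 64-bit offset file:

    #include <mpi.h>
    #include <pnetcdf.h>

    int main(int argc, char **argv)
    {
        int rank, nprocs, ncid, dimid, varid;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* All processes create the same file collectively. */
        ncmpi_create(MPI_COMM_WORLD, "parallel.nc", NC_CLOBBER | NC_64BIT_OFFSET,
                     MPI_INFO_NULL, &ncid);
        ncmpi_def_dim(ncid, "x", (MPI_Offset)nprocs * 10, &dimid);
        ncmpi_def_var(ncid, "data", NC_DOUBLE, 1, &dimid, &varid);
        ncmpi_enddef(ncid);

        /* Each rank writes its own 10-element slice. */
        double buf[10];
        for (int i = 0; i < 10; i++) buf[i] = rank;
        MPI_Offset start[1] = { (MPI_Offset)rank * 10 };
        MPI_Offset count[1] = { 10 };
        ncmpi_put_vara_double_all(ncid, varid, start, count, buf);

        ncmpi_close(ncid);
        MPI_Finalize();
        return 0;
    }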

Parallel I/O in the Unidata netCDF library has been supported since release 4.0, for HDF5 data files. Since version 4.1.1 the Unidata NetCDF C library supports parallel I/O to classic and 64-bit offset files using the Parallel-NetCDF library, but with the NetCDF API.
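
A comparable sketch using the regular netCDF API for parallel access to an HDF5-based file (error checking omitted; depending on the library version an additional MPI-IO mode flag may be required in the create mode, so treat this as an outline rather than a definitive recipe):

    #include <mpi.h>
    #include <netcdf.h>
    #include <netcdf_par.h>

    int main(int argc, char **argv)
    {
        int rank, nprocs, ncid, dimid, varid;
        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

        /* All ranks create the same netCDF-4/HDF5 file in parallel. */
        nc_create_par("parallel_nc4.nc", NC_NETCDF4 | NC_CLOBBER,
                      MPI_COMM_WORLD, MPI_INFO_NULL, &ncid);
        nc_def_dim(ncid, "x", (size_t)nprocs * 10, &dimid);
        nc_def_var(ncid, "data", NC_DOUBLE, 1, &dimid, &varid);
        nc_enddef(ncid);

        /* Collective access: each rank writes its own 10-element slice. */
        nc_var_par_access(ncid, varid, NC_COLLECTIVE);
        size_t start[1] = { (size_t)rank * 10 };
        size_t count[1] = { 10 };
        double buf[10];
        for (int i = 0; i < 10; i++) buf[i] = rank;
        nc_put_vara_double(ncid, varid, start, count, buf);

        nc_close(ncid);
        MPI_Finalize();
        return 0;
    }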

Interoperability of C/Fortran/C++ libraries with other formats

The netCDF C library, and the libraries based on it (Fortran 77 and Fortran 90, C++, and all third-party libraries) can, starting with version 4.1.1, read some data in other data formats. Data in the HDF5 format can be read, with some restrictions. Data in the HDF4 format can be read by the netCDF C library if created using the HDF4 Scientific Data (SD) API.

NetCDF-Java common data model

The NetCDF-Java library currently reads a number of file formats and remote access protocols through the same interface, including BUFR, [27] GINI, [31] and GEMPAK [32] data, netCDF [34] and netCDF-4 [35] files, NEXRAD radar data, [36] and remote data served via OPeNDAP.

There are a number of other formats in development. Since each of these is accessed transparently through the NetCDF API, the NetCDF-Java library is said to implement a common data model for scientific datasets.

The Java common data model has three layers, which build on top of each other to add successively richer semantics:

  1. The data access layer, also known as the syntactic layer, handles data reading.
  2. The coordinate system layer identifies the coordinates of the data arrays. Coordinates are a completely general concept for scientific data; specialized georeferencing coordinate systems, important to the Earth Science community, are specially annotated.
  3. The scientific data type layer identifies specific types of data, such as grids, images, and point data, and adds specialized methods for each kind of data.

The data model of the data access layer is a generalization of the NetCDF-3 data model, and substantially the same as the NetCDF-4 data model. The coordinate system layer implements and extends the concepts in the Climate and Forecast Metadata Conventions. The scientific data type layer allows data to be manipulated in coordinate space, analogous to the Open Geospatial Consortium specifications. The identification of coordinate systems and data typing is ongoing, but users can plug in their own classes at runtime for specialized processing.

See also

  Common Data Format (CDF)
  Hierarchical Data Format (HDF)
  Flexible Image Transport System (FITS)
  GRIB
  OPeNDAP
  Climate and Forecast (CF) metadata conventions
  CGNS
  Grid Analysis and Display System (GrADS)
  XMDF
  EAS3
  GEOMS

References

  1. "NetCDF Home Page". Unidata/UCAR. Archived from the original on 2017-12-06. Retrieved 2017-12-05.
  2. "OGC standard netCDF Classic and 64-bit Offset". Opengeospatial.org. Archived from the original on 2017-11-30. Retrieved 2017-12-05.
  3. "Background - The NetCDF Users' Guide". Unidata.ucar.edu. Retrieved 2013-11-27.
  4. "CDF - Frequently asked questions". NASA. Archived from the original on 2018-06-19. Retrieved 2018-11-02.
  5. "Version 4.0 of the netCDF API". Unidata.ucar.edu. Archived from the original on 2015-06-17. Retrieved 2013-11-27.
  6. "ncdf". Cirrus.ucsd.edu. 2013-08-06. Archived from the original on 2013-12-03. Retrieved 2013-11-27.
  7. "Rnetcdf". Cran.r-project.org. 2012-07-19. Archived from the original on 2013-12-02. Retrieved 2013-11-27.
  8. "hnetcdf: Haskell NetCDF library". hackage.haskell.org. 2014-07-10. Archived from the original on 2014-07-09. Retrieved 2014-07-10.
  9. "Software for Manipulating or Displaying NetCDF Data". Unidata.ucar.edu. Retrieved 2020-10-23.
  10. "ncBrowse". Epic.noaa.gov. Archived from the original on 2013-12-03. Retrieved 2013-11-27.
  11. "ncview". Meteora.ucsd.edu. Archived from the original on 2014-02-12. Retrieved 2013-11-27.
  12. "Panoply". Giss.nasa.gov. Goddard Institute for Space Studies. Archived from the original on 2014-06-20. Retrieved 2013-11-27.
  13. "PyNIO". Pyngl.ucar.edu. 2011-07-28. Archived from the original on 2013-11-25. Retrieved 2013-11-27.
  14. "netCDF4". Archived from the original on 2017-11-29. Retrieved 2017-12-04.
  15. "xarray: N-D labeled arrays and datasets in Python". Archived from the original on 2016-09-01. Retrieved 2016-09-07.
  16. "GrADS Home Page". Archived from the original on 2016-02-13. Retrieved 2018-04-10.
  17. "Coyote's Guide to IDL Programming". Dfanning.com. 2013-11-23. Archived from the original on 2015-09-23. Retrieved 2013-11-27.
  18. "Coyote Library". Dfanning.com. 2013-11-23. Archived from the original on 2015-09-23. Retrieved 2013-11-27.
  19. "ArcGIS version 9.2". Esri.com. Archived from the original on 2013-11-22. Retrieved 2013-11-27.
  20. "NetCDF Importing and Processing". originlab.com. Retrieved 2021-05-11.
  21. "NetCDF network Common Data Form". Gdal.org. Archived from the original on 2013-06-06. Retrieved 2013-11-27.
  22. David Pierce (2014). ncdf4: Interface to Unidata netCDF (version 4 or earlier) format data files. R package version 1.13. https://cran.r-project.org/package=ncdf4
  23. Pavel Michna and with contributions from Milton Woods (2015). RNetCDF: Interface to NetCDF Datasets. R package version 1.7-3. https://cran.r-project.org/package=RNetCDF
  24. OpenChrom: a cross-platform open source software for the mass spectrometric analysis of chromatographic data, Philip Wenig, Juergen Odermatt, BMC Bioinformatics ; 2010; doi : 10.1186/1471-2105-11-405
  25. "What Is netCDF?". Unidata Program Center. Archived from the original on 2013-03-15. Retrieved 2012-11-26.
  26. "parallel-netcdf". Mcs.anl.gov. 2013-11-17. Archived from the original on 2008-12-01. Retrieved 2013-11-27.
  27. "BUFR FORMAT DOCUMENTATION". Archived from the original on October 9, 2007. Retrieved February 2, 2008.
  28. Archived September 5, 2008, at the Wayback Machine
  29. Archived December 14, 2005, at the Wayback Machine
  30. Archived May 21, 2008, at the Wayback Machine
  31. "GINI Satellite Format". Weather.unisys.com. Archived from the original on 2013-12-02. Retrieved 2013-11-27.
  32. "Unidata | GEMPAK". Unidata.ucar.edu. Archived from the original on 2013-11-04. Retrieved 2013-11-27.
  33. Archived February 12, 2008, at the Wayback Machine
  34. "NetCDF". Unidata.ucar.edu. Archived from the original on 2013-11-29. Retrieved 2013-11-27.
  35. "NetCDF-4". Unidata.ucar.edu. Archived from the original on 2015-06-17. Retrieved 2013-11-27.
  36. Steve Ansari. "NCDC: Radar Resources". Ncdc.noaa.gov. Archived from the original on 2013-12-02. Retrieved 2013-11-27.