BUFR

The Binary Universal Form for the Representation of meteorological data (BUFR) is a binary data format maintained by the World Meteorological Organization (WMO). The latest version is BUFR Edition 4; Edition 3 is also considered current for operational use. BUFR was created in 1988 to replace the WMO's dozens of character-based, position-driven meteorological codes, such as SYNOP (surface observations), TEMP (upper-air soundings) and CLIMAT (monthly climatological data). It was designed to be portable, compact, and universal: any kind of data can be represented, along with its spatial and temporal context and any other associated metadata. In WMO terminology, BUFR is a table-driven code form: the meaning of each data element is determined by a set of tables that are maintained separately from the message itself.

BUFR is a complex format that can be difficult to use [1] and has some known weaknesses. [2] The migration to BUFR has led to the disappearance of some data and to many formatting errors. [3] [4] [5]

Description of format

A BUFR message is composed of six sections, numbered zero through five.
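
As a rough illustration of the outer framing, the sketch below reads Section 0 and checks for Section 5. It assumes the layout used from Edition 2 onwards, which is not spelled out in the text above: Section 0 is 8 octets (the ASCII string "BUFR", a 3-octet total message length, and a 1-octet edition number), and Section 5 is the 4-octet ASCII string "7777". The names `IndicatorSection` and `parse_indicator_section` are made up for this example.

```python
from typing import NamedTuple


class IndicatorSection(NamedTuple):
    total_length: int  # length of the whole message, in octets
    edition: int       # BUFR edition number


def parse_indicator_section(message: bytes) -> IndicatorSection:
    """Read Section 0 and verify the Section 5 end marker.

    Assumes the Edition 2+ layout: "BUFR", 3-octet length, 1-octet edition.
    """
    if len(message) < 12:
        raise ValueError("too short to hold Section 0 and Section 5")
    if message[0:4] != b"BUFR":
        raise ValueError("missing 'BUFR' start marker")

    total_length = int.from_bytes(message[4:7], "big")
    edition = message[7]

    if total_length < 12 or total_length > len(message):
        raise ValueError("declared length does not fit the data")
    if message[total_length - 4:total_length] != b"7777":
        raise ValueError("missing '7777' end marker")

    return IndicatorSection(total_length, edition)
```

Checking both markers before decoding anything else is a cheap way to reject truncated or misframed messages early.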

The product description contained in Section 3 can be made sophisticated and non-trivial by the use of replication and/or operator descriptors. (See below for a brief overview of the different kinds of descriptors; refer to the WMO Guide on BUFR for further detail.)

Templates

Section 3 contains a short header followed by a sequence of descriptors that matches the contents of Section 4's bit-stream. This sequence of descriptors can be understood as the template of the BUFR message: it contains the information needed to describe the structure of the data values embedded in the matching bit-stream, and it is interpreted step by step, like an algorithm. Across a set of BUFR messages, the values contained in Section 4 may differ from one message to the next, but their ordering and structure remain predictable as long as the template in Section 3 is unchanged. Templates can be designed to meet the requirements of a specific data product (weather observations, for instance) and then used to standardize the content and structure of BUFR data products. The WMO has released a number of BUFR templates for surface and upper air observational data.
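
The step-by-step reading of a bit-stream against a template can be sketched as follows. This is a simplified illustration, not a full decoder: it assumes the template has already been expanded into a flat list of elements, each with a bit width, scale, and reference value (the attributes that the WMO element tables attach to a descriptor), that a field with all bits set means "missing", and that the data are uncompressed. The element names and numbers used in the test are hypothetical, and `Element`, `BitReader`, and `decode_subset` are names invented here.

```python
from decimal import Decimal
from typing import Dict, List, NamedTuple, Optional


class Element(NamedTuple):
    name: str
    width: int      # number of bits occupied in the bit-stream
    scale: int      # decimal scale: value = (raw + reference) * 10**-scale
    reference: int  # offset added to the raw integer


class BitReader:
    """Reads big-endian bit fields of arbitrary width from a byte string."""

    def __init__(self, data: bytes) -> None:
        self._value = int.from_bytes(data, "big")
        self._nbits = len(data) * 8
        self._pos = 0

    def read(self, width: int) -> int:
        if self._pos + width > self._nbits:
            raise EOFError("bit-stream exhausted")
        shift = self._nbits - self._pos - width
        self._pos += width
        return (self._value >> shift) & ((1 << width) - 1)


def decode_subset(template: List[Element],
                  reader: BitReader) -> Dict[str, Optional[Decimal]]:
    """Walk the template in order, consuming one field per element."""
    values: Dict[str, Optional[Decimal]] = {}
    for element in template:
        raw = reader.read(element.width)
        if raw == (1 << element.width) - 1:
            values[element.name] = None  # all bits set: missing value
        else:
            # Decimal keeps the scaled result exact (no float rounding).
            values[element.name] = Decimal(raw + element.reference).scaleb(-element.scale)
    return values
```

Because the reader holds no knowledge of the fields, the same loop works for any template; only the list of elements changes.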

Descriptors

All descriptors are 16 bits wide and have an F-X-Y structure: F is the two most significant (leftmost) bits, X is the six middle bits, and Y is the eight least significant (rightmost) bits. The F value (0 to 3) determines the type of descriptor.
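
The bit layout described above translates directly into shifts and masks. The sketch below splits a 16-bit descriptor into its F, X, and Y parts and formats it in the usual "F XX YYY" notation. The mapping from F to a descriptor kind follows the WMO classification (element, replication, operator, sequence) rather than anything stated in the text above, and the function names are made up for this example.

```python
from typing import Tuple

# Descriptor kinds by F value, per the WMO classification (assumed here).
DESCRIPTOR_KINDS = {
    0: "element",      # defined in Table B
    1: "replication",
    2: "operator",     # defined in Table C
    3: "sequence",     # defined in Table D
}


def split_descriptor(descriptor: int) -> Tuple[int, int, int]:
    """Split a 16-bit descriptor into (F, X, Y)."""
    if not 0 <= descriptor <= 0xFFFF:
        raise ValueError("descriptor must fit in 16 bits")
    f = descriptor >> 14          # top 2 bits
    x = (descriptor >> 8) & 0x3F  # middle 6 bits
    y = descriptor & 0xFF         # bottom 8 bits
    return f, x, y


def format_descriptor(descriptor: int) -> str:
    """Render a descriptor in 'F XX YYY' notation."""
    f, x, y = split_descriptor(descriptor)
    return f"{f} {x:02d} {y:03d}"
```

For example, the two octets 0x0C 0x01 form the value 0x0C01, which splits into F=0, X=12, Y=1.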

Subsets

The data structure established in the Section 3 template may be re-used multiple times within a single BUFR message. In such a case, Section 4 will contain a succession of so-called subsets. For instance, subsets could be used to convey observations from several locations in a single message.
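
A minimal sketch of how one template is re-applied across subsets is shown below. It assumes the simplest case: uncompressed data, a fixed-length template (no delayed replication), and a subset count already known from the Section 3 header. The function returns raw integers only, with no scaling applied, and `split_subsets` is a name invented for this example.

```python
from typing import List


def split_subsets(payload: bytes, widths: List[int], count: int) -> List[List[int]]:
    """Slice an uncompressed bit-stream into `count` subsets of raw integers.

    `widths` lists the bit width of each field in one subset, in template order.
    """
    bits = "".join(f"{byte:08b}" for byte in payload)
    if count * sum(widths) > len(bits):
        raise ValueError("payload too short for the requested subsets")

    subsets: List[List[int]] = []
    pos = 0
    for _ in range(count):
        row = []
        for width in widths:
            row.append(int(bits[pos:pos + width], 2))
            pos += width
        subsets.append(row)  # same field order in every subset
    return subsets
```

Each returned row has the same layout, which is what lets a single message carry, say, the same set of measurements from several stations.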

References

  1. "BUFR: A Meteorological Code for the 21st Century" (PDF). Archived from the original on 2018-02-15. Retrieved 2018-02-14.
  2. "On the suitability of BUFR and GRIB for archiving data". 10 January 2013.
  3. Hand, E. (2016). "Obsolescence looms for balloon data". Science. 352 (6283): 281–282. Bibcode:2016Sci...352..281H. doi:10.1126/science.352.6283.281. PMID 27081049.
  4. "Dealing with Disappearing Surface Data: The Migration to BUFR and the Discontinuation of Text SYNOP and Buoy Reports". 25 January 2017.
  5. "ECMWF - TAC2BUFR - ECMWF Confluence Wiki" (PDF). Archived from the original (PDF) on 2018-02-15. Retrieved 2018-02-14.
