OPeNDAP

OPeNDAP (Open-source Project for a Network Data Access Protocol) is an endeavor focused on enhancing the retrieval of remote, structured data through a Web-based architecture and a discipline-neutral Data Access Protocol (DAP).

Project

Widely used, especially in Earth science, the protocol is layered on HTTP, and its current specification is DAP4, [1] though the previous DAP2 version remains broadly used. Developed and advanced (openly and collaboratively) by the non-profit OPeNDAP, Inc., [2] DAP is intended to enable remote, selective data-retrieval as an easily invoked Web service. OPeNDAP, Inc. also develops and maintains zero-cost (reference) implementations of the DAP protocol in both server-side and client-side software.

"OPeNDAP" often is used in place of "DAP" to denote the protocol but also may refer to an entire DAP-based data-retrieval architecture. Other DAP-centered architectures, such as THREDDS [3] and ERDDAP, the NOAA GEO-IDE UAF ERDDAP [4] exhibit significant interoperability with one another as well as with systems employing OPeNDAP's own (open-source) servers and software.

A DAP client can be an ordinary browser or even a spreadsheet, though with limited functionality (see OPeNDAP's Web page on Available Client Software). More typically, DAP clients are:

Regardless of their types, and whether developed commercially or by an end-user, clients almost universally link to DAP servers through libraries that implement the DAP2 or DAP4 protocol in one language or another. OPeNDAP offers open-source libraries in C++ and Java, but many clients rely on community-developed libraries such as PyDAP or, especially, the NetCDF suite. Developed and maintained in multiple programming languages by the Unidata Program at the University Corporation for Atmospheric Research (UCAR), all NetCDF libraries include embedded capabilities for retrieving (array-style) data from DAP servers.

A data-using client references a data set by its URL and requests metadata or content by issuing (usually through an embedded DAP library) an HTTP request to a DAP server. Content requests are usually preceded by requests for metadata describing the structure and other details of the referenced data set. With this information, the client may construct DAP constraint expressions [7] to retrieve specific content (i.e., subsets) from the source. OPeNDAP servers offer various types of responses, depending on the specific form of the client's request, including XML, JSON, HTML and ASCII. In response to requests for content, OPeNDAP servers can respond with multipart MIME documents that include a binary portion with NetCDF or DAP-native encoding. (These binary forms offer compact means to deliver large volumes of content, and the DAP-native form may even be streamed if desired.)
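The request pattern described above can be sketched in a few lines of Python. This is a minimal illustration that assumes the DAP2 convention of appending a response suffix (`.dds` for structure, `.das` for attributes, `.dods` for binary content, `.ascii` for text) to the data set URL and following it with a constraint expression after `?`. The server address, the file name `sst.nc` and the variable name `sst` are hypothetical, and the code only builds URLs; it does not contact a server.

```python
# Hypothetical data set URL; not a real server.
BASE = "http://example.org/opendap/data/sst.nc"

# DAP2 response suffixes assumed by this sketch.
DAP2_RESPONSES = {"dds", "das", "dods", "ascii"}


def hyperslab(variable, *ranges):
    """Return a DAP2 array projection such as 'sst[0:1:0][10:2:20]'.

    Each range is a (start, stride, stop) tuple; 'stop' is inclusive.
    """
    return variable + "".join(
        f"[{start}:{stride}:{stop}]" for start, stride, stop in ranges
    )


def dap2_url(dataset_url, response, constraint=""):
    """Append a DAP2 response suffix and an optional constraint expression.

    The constraint is appended verbatim; a real client may need to
    percent-encode characters such as '[' and ']'.
    """
    if response not in DAP2_RESPONSES:
        raise ValueError(f"unknown DAP2 response type: {response!r}")
    url = f"{dataset_url}.{response}"
    if constraint:
        url += "?" + constraint
    return url


# Step 1: ask for metadata describing the data set's structure.
metadata_url = dap2_url(BASE, "dds")

# Step 2: ask for a subset: first time step, every second index of a
# 10..20 range in the second dimension.
subset_url = dap2_url(BASE, "ascii", hyperslab("sst", (0, 1, 0), (10, 2, 20)))

print(metadata_url)
print(subset_url)
```

In practice this URL construction is normally hidden inside a DAP library; the sketch only makes the two-step metadata-then-content pattern visible.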

OPeNDAP's software for building DAP servers (on top of Apache) is dubbed Hyrax and includes adapters that facilitate serving a wide variety of source data. DAP servers most frequently enable (remote) access to (large) HDF or NetCDF files, but the source data can exist in databases or other formats, including user-defined ones. When source data are organized as files, DAP retrievals enable, via subsetting, finer-grained access than FTP does, because FTP transfers whole files. Furthermore, OPeNDAP servers can aggregate subsets from multiple files for delivery in a single retrieval. Taken together, subsetting, aggregation and streaming can yield substantial data-access efficiencies, even in the presence of slow networks.
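The efficiency gain from subsetting can be illustrated with simple arithmetic. The sketch below compares the payload of a whole array with that of a requested subset. The array shape (365 × 720 × 1440 values of 4 bytes each) and the chosen index ranges are assumptions made up for the example, and the figures count raw values only, ignoring protocol and encoding overhead.

```python
from math import prod


def index_count(start, stride, stop):
    """Number of indices selected by start:stride:stop (stop inclusive)."""
    if stride < 1 or stop < start:
        raise ValueError("expected stride >= 1 and stop >= start")
    return (stop - start) // stride + 1


def payload_bytes(ranges, bytes_per_value):
    """Raw size of the values selected by one range per dimension."""
    return prod(index_count(*r) for r in ranges) * bytes_per_value


# Hypothetical array: 365 time steps on a 720 x 1440 grid, 4-byte floats.
full = payload_bytes([(0, 1, 364), (0, 1, 719), (0, 1, 1439)], 4)

# Hypothetical request: one time step over a 100 x 200 window.
subset = payload_bytes([(0, 1, 0), (0, 1, 99), (0, 1, 199)], 4)

print(f"whole array: {full} bytes")   # 1513728000
print(f"subset:      {subset} bytes")  # 80000
```

Under these assumptions a whole-file transfer moves about 1.5 GB, while the subset request moves 80 kB of values.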

OPeNDAP and other DAP servers are used operationally in government agencies, including NASA and NOAA, for providing access to Earth science data, including satellite imagery and other high-volume information sources. The DAP data model embraces a comprehensive set of data structures, including multidimensional arrays and nested sequences (i.e., records), complemented by a correspondingly rich set of constraint expressions. Hence the OPeNDAP data-retrieval architecture has demonstrated utility across a broad range of scientific data types, including data generated via simulations and data generated via observations (whether remotely sensed or measured in situ).
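For sequences (records), a constraint expression typically combines a projection, which names the fields to return, with selection clauses that filter records by value. The sketch below assembles such an expression in the DAP2 style, where the projection is a comma-separated list and each selection clause is joined with `&`. The sequence name `station` and its fields are hypothetical, and the operator set is the one assumed for DAP2.

```python
# Relational operators assumed for DAP2 selection clauses.
DAP2_OPERATORS = {"=", "!=", "<", "<=", ">", ">=", "=~"}


def sequence_constraint(projection, selections=()):
    """Build a DAP2-style constraint: 'field,field&field<op>value&...'.

    projection: iterable of field names to return.
    selections: iterable of (field, operator, value) tuples.
    """
    parts = [",".join(projection)]
    for field, op, value in selections:
        if op not in DAP2_OPERATORS:
            raise ValueError(f"unsupported operator: {op!r}")
        parts.append(f"{field}{op}{value}")
    return "&".join(parts)


# Hypothetical sequence 'station' with fields 'time' and 'temp':
# return both fields, but only for records where temp exceeds 20.
expr = sequence_constraint(
    ["station.time", "station.temp"],
    [("station.temp", ">", 20)],
)
print(expr)  # station.time,station.temp&station.temp>20
```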

References

  1. "DAP4 Specification - OPeNDAP Documentation". docs.opendap.org.
  2. "Home — OPeNDAP". January 7, 2024.
  3. "Unidata | THREDDS Data Server (TDS)". www.unidata.ucar.edu.
  4. "ERDDAP - Home Page". upwell.pfeg.noaa.gov.
  5. "A Graphical netCDF File Browser".
  6. "OPeNDAP software".
  7. "DAP4: Specification Volume 1 - OPeNDAP Documentation". docs.opendap.org.