OPeNDAP

OPeNDAP is an acronym for "Open-source Project for a Network Data Access Protocol," an endeavor focused on enhancing the retrieval of remote, structured data through a Web-based architecture and a discipline-neutral Data Access Protocol (DAP). Widely used, especially in Earth science, the protocol is layered on HTTP. Its current specification is DAP4 [1], though the previous version, DAP2, remains broadly used. Developed and advanced openly and collaboratively by the non-profit OPeNDAP, Inc. [2], DAP is intended to enable remote, selective data retrieval as an easily invoked Web service. OPeNDAP, Inc. also develops and maintains zero-cost reference implementations of DAP in both server-side and client-side software.

"OPeNDAP" is often used in place of "DAP" to denote the protocol, but it may also refer to an entire DAP-based data-retrieval architecture. Other DAP-centered architectures, such as THREDDS [3] and ERDDAP (for example, the NOAA GEO-IDE UAF ERDDAP [4]), exhibit significant interoperability with one another as well as with systems employing OPeNDAP's own open-source servers and software.

A DAP client can be an ordinary browser or even a spreadsheet, though with limited functionality (see OPeNDAP's Web page on Available Client Software). More typically, DAP clients are:

Regardless of their types, and whether developed commercially or by an end user, clients almost universally link to DAP servers through libraries that implement DAP2 or DAP4 in one language or another. OPeNDAP offers open-source libraries in C++ and Java, but many clients rely on community-developed libraries such as PyDAP or, especially, the NetCDF suite. Developed and maintained in multiple programming languages by the Unidata Program at the University Corporation for Atmospheric Research (UCAR), all NetCDF libraries include embedded capabilities for retrieving array-style data from DAP servers.

A data-using client references a data set by its URL and requests metadata or content by issuing an HTTP request to a DAP server, usually through an embedded DAP library. Content requests are usually preceded by requests for metadata describing the structure and other details of the referenced data set. With this information, the client may construct DAP constraint expressions [7] to retrieve specific content (i.e., subsets) from the source. OPeNDAP servers offer various types of responses, including XML, JSON, HTML and ASCII, depending on the specific form of the client's request. In response to requests for content, OPeNDAP servers can return multipart MIME documents that include a binary portion with NetCDF or DAP-native encoding. (These binary forms offer a compact means of delivering large volumes of content, and the DAP-native form may even be streamed if desired.)
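
The request pattern described above can be sketched with Python's standard library alone. This is a minimal illustration rather than a reference client: it assumes a DAP2 server that follows the common convention of appending a response suffix (such as .dds, .das or .dods) to the data-set URL and placing the constraint expression after a question mark. The host name, data-set path and variable name "sst" are hypothetical, and no request is actually sent.

```python
# Hypothetical data-set URL; substitute a real DAP2 endpoint.
DATASET_URL = "http://example.org/opendap/data/sst.nc"


def hyperslab(start, stop, stride=1):
    """Format one DAP2 array index as [start:stride:stop]; stop is inclusive."""
    if start < 0 or stop < start or stride < 1:
        raise ValueError("need 0 <= start <= stop and stride >= 1")
    return f"[{start}:{stride}:{stop}]"


def projection(variable, *index_ranges):
    """Build e.g. sst[0:1:0][10:1:19] from (start, stop[, stride]) tuples."""
    return variable + "".join(hyperslab(*r) for r in index_ranges)


def dap2_url(dataset_url, suffix, constraint=""):
    """Combine data-set URL, response suffix and optional constraint expression.

    Percent-encoding is omitted for readability; some HTTP stacks may
    require the brackets to be encoded.
    """
    url = f"{dataset_url}.{suffix}"
    if constraint:
        url += "?" + constraint
    return url


# Step 1: metadata requests (structure, then attributes).
metadata_urls = [dap2_url(DATASET_URL, s) for s in ("dds", "das")]

# Step 2: a content request for one time step and a strided spatial window.
subset = projection("sst", (0, 0), (10, 19), (20, 39, 2))
data_url = dap2_url(DATASET_URL, "dods", subset)
print(data_url)
```

In practice a DAP library such as PyDAP or NetCDF would assemble and issue equivalent requests on the client's behalf.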

OPeNDAP's software for building DAP servers (on top of Apache) is called Hyrax and includes adapters that facilitate serving a wide variety of source data. DAP servers most frequently enable remote access to large HDF or NetCDF files, but the source data can exist in databases or other formats, including user-defined ones. When source data are organized as files, DAP retrievals enable, via subsetting, finer-grained access than does the File Transfer Protocol (FTP), which transfers whole files. Furthermore, OPeNDAP servers can aggregate subsets from multiple files for delivery in a single retrieval. Taken together, subsetting, aggregation and streaming can yield substantial data-access efficiencies, even in the presence of slow networks.
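
A rough calculation shows why subsetting matters. The sketch below compares the payload of a whole-file transfer with that of a single array subset, using the inclusive start:stride:stop index ranges of DAP2. The grid shape, the 4-byte value size and the chosen window are assumptions made up for the illustration, and response headers and metadata are ignored.

```python
from math import prod


def selected_count(start, stop, stride=1):
    """Number of indices selected by [start:stride:stop] with inclusive stop."""
    return (stop - start) // stride + 1


def transfer_bytes(index_ranges, bytes_per_value):
    """Payload size of a subset given one (start, stop[, stride]) per dimension."""
    return prod(selected_count(*r) for r in index_ranges) * bytes_per_value


# Assumed grid: 365 daily steps x 720 latitudes x 1440 longitudes, 32-bit floats.
full_shape = (365, 720, 1440)
full_bytes = prod(full_shape) * 4

# One day, 40 latitude rows and 80 longitude columns.
subset_bytes = transfer_bytes([(0, 0), (300, 339), (600, 679)], 4)

print(full_bytes)                  # 1513728000
print(subset_bytes)                # 12800
print(full_bytes // subset_bytes)  # 118260
```

Under these assumptions the subset is about five orders of magnitude smaller than the file, which is the kind of saving that makes retrieval practical over slow networks.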

OPeNDAP and other DAP servers are used operationally in government agencies, including NASA and NOAA, for providing access to Earth science data, including satellite imagery and other high-volume information sources. The DAP data model embraces a comprehensive set of data structures, including multidimensional arrays and nested sequences (i.e., records), complemented by a correspondingly rich set of constraint expressions. Hence the OPeNDAP data-retrieval architecture has demonstrated utility across a broad range of scientific data types, including data generated via simulations and data generated via observations (whether remotely sensed or measured in situ).
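
For sequence (record-style) data, a constraint expression combines a projection (which fields to return) with selection clauses (which records to keep). The following sketch builds such an expression in the DAP2 style and emulates its effect on a small in-memory list of records. The field names and values are invented for the illustration, real servers may require fully qualified field names, and only the basic relational operators are modelled.

```python
import operator

# Basic relational operators; other operators (e.g. pattern matching) are not modelled.
OPS = {"<": operator.lt, "<=": operator.le, ">": operator.gt,
       ">=": operator.ge, "=": operator.eq, "!=": operator.ne}


def sequence_constraint(fields, selections):
    """Build 'f1,f2&f3>10' from field names and (field, op, value) clauses."""
    ce = ",".join(fields)
    for field, op, value in selections:
        if op not in OPS:
            raise ValueError(f"unsupported operator: {op}")
        ce += f"&{field}{op}{value}"
    return ce


def apply_locally(records, fields, selections):
    """Emulate the constraint on a list of dicts: filter rows, then project fields."""
    kept = [r for r in records
            if all(OPS[op](r[f], v) for f, op, v in selections)]
    return [{f: r[f] for f in fields} for r in kept]


# Hypothetical in-situ measurements.
records = [
    {"depth": 5, "temp": 21.5, "salinity": 35.1},
    {"depth": 50, "temp": 14.0, "salinity": 35.4},
    {"depth": 10, "temp": 20.2, "salinity": 35.2},
]
fields = ["depth", "temp"]
selections = [("temp", ">", 20), ("depth", "<=", 10)]

print(sequence_constraint(fields, selections))  # depth,temp&temp>20&depth<=10
print(apply_locally(records, fields, selections))
```

On a real server the filtering happens before transmission, so only the matching records and requested fields cross the network.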

References