HTSQL

Last updated

Hyper Text Structured Query Language (HTSQL) is a schema-driven URI-to-SQL query language that takes a request over HTTP, converts it to a SQL query, executes the query against a database, and returns the results in a format best suited for the user agent (CSV, HTML, etc.) [1] The HTSQL language is implemented on "HTSQL servers," which use HTSQL to convert web requests into equivalent SQL, executes requests on a server-side database, and returns results in XML, HTML, CSV, JSON, or YAML formats. The current implementation as of April 2010 uses Python and works with PostgreSQL, MySQL, SQLite, Oracle, and Microsoft SQL Server databases.

Contents

HTSQL was prototyped and developed by Clark Evans (who proposed YAML in 2001 [2] ) and implemented by Kirill Simonov, [3] both of Prometheus Research.

Scope

HTSQL's intended audience is business users who are not SQL programmers. It seeks to simplify non-intuitive relational queries, such as aggregates and projections. The goal is not completeness in replicating SQL, but simplicity and intuitiveness. HTSQL allows users who are familiar with the data, but not SQL syntax, to directly access the database without going through an application. [4]

Syntax

HTSQL uses a URI-based query syntax. Queries typically begin with a single table, optionally prefixed by a schema and followed by a filter expression to limit the number of rows that are returned. [5]

Examples

Joins can be made across tables by referencing the column name preceded by the table name. This example requests the title of a degree program from the program table and the school name from the school table:

/program{school.name, title} 

The result of this query from the HTSQL command-line system is:

program~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ school.name               | title --------------------------+----------------------------------- School of Art and Design  | Post Baccalaureate in Art History School of Art and Design  | Bachelor of Arts in Art History School of Art and Design  | Bachelor of Arts in Studio Art School of Business        | Graduate Certificate in Accounting  ...

Aggregate expressions can be used when there are many-to-one relationships in the database.

/school{name, count(program), count(department)} 

The result shows the count (number) of programs and departments in each school:

school~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ name                     | count(program) | count(department) -------------------------+----------------+------------------ School of Art and Design | 3              | 2 School of Business       | 5              | 3 College of Education     | 7              | 2 School of Engineering    | 8              | 4 

Applications

HTSQL can be used for ad hoc database queries. Its developers consider it to be more concise and intuitive and therefore less error-prone than SQL for non-technical users. [5]

HTSQL can also be used by "accidental programmers", such as data analysts and UI developers. Besides more readable, HTTP-based queries, HTSQL's developers claim it helps avoid common SQL mistakes like accidental joins. [6]

Versions and licensing

HTSQL follows a dual license model. The software may be used free of charge with open source relational databases, including MySQL and PostgreSQL. [7] Source code is available on the bitbucket site. [8]

A commercial version of HTSQL, compatible with commercial databases like Oracle and Microsoft SQL Server, is also available. [1]

Similar initiatives

Related Research Articles

<span class="mw-page-title-main">Database</span> Organized collection of data in computing

In computing, a database is an organized collection of data stored and accessed electronically. Small databases can be stored on a file system, while large databases are hosted on computer clusters or cloud storage. The design of databases spans formal techniques and practical considerations, including data modeling, efficient data representation and storage, query languages, security and privacy of sensitive data, and distributed computing issues, including supporting concurrent access and fault tolerance.

<span class="mw-page-title-main">MySQL</span> SQL database engine software

MySQL is an open-source relational database management system (RDBMS). Its name is a combination of "My", the name of co-founder Michael Widenius's daughter My, and "SQL", the acronym for Structured Query Language. A relational database organizes data into one or more data tables in which data may be related to each other; these relations help structure the data. SQL is a language programmers use to create, modify and extract data from the relational database, as well as control user access to the database. In addition to relational databases and SQL, an RDBMS like MySQL works with an operating system to implement a relational database in a computer's storage system, manages users, allows for network access and facilitates testing database integrity and creation of backups.

<span class="mw-page-title-main">PostgreSQL</span> Free and open-source relational database management system

PostgreSQL, also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. It was originally named POSTGRES, referring to its origins as a successor to the Ingres database developed at the University of California, Berkeley. In 1996, the project was renamed to PostgreSQL to reflect its support for SQL. After a review in 2007, the development team decided to keep the name PostgreSQL and the alias Postgres.

Structured Query Language, abbreviated as SQL, is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). It is particularly useful in handling structured data, i.e. data incorporating relations among entities and variables.

<span class="mw-page-title-main">Object–relational database</span> Database management system

An object–relational database (ORD), or object–relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language. In addition, just as with pure relational systems, it supports extension of the data model with custom data types and methods.

In the context of SQL, data definition or data description language (DDL) is a syntax for creating and modifying database objects such as tables, indices, and users. DDL statements are similar to a computer programming language for defining data structures, especially database schemas. Common examples of DDL statements include CREATE, ALTER, and DROP.

A query language, also known as data query language or database query language (DQL), is a computer language used to make queries in databases and information systems. A well known example is the Structured Query Language (SQL).

Dataphor is an open-source truly-relational database management system (RDBMS) and its accompanying user interface technologies, which together are designed to provide highly declarative software application development. The Dataphor Server has its own storage engine or it can be a virtual, or federated, DBMS, meaning that it can utilize other database engines for storage.

ADO.NET is a data access technology from the Microsoft .NET Framework that provides communication between relational and non-relational systems through a common set of components. ADO.NET is a set of computer software components that programmers can use to access data and data services from a database. It is a part of the base class library that is included with the Microsoft .NET Framework. It is commonly used by programmers to access and modify data stored in relational database systems, though it can also access data in non-relational data sources. ADO.NET is sometimes considered an evolution of ActiveX Data Objects (ADO) technology, but was changed so extensively that it can be considered an entirely new product.

An XML database is a data persistence software system that allows data to be specified, and sometimes stored, in XML format. This data can be queried, transformed, exported and returned to a calling system. XML databases are a flavor of document-oriented databases which are in turn a category of NoSQL database.

The following tables compare general and technical information for a number of relational database management systems. Please see the individual products' articles for further information. Unless otherwise specified in footnotes, comparisons are based on the stable versions without any add-ons, extensions or external programs.

In a database, a view is the result set of a stored query, which can be queried in the same manner as a persistent database collection object. This pre-established query command is kept in the database dictionary. Unlike ordinary base tables in a relational database, a view does not form part of the physical schema: as a result set, it is a virtual table computed or collated dynamically from data in the database when access to that view is requested. Changes applied to the data in a relevant underlying table are reflected in the data shown in subsequent invocations of the view.

The object–relational impedance mismatch is a set of conceptual and technical difficulties that are often encountered when a relational database management system (RDBMS) is being served by an application program written in an object-oriented programming language or style, particularly because objects or class definitions, must be mapped to database tables defined by a relational schema.

<span class="mw-page-title-main">LAMP (software bundle)</span> Acronym for a common web hosting solution

LAMP is an acronym denoting one of the most common software stacks for many of the web's most popular applications. However, LAMP now refers to a generic software stack model and its components are largely interchangeable.

Entity–attribute–value model (EAV) is a data model to encode, in a space-efficient manner, entities where the number of attributes that can be used to describe them is potentially vast, but the number that will actually apply to a given entity is relatively modest. Such entities correspond to the mathematical notion of a sparse matrix.

A MultiValue database is a type of NoSQL and multidimensional database, typically considered synonymous with PICK, a database originally developed as the Pick operating system.

The following tables compare general and technical information for a number of available database administration tools. Please see individual product articles for further information. This article is neither all-inclusive nor necessarily up to date.

A hierarchical query is a type of SQL query that handles hierarchical model data. They are special cases of more general recursive fixpoint queries, which compute transitive closures.

<span class="mw-page-title-main">Apache Hive</span> Database engine

Apache Hive is a data warehouse software project built on top of Apache Hadoop for providing data query and analysis. Hive gives an SQL-like interface to query data stored in various databases and file systems that integrate with Hadoop. Traditional SQL queries must be implemented in the MapReduce Java API to execute SQL applications and queries over distributed data. Hive provides the necessary SQL abstraction to integrate SQL-like queries (HiveQL) into the underlying Java without the need to implement queries in the low-level Java API. Since most data warehousing applications work with SQL-based querying languages, Hive aids portability of SQL-based applications to Hadoop. While initially developed by Facebook, Apache Hive is used and developed by other companies such as Netflix and the Financial Industry Regulatory Authority (FINRA). Amazon maintains a software fork of Apache Hive included in Amazon Elastic MapReduce on Amazon Web Services.

<span class="mw-page-title-main">Apache Drill</span> Open-source software framework

Apache Drill is an open-source software framework that supports data-intensive distributed applications for interactive analysis of large-scale datasets. Built chiefly by contributions from developers from MapR, Drill is inspired by Google's Dremel system, also productized as BigQuery. Drill is an Apache top-level project. Tom Shiran is the founder of the Apache Drill Project. It was designated an Apache Software Foundation top-level project in December 2016.

References

  1. 1 2 "Home". htsql.org.
  2. YAML
  3. "Hyper text structured query language" . Retrieved 4 May 2023.
  4. "Introduction — HTSQL -- HyperText Structured Query Language v1.0 documentation". www.htsql.org. Archived from the original on 11 July 2010.
  5. 1 2 "HTSQL Manual: HyperText Structured Query Language". clarkevans.com. Archived from the original on 8 July 2011.
  6. "HTSQL -- a Query Language for Accidental Programmers".
  7. "HTSQL Permissive License". htsql.org. Retrieved 4 May 2023.
  8. "prometheus / htsql / source — bitbucket.org". bitbucket.org. Archived from the original on 26 September 2010.