In SQL, a window function or analytic function [1] is a function which uses values from one or multiple rows to return a value for each row. (This contrasts with an aggregate function, which returns a single value for multiple rows.) Window functions have an OVER clause; any function without an OVER clause is not a window function, but rather an aggregate or single-row (scalar) function. [2]
As an example, here is a query which uses a window function to compare the salary of each employee with the average salary of their department (example from the PostgreSQL documentation): [3]
SELECTdepname,empno,salary,avg(salary)OVER(PARTITIONBYdepname)FROMempsalary;
Output:
depname | empno | salary | avg ----------+-------+--------+---------------------- develop | 11 | 5200 | 5020.0000000000000000 develop | 7 | 4200 | 5020.0000000000000000 develop | 9 | 4500 | 5020.0000000000000000 develop | 8 | 6000 | 5020.0000000000000000 develop | 10 | 5200 | 5020.0000000000000000 personnel | 5 | 3500 | 3700.0000000000000000 personnel | 2 | 3900 | 3700.0000000000000000 sales | 3 | 4800 | 4866.6666666666666667 sales | 1 | 5000 | 4866.6666666666666667 sales | 4 | 4800 | 4866.6666666666666667 (10 rows)
The PARTITION BY
clause groups rows into partitions, and the function is applied to each partition separately. If the PARTITION BY
clause is omitted (such as with an empty OVER()
clause), then the entire result set is treated as a single partition. [4] For this query, the average salary reported would be the average taken over all rows.
Window functions are evaluated after aggregation (after the GROUP BY
clause and non-window aggregate functions, for example). [1]
According to the PostgreSQL documentation, a window function has the syntax of one of the following: [4]
function_name([expression[,expression...]])OVERwindow_namefunction_name([expression[,expression...]])OVER(window_definition)function_name(*)OVERwindow_namefunction_name(*)OVER(window_definition)
where window_definition
has syntax:
[existing_window_name][PARTITIONBYexpression[,...]][ORDERBYexpression[ASC|DESC|USINGoperator][NULLS{FIRST|LAST}][,...]][frame_clause]
frame_clause
has the syntax of one of the following:
{RANGE|ROWS|GROUPS}frame_start[frame_exclusion]{RANGE|ROWS|GROUPS}BETWEENframe_startANDframe_end[frame_exclusion]
frame_start
and frame_end
can be UNBOUNDED PRECEDING
, offset PRECEDING
, CURRENT ROW
, offset FOLLOWING
, or UNBOUNDED FOLLOWING
. frame_exclusion
can be EXCLUDE CURRENT ROW
, EXCLUDE GROUP
, EXCLUDE TIES
, or EXCLUDE NO OTHERS
.
expression
refers to any expression that does not contain a call to a window function.
Notation:
Window functions allow access to data in the records right before and after the current record. [5] [6] [7] [8] A window function defines a frame or window of rows with a given length around the current row, and performs a calculation across the set of data in the window. [9] [10]
NAME | ------------ Aaron| <-- Preceding (unbounded) Andrew| Amelia| James| Jill| Johnny| <-- 1st preceding row Michael| <-- Current row Nick| <-- 1st following row Ophelia| Zach| <-- Following (unbounded)
In the above table, the next query extracts for each row the values of a window with one preceding and one following row:
SELECTLAG(name,1)OVER(ORDERBYname)"prev",name,LEAD(name,1)OVER(ORDERBYname)"next"FROMpeopleORDERBYname
The result query contains the following values:
| PREV | NAME | NEXT | |----------|----------|----------| | (null)| Aaron| Andrew| | Aaron| Andrew| Amelia| | Andrew| Amelia| James| | Amelia| James| Jill| | James| Jill| Johnny| | Jill| Johnny| Michael| | Johnny| Michael| Nick| | Michael| Nick| Ophelia| | Nick| Ophelia| Zach| | Ophelia| Zach| (null)|
Window functions were introduced to the SQL:2003 standard and had functionality expanded in later specifications. [11]
Support for particular database implementations was added as follows:
PostgreSQL, also known as Postgres, is a free and open-source relational database management system (RDBMS) emphasizing extensibility and SQL compliance. PostgreSQL features transactions with atomicity, consistency, isolation, durability (ACID) properties, automatically updatable views, materialized views, triggers, foreign keys, and stored procedures. It is supported on all major operating systems, including Linux, FreeBSD, OpenBSD, macOS, and Windows, and handles a range of workloads from single machines to data warehouses or web services with many concurrent users.
Structured Query Language (SQL) is a domain-specific language used to manage data, especially in a relational database management system (RDBMS). It is particularly useful in handling structured data, i.e., data incorporating relations among entities and variables.
A join clause in the Structured Query Language (SQL) combines columns from one or more tables into a new table. The operation corresponds to a join operation in relational algebra. Informally, a join stitches two tables and puts on the same row records with matching fields : INNER
, LEFT OUTER
, RIGHT OUTER
, FULL OUTER
and CROSS
.
The SQL SELECT statement returns a result set of records, from one or more tables.
An SQL INSERT statement adds one or more records to any single table in a relational database.
A database trigger is procedural code that is automatically executed in response to certain events on a particular table or view in a database. The trigger is mostly used for maintaining the integrity of the information on the database. For example, when a new record is added to the employees table, new records should also be created in the tables of the taxes, vacations and salaries. Triggers can also be used to log historical data, for example to keep track of employees' previous salaries.
The following tables compare general and technical information for a number of relational database management systems. Please see the individual products' articles for further information. Unless otherwise specified in footnotes, comparisons are based on the stable versions without any add-ons, extensions or external programs.
In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL null serves to fulfil the requirement that all true relational database management systems (RDBMS) support a representation of "missing information and inapplicable information". Codd also introduced the use of the lowercase Greek omega (ω) symbol to represent null in database theory. In SQL, NULL
is a reserved word used to identify this marker.
Multidimensional Expressions (MDX) is a query language for online analytical processing (OLAP) using a database management system. Much like SQL, it is a query language for OLAP cubes. It is also a calculation language, with syntax similar to spreadsheet formulae.
A WHERE
clause in SQL specifies that a SQL Data Manipulation Language (DML) statement should only affect rows that meet specified criteria. The criteria are expressed in the form of predicates. WHERE
clauses are not mandatory clauses of SQL DML statements, but can be used to limit the number of rows affected by a SQL DML statement or returned by a query. In brief SQL WHERE clause is used to extract only those results from a SQL statement, such as: SELECT
, INSERT
, UPDATE
, or DELETE
statement.
In computing, a materialized view is a database object that contains the results of a query. For example, it may be a local copy of data located remotely, or may be a subset of the rows and/or columns of a table or join result, or may be a summary using an aggregate function.
A relational database management system uses SQL MERGE
statements to INSERT
new records or UPDATE
existing records depending on whether condition matches. It was officially introduced in the SQL:2003 standard, and expanded in the SQL:2008 standard.
A HAVING
clause in SQL specifies that an SQL SELECT
statement must only return rows where aggregate values meet the specified conditions.
The DUAL table is a special one-row, one-column table present by default in Oracle and other database installations. In Oracle, the table has a single VARCHAR2(1) column called DUMMY that has a value of 'X'. It is suitable for use in selecting a pseudo column such as SYSDATE or USER.
SQL/XML or XML-Related Specifications is part 14 of the Structured Query Language (SQL) specification. In addition to the traditional predefined SQL data types like NUMERIC, CHAR, TIMESTAMP, ...
it introduces the predefined data type XML
together with constructors, several routines, functions, and XML-to-SQL data type mappings to support manipulation and storage of XML in a SQL database.
A hierarchical query is a type of SQL query that handles hierarchical model data. They are special cases of more general recursive fixpoint queries, which compute transitive closures.
In a SQL database query, a correlated subquery is a subquery that uses values from the outer query. Because the subquery may be evaluated once for each row processed by the outer query, it can be slow.
An alias is a feature of SQL that is supported by most, if not all, relational database management systems (RDBMSs). Aliases provide users with the ability to reduce the amount of code required for a query, and to make queries simpler to understand. In addition, aliasing is required when doing self joins
In relational databases a virtual column is a table column whose value(s) is automatically computed using other columns values, or another deterministic expression. Virtual columns are defined of SQL:2003 as Generated Column, and are only implemented by some DBMSs, like MariaDB, SQL Server, Oracle, PostgreSQL, SQLite and Firebird.
The syntax of the SQL programming language is defined and maintained by ISO/IEC SC 32 as part of ISO/IEC 9075. This standard is not freely available. Despite the existence of the standard, SQL code is not completely portable among different database systems without adjustments.