Query plan

Last updated

A query plan (or query execution plan) is a sequence of steps used to access data in a SQL relational database management system. This is a specific case of the relational model concept of access plans.

Contents

Since SQL is declarative, there are typically many alternative ways to execute a given query, with widely varying performance. When a query is submitted to the database, the query optimizer evaluates some of the different, correct possible plans for executing the query and returns what it considers the best option. Because query optimizers are imperfect, database users and administrators sometimes need to manually examine and tune the plans produced by the optimizer to get better performance.

Generating query plans

A given database management system may offer one or more mechanisms for returning the plan for a given query. Some packages feature tools which will generate a graphical representation of a query plan. Other tools allow a special mode to be set on the connection to cause the DBMS to return a textual description of the query plan. Another mechanism for retrieving the query plan involves querying a virtual database table after executing the query to be examined. In Oracle, for instance, this can be achieved using the EXPLAIN PLAN statement.

Graphical plans

Microsoft SQL Server Management Studio displaying a sample query plan. SQLServer QueryPlan.png
Microsoft SQL Server Management Studio displaying a sample query plan.

The Microsoft SQL Server Management Studio tool, which ships with Microsoft SQL Server, for example, shows this graphical plan when executing this two-table join example against an included sample database:

SELECT*FROMHumanResources.EmployeeASeINNERJOINPerson.ContactAScONe.ContactID=c.ContactIDORDERBYc.LastName

The UI allows exploration of various attributes of the operators involved in the query plan, including the operator type, the number of rows each operator consumes or produces, and the expected cost of each operator's work.

Textual plans

The textual plan given for the same query in the screenshot is shown here:

StmtText----|--Sort(ORDERBY:([c].[LastName]ASC))|--NestedLoops(InnerJoin,OUTERREFERENCES:([e].[ContactID],[Expr1004])WITHUNORDEREDPREFETCH)|--ClusteredIndexScan(OBJECT:([AdventureWorks].[HumanResources].[Employee].[PK_Employee_EmployeeID]AS[e]))|--ClusteredIndexSeek(OBJECT:([AdventureWorks].[Person].[Contact].[PK_Contact_ContactID]AS[c]),SEEK:([c].[ContactID]=[AdventureWorks].[HumanResources].[Employee].[ContactID]as[e].[ContactID])ORDEREDFORWARD)

It indicates that the query engine will do a scan over the primary key index on the Employee table and a matching seek through the primary key index (the ContactID column) on the Contact table to find matching rows. The resulting rows from each side will be shown to a nested loops join operator, sorted, then returned as the result set to the connection.

In order to tune the query, the user must understand the different operators that the database may use, and which ones might be more efficient than others while still providing semantically correct query results.

Database tuning

Reviewing the query plan can present opportunities for new indexes or changes to existing indexes. It can also show that the database is not properly taking advantage of existing indexes (see query optimizer).

Query tuning

A query optimizer will not always choose the most efficient query plan for a given query. In some databases the query plan can be reviewed, problems found, and then the query optimizer gives hints on how to improve it. In other databases, alternatives to express the same query (other queries that return the same results) can be tried. Some query tools can generate embedded hints in the query, for use by the optimizer.

Some databases - like Oracle - provide a plan table for query tuning. This plan table will return the cost and time for executing a query. Oracle offers two optimization approaches:

  1. CBO or Cost Based Optimization
  2. RBO or Rule Based Optimization

RBO is slowly being deprecated. For CBO to be used, all the tables referenced by the query must be analyzed. To analyze a table, a DBA can launch code from the DBMS_STATS package.

Other tools for query optimization include:

  1. SQL Trace [1]
  2. Oracle Trace and TKPROF [2]
  3. Microsoft SMS (SQL) Execution Plan [3]
  4. Tableau Performance Recording (all DB) [4]

Related Research Articles

<span class="mw-page-title-main">Database</span> Organized collection of data in computing

In computing, a database is an organized collection of data or a type of data store based on the use of a database management system (DBMS), the software that interacts with end users, applications, and the database itself to capture and analyze the data. The DBMS additionally encompasses the core facilities provided to administer the database. The sum total of the database, the DBMS and the associated applications can be referred to as a database system. Often the term "database" is also used loosely to refer to any of the DBMS, the database system or an application associated with the database.

A relational database is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relational database systems are equipped with the option of using SQL for querying and updating the database.

<span class="mw-page-title-main">Object–relational database</span> Database management system

An object–relational database (ORD), or object–relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model: objects, classes and inheritance are directly supported in database schemas and in the query language. In addition, just as with pure relational systems, it supports extension of the data model with custom data types and methods.

<span class="mw-page-title-main">Ingres (database)</span>

Ingres Database is a proprietary SQL relational database management system intended to support large commercial and government applications.

In database theory, relational algebra is a theory that uses algebraic structures for modeling data, and defining queries on it with a well founded semantics. The theory was introduced by Edgar F. Codd.

Online analytical processing, or OLAP, is an approach to answer multi-dimensional analytical (MDA) queries swiftly in computing. OLAP is part of the broader category of business intelligence, which also encompasses relational databases, report writing and data mining. Typical applications of OLAP include business reporting for sales, marketing, management reporting, business process management (BPM), budgeting and forecasting, financial reporting and similar areas, with new applications emerging, such as agriculture.

<span class="mw-page-title-main">Join (SQL)</span> SQL clause

A join clause in the Structured Query Language (SQL) combines columns from one or more tables into a new table. The operation corresponds to a join operation in relational algebra. Informally, a join stitches two tables and puts on the same row records with matching fields : INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER and CROSS.

The SQL SELECT statement returns a result set of records, from one or more tables.

Dataphor is an open-source truly-relational database management system (RDBMS) and its accompanying user interface technologies, which together are designed to provide highly declarative software application development. The Dataphor Server has its own storage engine or it can be a virtual, or federated, DBMS, meaning that it can utilize other database engines for storage.

In a database, a view is the result set of a stored query, which can be queried in the same manner as a persistent database collection object. This pre-established query command is kept in the data dictionary. Unlike ordinary base tables in a relational database, a view does not form part of the physical schema: as a result set, it is a virtual table computed or collated dynamically from data in the database when access to that view is requested. Changes applied to the data in a relevant underlying table are reflected in the data shown in subsequent invocations of the view.

<span class="mw-page-title-main">Null (SQL)</span> Marker used in SQL databases to indicate a value does not exist

In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL null serves to fulfil the requirement that all true relational database management systems (RDBMS) support a representation of "missing information and inapplicable information". Codd also introduced the use of the lowercase Greek omega (ω) symbol to represent null in database theory. In SQL, NULL is a reserved word used to identify this marker.

In computer science, a database cursor is a mechanism that enables traversal over the records in a database. Cursors facilitate subsequent processing in conjunction with the traversal, such as retrieval, addition and removal of database records. The database cursor characteristic of traversal makes cursors akin to the programming language concept of iterator.

Microsoft SQL Server Analysis Services (SSAS) is an online analytical processing (OLAP) and data mining tool in Microsoft SQL Server. SSAS is used as a tool by organizations to analyze and make sense of information possibly spread out across multiple databases, or in disparate tables or files. Microsoft has included a number of services in SQL Server related to business intelligence and data warehousing. These services include Integration Services, Reporting Services and Analysis Services. Analysis Services includes a group of OLAP and data mining capabilities and comes in two flavors multidimensional and tabular, where the difference between the two is how the data is presented. In a tabular model, the information is arranged in two-dimensional tables which can thus be more readable for a human. A multidimensional model can contain information with many degrees of freedom, and must be unfolded to increase readability by a human.

Query optimization is a feature of many relational database management systems and other databases such as NoSQL and graph databases. The query optimizer attempts to determine the most efficient way to execute a given query by considering the possible query plans.

Database administration is the function of managing and maintaining database management systems (DBMS) software. Mainstream DBMS software such as Oracle, IBM Db2 and Microsoft SQL Server need ongoing management. As such, corporations that use DBMS software often hire specialized information technology personnel called database administrators or DBAs.

Language Integrated Query is a Microsoft .NET Framework component that adds native data querying capabilities to .NET languages, originally released as a major part of .NET Framework 3.5 in 2007.

Apache Empire-db is a Java library that provides a high level object-oriented API for accessing relational database management systems (RDBMS) through JDBC. Apache Empire-db is open source and provided under the Apache License 2.0 from the Apache Software Foundation.

CUBRID ( "cube-rid") is an open-source SQL-based relational database management system (RDBMS) with object extensions developed by CUBRID Corp. for OLTP. The name CUBRID is a combination of the two words cube and bridge, cube standing for a space for data and bridge standing for data bridge.

In database management systems (DBMS), a prepared statement, parameterized statement, or parameterized query is a feature where the database pre-compiles SQL code and stores the results, separating it from data. Benefits of prepared statements are:

The following is provided as an overview of and topical guide to databases:

References

  1. "SQL Trace". Microsoft.com. Microsoft. Retrieved 30 March 2020.
  2. "Using SQL Trace and TKPROF". Oracle.com. Retrieved 30 March 2020.
  3. "Execution Plans". Microsoft.com. Microsoft. Retrieved 30 March 2020.
  4. "Optimize Workbook Performance". Tableau.com. Tableau Inc. Retrieved 30 March 2020.