Unique key

Last updated

In relational database management systems, a unique key is a candidate key. All the candidate keys of a relation can uniquely identify the records of the relation, but only one of them is used as the primary key of the relation. The remaining candidate keys are called unique keys because they can uniquely identify a record in a relation. Unique keys can consist of multiple columns. Unique keys are also called alternate keys. Unique keys are an alternative to the primary key of the relation. In SQL, the unique keys have a UNIQUE constraint assigned to them in order to prevent duplicates (a duplicate entry is not valid in a unique column). Alternate keys may be used like the primary key when doing a single-table select or when filtering in a where clause, but are not typically used to join multiple tables.

Contents

Summary

Keys provide the means for database users and application software to identify, access and update information in a database table. There may be several keys in any given table. For example, in a table of employees, both employee number and login name are individually unique. The enforcement of a key constraint (i.e. a uniqueness constraint) in a table is also a data integrity feature of the database. The DBMS prevents updates that would cause duplicate key values and thereby ensures that tables always comply with the desired rules for uniqueness. Proper selection of keys when designing a database is therefore an important aspect of database integrity.

A relational database table may have one or more available unique keys (formally called candidate keys). One of those keys per table may be designated the primary key; other keys are called alternate keys.

Any key may consist of one or more attributes. For example, a Social Security Number might be a single attribute key for an employee; a combination of flight number and date might be a key consisting of two attributes for a scheduled flight.

There are several types of keys used in database modeling and implementations.

Key NameDefinition
SimpleA key made from only one attribute.
ConcatenatedA key made from more than one attribute joined together as a single key, such as part or whole name with a system generated number appended as often used for E-mail addresses.
Compound A key made from at least two attributes or simple keys, only simple keys exist in a compound key.
Composite Like a compound key, but the individual attributes need not be simple keys.
Natural A key made from data that exists outside the current database. In other words, the data is not system generated, such as a social security number imported from another system.
Surrogate An artificial key made from data that is system assigned or generated when another candidate key exists. Surrogate keys are usually numeric ID values and often used for performance reasons.[ citation needed ]
CandidateA key that may become the primary key.
PrimaryThe key that is selected as the primary key. Only one key within an entity is selected to be the primary key. This is the key that is allowed to migrate to other entities to define the relationships that exist among the entities. When the data model is instantiated into a physical database, it is the key that the system uses the most when accessing the table, or joining the tables together when selecting data.
AlternateA non-primary key that can be used to identify only one row in a table. Alternate keys may be used like a primary key in a single-table select.
Foreign A key that has migrated to another entity.

At the most basic definition, "a key is a unique identifier", [1] so unique key is a pleonasm. Keys that are within their originating entity are unique within that entity. Keys that migrate to another entity may or may not be unique, depending on the design and how they are used in the other table. Foreign keys may be the primary key in another table; for example a PersonID may become the EmployeeID in the Employee table. In this case, the EmployeeID is both a foreign key and the unique primary key, meaning that the tables have a 1:1 relationship. In the case where the person entity contained the biological father ID, the father ID would not be expected to be unique because a father may have more than one child.

Here is an example of a primary key becoming a foreign key on a related table. ID migrates from the Author table to the Book table.

AuthorTableSchema:Author(ID,Name,Address,Born)BookTableSchema:Book(ISBN,AuthorID,Title,Publisher,Price)

Here ID serves as the primary key in the table 'Author', but also as AuthorID serves as a Foreign Key in the table 'Book'. The Foreign Key serves as the link, and therefore the connection, between the two related tables in this sample database.

In a relational database, a candidate key uniquely identifies each row of data values in a database table. A candidate key comprises a single column or a set of columns in a single database table. No two distinct rows or data records in a database table can have the same data value (or combination of data values) in those candidate key columns since NULL values are not used. Depending on its design, a database table may have many candidate keys but at most one candidate key may be distinguished as the primary key.

A key constraint applies to the set of tuples in a table at any given point in time. A key is not necessarily a unique identifier across the population of all possible instances of tuples that could be stored in a table but it does imply a data integrity rule that duplicates should not be allowed in the database table. Some possible examples of keys are Social Security Numbers, ISBNs, vehicle registration numbers or user login names.

In principle any key may be referenced by foreign keys. Some SQL DBMSs only allow a foreign key constraint against a primary key but most systems will allow a foreign key constraint to reference any key of a table.

Defining keys in SQL

The definition of keys in SQL:

ALTERTABLE<tableidentifier>ADD[CONSTRAINT<constraintidentifier>]{PRIMARYKEY|UNIQUE}(<columnname>[{,<columnname>}...])

Likewise, keys can be defined as part of the CREATE TABLE SQL statement.

CREATETABLEtable_name(id_colINT,col2CHARACTERVARYING(20),key_colSMALLINTNOTNULL,...CONSTRAINTkey_uniqueUNIQUE(key_col),...)
CREATETABLEtable_name(id_colINTPRIMARYKEY,col2CHARACTERVARYING(20),...key_colSMALLINTNOTNULLUNIQUE,...)

Differences between primary key constraint and unique constraint

Primary key constraint

  1. A primary key cannot allow null (a primary key cannot be defined on columns that allow nulls).
  2. Each table cannot have more than one primary key.
  3. On some RDBMS a primary key generates a clustered index by default.

Unique constraint

  1. A unique constraint can be defined on columns that allow nulls, in which case rows that include nulls may not actually be unique across the set of columns defined by the constraint.
  2. Each table can have multiple unique constraints.
  3. On some RDBMS a unique constraint generates a nonclustered index by default.

Note that unlike the PRIMARY KEY constraint a UNIQUE constraint does not imply NOT NULL for the columns participating in the constraint. NOT NULL must be specified to make the column(s) a key. It is possible to put UNIQUE constraints on nullable columns but the SQL standard states that the constraint does not guarantee uniqueness of nullable columns (uniqueness is not enforced for rows where any of the columns contains a null).

According to the SQL [2] standard a unique constraint does not enforce uniqueness in the presence of nulls and can therefore contain several rows with identical combinations of nulls and non-null values — however not all RDBMS implement this feature according to the SQL standard. [3] [4]

See also

Related Research Articles

Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by British computer scientist Edgar F. Codd as part of his relational model.

A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A database management system used to maintain relational databases is a relational database management system (RDBMS). Many relational database systems are equipped with the option of using SQL for querying and updating the database.

The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data is represented in terms of tuples, grouped into relations. A database organized in terms of the relational model is a relational database.

First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if no attribute domain has relations as elements. Or more informally, that no table column can have tables as values. Database normalization is the process of representing a database in terms of relations in standard normal forms, where first normal is a minimal requirement. SQL-92 does not support creating or using table-valued columns, which means that using only the "traditional relational database features" most relational databases will be in first normal form by necessity. Database systems which do not require first normal form are often called NoSQL systems. Newer SQL standards like SQL:1999 have started to allow so called non-atomic types, which include composite types. Even newer versions like SQL:2016 allow JSON.

In the relational model of databases, a primary key is a specific choice of a minimal set of attributes (columns) that uniquely specify a tuple (row) in a relation (table). Informally, a primary key is "which attributes identify a record," and in simple cases constitute a single attribute: a unique ID. More formally, a primary key is a choice of candidate key ; any other candidate key is an alternate key.

A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples consisting of the foreign key attributes in one relation, R, must also exist in some other relation, S; furthermore that those attributes must also be a candidate key in S.

A candidate key, or simply a key, of a relational database is any set of columns that have a unique combination of values in each row, with the additional constraint that removing any column could produce duplicate combinations of values.

<span class="mw-page-title-main">Referential integrity</span> Where all data references are valid

Referential integrity is a property of data stating that all its references are valid. In the context of relational databases, it requires that if a value of one attribute (column) of a relation (table) references a value of another attribute, then the referenced value must exist.

In the context of SQL, data definition or data description language (DDL) is a syntax for creating and modifying database objects such as tables, indices, and users. DDL statements are similar to a computer programming language for defining data structures, especially database schemas. Common examples of DDL statements include CREATE, ALTER, and DROP.

A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data, unlike a natural key.

<span class="mw-page-title-main">Join (SQL)</span> SQL clause

A join clause in the Structured Query Language (SQL) combines columns from one or more tables into a new table. The operation corresponds to a join operation in relational algebra. Informally, a join stitches two tables and puts on the same row records with matching fields : INNER, LEFT OUTER, RIGHT OUTER, FULL OUTER and CROSS.

An SQL INSERT statement adds one or more records to any single table in a relational database.

Entity integrity is concerned with ensuring that each row of a table has a unique and non-null primary key value; this is the same as saying that each row in a table represents a single instance of the entity type modelled by the table. A requirement of E. F. Codd in his seminal paper is that a primary key of an entity, or any part of it, can never take a null value. The relational model states that every relation must have an identifier, called the primary key, in such a way that every row of the same relation be identifiable by its content, that is, by a unique and minimal value. The PK is a not empty set of attributes. The same format applies to the foreign key because each FK matches a preexistent PK. Each of attributes being part of a PK must have data values but not data marks. Morphologically, a composite primary key is in a "steady state": If it is reduced, PK will lose its property of identifying every row of its relation but if it is extended, PK will be redundant.

In the relational data model a superkey is any set of attributes that uniquely identifies each tuple of a relation. Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values. That is, non-key attributes are functionally dependent on the superkey.

A table is a collection of related data held in a table format within a database. It consists of columns and rows.

A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time said table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

<span class="mw-page-title-main">Null (SQL)</span> Marker used in SQL databases to indicate a value does not exist

In SQL, null or NULL is a special marker used to indicate that a data value does not exist in the database. Introduced by the creator of the relational database model, E. F. Codd, SQL null serves to fulfil the requirement that all true relational database management systems (RDBMS) support a representation of "missing information and inapplicable information". Codd also introduced the use of the lowercase Greek omega (ω) symbol to represent null in database theory. In SQL, NULL is a reserved word used to identify this marker.

<span class="mw-page-title-main">Virtuoso Universal Server</span> Computer software

Virtuoso Universal Server is a middleware and database engine hybrid that combines the functionality of a traditional relational database management system (RDBMS), object–relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server and file server functionality in a single system. Rather than have dedicated servers for each of the aforementioned functionality realms, Virtuoso is a "universal server"; it enables a single multithreaded server process that implements multiple protocols. The free and open source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software with Kingsley Uyi Idehen and Orri Erling as the chief software architects.

<span class="mw-page-title-main">Database model</span> Type of data model

A database model is a type of data model that determines the logical structure of a database. It fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

In relational databases, the log trigger or history trigger is a mechanism for automatic recording of information about changes inserting or/and updating or/and deleting rows in a database table.

References

  1. Awad, Elias (1985), Systems Analysis and Design, Second Edition , Richard D. Irwin, Inc., ISBN   0-256-02824-9
  2. Summary of ANSI/ISO/IEC SQL Archived April 25, 2012, at the Wayback Machine
  3. "Constraints - SQL Database Reference Material - Learn sql, read an sql manual, follow an sql tutorial, or learn how to structure an SQL query!". www.sql.org. Retrieved 16 August 2018.
  4. "Comparison of different SQL implementations". troels.arvin.dk. Retrieved 16 August 2018.