Composite key

Last updated

In database design, a composite key is a candidate key that consists of two or more attributes, [1] [2] [3] (table columns) that together uniquely identify an entity occurrence (table row).

Contents

A compound key is a composite key for which each attribute that makes up the key is a foreign key in its own right.[ citation needed ]

Advantages

Composite keys have advantages similar to that of a natural key as it is often composed of multiple natural key attributes.

Storage

Composite keys use less disk space as compared to defining a surrogate key column, this is because the composite key already exists as attributes in the table and does not need to be defined in the table just for the purpose of unique identification. This simplifies the table and also saves space.

Easier to implement and use

Composite keys are easy to implement in a database schema as their component parts are already named items in the database. When they are also natural keys, they are often intuitive for real world scenarios. They are often used when a non-composite key does not always uniquely identify a record. For example, a personal name may often, but not always, be unique in a given database, and some other field such as date of birth may be added to make uniqueness much more probable.

Disadvantages

Requirement changes

The business requirements and rules can change which can change the format of certain real world entities. Composite keys are formed of multiple natural keys which are related to the real world and with the change of their format in the real world, their format in the database will also be changed. This is inconvenient as the number of attributes of composite key will change and all the foreign keys would need to be updated.

Complexity and storage

A composite key consists of multiple attributes and the composite key will be referenced in multiple tables as the foreign key, this uses a lot of disk space as multiple columns are being stored as the foreign key instead of just possibly one. This makes the schema complex and the queries become more CPU expensive as for every join the DBMS will need to compare three attributes instead of just possibly one in case of a single natural key.

Example

An example is an entity that represents the modules each student is attending at University. The entity has a studentID and a moduleCode as its primary key. Each of the attributes that makes up the primary key is a simple key because each represents a unique reference when identifying a student in one instance and a module in the other, so this key is a compound key.

In contrast, using the same example, imagine we identified a student by their firstName + lastName (assuming that people must have different names). In a table representing students our primary key would now be firstName + lastName. Because students can have the same firstName or the same lastName these attributes are not simple keys. The primary key firstName + lastName for students is a composite key.

See also

Related Research Articles

A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A database management system used to maintain relational databases is a relational database management system (RDBMS). Many relational database systems are equipped with the option of using SQL for querying and updating the database.

<span class="mw-page-title-main">Extract, transform, load</span> Procedure in computing

In computing, extract, transform, load (ETL) is a three-phase process where data is extracted from an input source, transformed, and loaded into an output data container. The data can be collated from one or more sources and it can also be output to one or more destinations. ETL processing is typically executed using software applications but it can also be done manually by system operators. ETL software typically automates the entire process and can be run manually or on recurring schedules either as single jobs or aggregated into a batch of jobs.

In the relational model of databases, a primary key is a specific choice of a minimal set of attributes (columns) that uniquely specify a tuple (row) in a relation (table). Informally, a primary key is "which attributes identify a record," and in simple cases constitute a single attribute: a unique ID. More formally, a primary key is a choice of candidate key ; any other candidate key is an alternate key.

A foreign key is a set of attributes in a table that refers to the primary key of another table, linking these two tables. In the context of relational databases, a foreign key is subject to an inclusion dependency constraint that the tuples consisting of the foreign key attributes in one relation, R, must also exist in some other relation, S; furthermore that those attributes must also be a candidate key in S.

<span class="mw-page-title-main">Referential integrity</span> Where all data references are valid

Referential integrity is a property of data stating that all its references are valid. In the context of relational databases, it requires that if a value of one attribute (column) of a relation (table) references a value of another attribute, then the referenced value must exist.

In the context of SQL, data definition or data description language (DDL) is a syntax for creating and modifying database objects such as tables, indices, and users. DDL statements are similar to a computer programming language for defining data structures, especially database schemas. Common examples of DDL statements include CREATE, ALTER, and DROP.

A surrogate key in a database is a unique identifier for either an entity in the modeled world or an object in the database. The surrogate key is not derived from application data, unlike a natural key.

<span class="mw-page-title-main">Data dictionary</span> Set of metadata that contains definitions and representations of data elements

A data dictionary, or metadata repository, as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format". Oracle defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to databases and database management systems (DBMS):

<span class="mw-page-title-main">Star schema</span> Data warehousing schema

In computing, the star schema or star model is the simplest style of data mart schema and is the approach most widely used to develop data warehouses and dimensional data marts. The star schema consists of one or more fact tables referencing any number of dimension tables. The star schema is an important special case of the snowflake schema, and is more effective for handling simpler queries.

A natural key is a type of unique key in a database formed of attributes that exist and are used in the external world outside the database. In the relational model of data, a natural key is a superkey and is therefore a functional determinant for all attributes in a relation.

Database design is the organization of data according to a database model. The designer determines what data must be stored and how the data elements interrelate. With this information, they can begin to fit the data to the database model. A database management system manages the data accordingly.

<span class="mw-page-title-main">Fact table</span> Business data structure

In data warehousing, a fact table consists of the measurements, metrics or facts of a business process. It is located at the center of a star schema or a snowflake schema surrounded by dimension tables. Where multiple fact tables are used, these are arranged as a fact constellation schema. A fact table typically has two types of columns: those that contain facts and those that are a foreign key to dimension tables. The primary key of a fact table is usually a composite key that is made up of all of its foreign keys. Fact tables contain the content of the data warehouse and store different types of measures like additive, non-additive, and semi-additive measures.

Object–relational impedance mismatch is a set of difficulties going between data in relational data stores and data in domain-driven object models. Relational Database Management Systems (RDBMS) is the standard method for storing data in a dedicated database, while object-orientated (OO) programming is the default method for business-centric design in programming languages. The problem lies in neither relational databases nor OO programming, but in the conceptual difficulty mapping between the two logic models. Both logical models are differently implementable using database servers, programming languages, design patterns, or other technologies. Issues range from application to enterprise scale, whenever stored relational data is used in domain-driven object models, and vice versa. Object-oriented data stores can trade this problem for other implementation difficulties.

A slowly changing dimension (SCD) in data management and data warehousing is a dimension which contains relatively static data which can change slowly but unpredictably, rather than according to a regular schedule. Some examples of typical slowly changing dimensions are entities such as names of geographical locations, customers, or products.

<span class="mw-page-title-main">Virtuoso Universal Server</span> Computer software

Virtuoso Universal Server is a middleware and database engine hybrid that combines the functionality of a traditional relational database management system (RDBMS), object–relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server and file server functionality in a single system. Rather than have dedicated servers for each of the aforementioned functionality realms, Virtuoso is a "universal server"; it enables a single multithreaded server process that implements multiple protocols. The free and open source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software with Kingsley Uyi Idehen and Orri Erling as the chief software architects.

An entity–attribute–value model (EAV) is a data model optimized for the space-efficient storage of sparse—or ad-hoc—property or data values, intended for situations where runtime usage patterns are arbitrary, subject to user variation, or otherwise unforeseeable using a fixed design. The use-case targets applications which offer a large or rich system of defined property types, which are in turn appropriate to a wide set of entities, but where typically only a small, specific selection of these are instantiated for a given entity. Therefore, this type of data model relates to the mathematical notion of a sparse matrix. EAV is also known as object–attribute–value model, vertical database model, and open schema.

In relational database management systems, a unique key is a candidate key. All the candidate keys of a relation can uniquely identify the records of the relation, but only one of them is used as the primary key of the relation. The remaining candidate keys are called unique keys because they can uniquely identify a record in a relation. Unique keys can consist of multiple columns. Unique keys are also called alternate keys. Unique keys are an alternative to the primary key of the relation. In SQL, the unique keys have a UNIQUE constraint assigned to them in order to prevent duplicates. Alternate keys may be used like the primary key when doing a single-table select or when filtering in a where clause, but are not typically used to join multiple tables.

<span class="mw-page-title-main">Database model</span> Type of data model

A database model is a type of data model that determines the logical structure of a database. It fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

Knowledge extraction is the creation of knowledge from structured and unstructured sources. The resulting knowledge needs to be in a machine-readable and machine-interpretable format and must represent knowledge in a manner that facilitates inferencing. Although it is methodically similar to information extraction (NLP) and ETL, the main criterion is that the extraction result goes beyond the creation of structured information or the transformation into a relational schema. It requires either the reuse of existing formal knowledge or the generation of a schema based on the source data.

The following is provided as an overview of and topical guide to databases:

References

  1. Connolly, Thomas M.; Begg, Carolyn E. (2015). "12.3.4 Keys". Database systems: a practical approach to design, implementation, and management (6., global ed.). Boston Columbus Indianapolis: Pearson. p. 416. ISBN   978-1-292-06118-4.
  2. Elmasri, Ramez; Navathe, Sham (2017). "17.4 Indexes on Multiple Keys". Fundamentals of database systems (Seventh, global ed.). Boston Columbus Indianapolis New York San Francisco Hoboken Amsterdam Cape Town Dubai London Madrid Milan Munich Paris Montreal Toronto Delhi Mexico City São Paulo Sydney Hong Kong Seoul Singapore Taipei Tokyo: Pearson. p. 661. ISBN   978-1-292-09761-9.
  3. Coronel, Carlos; Morris, Steven (2015). "Glossary". Database systems: design, implementation, and management (12e ed.). Cengage Learning. p. 770. ISBN   978-1-305-62748-2.