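Second normal form (2NF) is a normal form used in database normalization. A relation is in second normal form if it fulfills two requirements: it is in first normal form, and no non-prime attribute is functionally dependent on a proper subset of any candidate key. A non-prime attribute is one that is not part of any candidate key.
Put simply, a relation (or table) is in 2NF if it is in 1NF and every non-prime attribute depends on the whole of every candidate key, not on just part of one.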
A dependency of one regular (non-prime) attribute on another non-prime attribute is not a 2NF concern; it is addressed by third normal form.
The second normal form was originally defined by E. F. Codd in 1971. [1]
To bring a 1NF relation into 2NF, remove the attributes that are functionally dependent on only part of a candidate key (the dependent attributes of the partial dependencies), and place them in a separate relation in which their determinant attributes form an entire candidate key.
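The 2NF condition can be checked mechanically when the functional dependencies of a relation are known. The following is a minimal sketch under that assumption, not a definitive implementation: the function names `closure`, `candidate_keys`, and `violations_2nf`, and the attribute names `A`, `B`, `C`, are hypothetical and chosen only for illustration. Candidate keys are found by brute force, which is only practical for small relations.

```python
from itertools import combinations


def closure(attrs, fds):
    """Return every attribute functionally determined by attrs.

    fds is a list of (determinant_set, dependent_set) pairs.
    """
    result = set(attrs)
    changed = True
    while changed:
        changed = False
        for lhs, rhs in fds:
            if set(lhs) <= result and not set(rhs) <= result:
                result |= set(rhs)
                changed = True
    return result


def candidate_keys(attributes, fds):
    """Return all minimal attribute sets whose closure is the whole relation."""
    attributes = set(attributes)
    keys = []
    # Try smaller sets first so that any superset of a key can be skipped.
    for size in range(1, len(attributes) + 1):
        for combo in combinations(sorted(attributes), size):
            combo = set(combo)
            if any(key <= combo for key in keys):
                continue  # contains a known key, so it is not minimal
            if closure(combo, fds) == attributes:
                keys.append(combo)
    return keys


def violations_2nf(attributes, fds):
    """List (key_part, attribute) pairs that break 2NF.

    A pair is reported when a non-prime attribute is determined by a
    proper subset of some candidate key.
    """
    keys = candidate_keys(attributes, fds)
    prime = set().union(*keys)
    non_prime = set(attributes) - prime
    found = []
    for key in keys:
        for size in range(len(key)):  # proper subsets only
            for part in combinations(sorted(key), size):
                for attr in sorted(closure(part, fds) & non_prime):
                    found.append((set(part), attr))
    return found


# Hypothetical relation R(A, B, C) with candidate key {A, B} and A -> C.
print(violations_2nf({"A", "B", "C"}, [({"A"}, {"C"})]))  # [({'A'}, 'C')]
```

Moving `C` into its own relation with `A` leaves two relations, `(A, B)` and `(A, C)`, and the same function reports no violations for either of them.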
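The following relation does not satisfy 2NF because {Manufacturer country} is functionally dependent on {Manufacturer} alone, and {Manufacturer} is not a candidate key but a proper subset of the candidate key {Manufacturer, Model}.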
In other words, since {Manufacturer country} is a non-prime attribute functionally dependent on a proper subset of a candidate key, the relation is in violation of 2NF.
Manufacturer | Model | Manufacturer country |
---|---|---|
Forte | X-Prime | Italy |
Forte | Ultraclean | Italy |
Dent-o-Fresh | EZbrush | USA |
Brushmaster | SuperBrush | USA |
Kobayashi | ST-60 | Japan |
Hoch | Toothmaster | Germany |
Hoch | X-Prime | Germany |
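The partial dependency can be observed directly in the rows above. The sketch below uses a hypothetical helper, `holds`, that tests whether a functional dependency is satisfied by a given set of rows. Note the limitation: sample data can refute a dependency, but it cannot prove that the dependency is a rule of the schema, so this is an illustration and not a design tool.

```python
COLUMNS = ("Manufacturer", "Model", "Manufacturer country")

ROWS = [
    ("Forte", "X-Prime", "Italy"),
    ("Forte", "Ultraclean", "Italy"),
    ("Dent-o-Fresh", "EZbrush", "USA"),
    ("Brushmaster", "SuperBrush", "USA"),
    ("Kobayashi", "ST-60", "Japan"),
    ("Hoch", "Toothmaster", "Germany"),
    ("Hoch", "X-Prime", "Germany"),
]


def holds(rows, columns, lhs, rhs):
    """Return True if every lhs value maps to exactly one rhs value in rows."""
    index = {name: i for i, name in enumerate(columns)}
    seen = {}
    for row in rows:
        determinant = tuple(row[index[c]] for c in lhs)
        dependent = tuple(row[index[c]] for c in rhs)
        # setdefault stores the first dependent value seen for this determinant.
        if seen.setdefault(determinant, dependent) != dependent:
            return False
    return True


# Part of the candidate key already determines the country: a partial dependency.
print(holds(ROWS, COLUMNS, ["Manufacturer"], ["Manufacturer country"]))  # True

# The other part of the key does not: X-Prime is made in Italy and in Germany.
print(holds(ROWS, COLUMNS, ["Model"], ["Manufacturer country"]))  # False
```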
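To make the design conform to 2NF, it is necessary to have two relations. To create them, remove the functionally dependent attribute {Manufacturer country} from the original relation, and place it in a new relation together with its determinant {Manufacturer}, which is the candidate key of that new relation.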
As seen below, {Manufacturer country} is removed from the original table:
Manufacturer | Model |
---|---|
Forte | X-Prime |
Forte | Ultraclean |
Dent-o-Fresh | EZbrush |
Brushmaster | SuperBrush |
Kobayashi | ST-60 |
Hoch | Toothmaster |
Hoch | X-Prime |
As seen below, the partial dependency is put into a new relation where the dependency can exist without being a partial dependency:
Manufacturer | Manufacturer country |
---|---|
Forte | Italy |
Dent-o-Fresh | USA |
Brushmaster | USA |
Kobayashi | Japan |
Hoch | Germany |
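The decomposition loses no information: joining the two relations on Manufacturer gives back the original rows. The following sketch illustrates this with plain Python sets standing in for relations. The variable names are assumptions made for the example.

```python
# Rows of the original relation: (manufacturer, model, manufacturer country).
original = [
    ("Forte", "X-Prime", "Italy"),
    ("Forte", "Ultraclean", "Italy"),
    ("Dent-o-Fresh", "EZbrush", "USA"),
    ("Brushmaster", "SuperBrush", "USA"),
    ("Kobayashi", "ST-60", "Japan"),
    ("Hoch", "Toothmaster", "Germany"),
    ("Hoch", "X-Prime", "Germany"),
]

# Projection onto {Manufacturer, Model}; a set drops duplicate rows.
models = {(maker, model) for maker, model, _ in original}

# Projection onto {Manufacturer, Manufacturer country}.
countries = {(maker, country) for maker, _, country in original}

# Natural join of the two projections on Manufacturer.
rejoined = {
    (maker, model, country)
    for maker, model in models
    for country_maker, country in countries
    if maker == country_maker
}

print(len(models), len(countries))  # 7 5
print(rejoined == set(original))    # True
```

Each manufacturer's country is now stored once instead of once per model, which is the redundancy that 2NF removes.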
It may now be desirable to add a numeric identifier to each row of the "Electric toothbrush models" table (the table with the columns {Manufacturer, Model}) to ensure it remains in 1NF. Alternatively, the table may be left as is: each row holds a unique combination of values, so the columns {Manufacturer, Model} can serve as the candidate key of the table and be used as its primary key.
Database normalization is the process of structuring a relational database in accordance with a series of so-called normal forms in order to reduce data redundancy and improve data integrity. It was first proposed by British computer scientist Edgar F. Codd as part of his relational model.
A relational database (RDB) is a database based on the relational model of data, as proposed by E. F. Codd in 1970.
The relational model (RM) is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data is represented in terms of tuples, grouped into relations. A database organized in terms of the relational model is a relational database.
In database theory, relational algebra is a theory that uses algebraic structures for modeling data and defining queries on it with well founded semantics. The theory was introduced by Edgar F. Codd.
First normal form (1NF) is a property of a relation in a relational database. A relation is in first normal form if and only if no attribute domain has relations as elements. More informally, no table column can have tables as values. Database normalization is the process of representing a database in terms of relations in standard normal forms, where first normal form is a minimal requirement. SQL-92 does not support creating or using table-valued columns, which means that using only the "traditional relational database features" most relational databases will be in first normal form by necessity. Database systems which do not require first normal form are often called NoSQL systems. Newer SQL standards like SQL:1999 have started to allow so-called non-atomic types, which include composite types. Even newer versions like SQL:2016 allow JSON.
Third normal form (3NF) is a database schema design approach for relational databases which uses normalizing principles to reduce the duplication of data, avoid data anomalies, ensure referential integrity, and simplify data management. It was defined in 1971 by Edgar F. Codd, an English computer scientist who invented the relational model for database management.
Fourth normal form (4NF) is a normal form used in database normalization. Introduced by Ronald Fagin in 1977, 4NF is the next level of normalization after Boyce–Codd normal form (BCNF). Whereas the second, third, and Boyce–Codd normal forms are concerned with functional dependencies, 4NF is concerned with a more general type of dependency known as a multivalued dependency. A table is in 4NF if and only if, for every one of its non-trivial multivalued dependencies X ↠ Y, X is a superkey; that is, X is either a candidate key or a superset thereof.
In relational database theory, a functional dependency is the following constraint between two attribute sets in a relation: given a relation R and attribute sets X and Y of R, X is said to functionally determine Y if each X value is associated with precisely one Y value. R is then said to satisfy the functional dependency X → Y. Equivalently, the projection of R onto X and Y is a function, that is, Y is a function of X. In simple words, if the values for the X attributes are known (say they are x), then the values for the Y attributes corresponding to x can be determined by looking them up in any tuple of R containing x. Customarily X is called the determinant set and Y the dependent set. A functional dependency X → Y is called trivial if Y is a subset of X.
A candidate key, or simply a key, of a relational database is any set of columns that have a unique combination of values in each row, with the additional constraint that removing any column could produce duplicate combinations of values.
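Both parts of this definition, uniqueness and minimality, can be tested against a concrete set of rows. The sketch below is illustrative only: the helper names `is_unique` and `is_candidate_key` are hypothetical, and a check against sample rows says nothing about rows that may be inserted later, so it cannot replace knowledge of the schema.

```python
from itertools import combinations


def is_unique(rows, cols):
    """Return True if no two rows share the same values in cols."""
    projected = [tuple(row[c] for c in cols) for row in rows]
    return len(set(projected)) == len(projected)


def is_candidate_key(rows, cols):
    """Return True if cols is unique in rows and no column can be removed."""
    cols = tuple(cols)
    if not is_unique(rows, cols):
        return False
    # Checking the subsets with one column removed is enough: if any smaller
    # subset were unique, every superset of it would be unique as well.
    return all(
        not is_unique(rows, smaller)
        for smaller in combinations(cols, len(cols) - 1)
    )


# Hypothetical sample rows for illustration.
rows = [
    {"Manufacturer": "Forte", "Model": "X-Prime"},
    {"Manufacturer": "Forte", "Model": "Ultraclean"},
    {"Manufacturer": "Hoch", "Model": "X-Prime"},
]

print(is_candidate_key(rows, ["Manufacturer", "Model"]))  # True
print(is_candidate_key(rows, ["Manufacturer"]))           # False
```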
Database design is the organization of data according to a database model. The designer determines what data must be stored and how the data elements interrelate. With this information, they can begin to fit the data to the database model. A database management system manages the data accordingly.
In the relational data model a superkey is any set of attributes that uniquely identifies each tuple of a relation. Because superkey values are unique, tuples with the same superkey value must also have the same non-key attribute values. That is, non-key attributes are functionally dependent on the superkey.
Boyce–Codd normal form is a normal form used in database normalization. It is a slightly stricter version of the third normal form (3NF). By using BCNF, a database will remove all redundancies based on functional dependencies.
Fifth normal form (5NF), also known as projection–join normal form (PJ/NF), is a level of database normalization designed to remove redundancy in relational databases recording multi-valued facts by isolating semantically related multiple relationships. A table is said to be in the 5NF if and only if every non-trivial join dependency in that table is implied by the candidate keys. It is the final normal form as far as removing redundancy is concerned.
Sixth normal form (6NF) is a normal form used in relational database normalization which extends the relational algebra and generalizes relational operators to support interval data, which can be useful in temporal databases.
A transitive dependency is an indirect dependency relationship between software components. This kind of dependency is held by virtue of a transitive relation from a component that the software depends on directly.
Domain-key normal form is a normal form used in database normalization which requires that the database contains no constraints other than domain constraints and key constraints.
Relational Model/Tasmania (RM/T) was published by Edgar F. Codd in 1979 and is the name given to a number of extensions to his original relational model (RM) published in 1970. The overall goal of the RM/T was to define some fundamental semantic units, at "atomic" and "molecular" levels, for data modelling. Codd writes: "the result is a model with a richer variety of objects than the original relational model, additional insert-update-delete rules and some additional operators that make the algebra more powerful."
A database model is a type of data model that determines the logical structure of a database. It fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.
Elementary key normal form (EKNF) is a subtle enhancement on third normal form, thus EKNF tables are in 3NF by definition. This happens when there is more than one unique compound key and they overlap. Such cases can cause redundant information in the overlapping column(s).
In database normalization, unnormalized form (UNF or 0NF), also known as an unnormalized relation or non-first normal form (N1NF or NF2), is a database data model (organization of data in a database) which does not meet any of the conditions of database normalization defined by the relational model. Database systems which support unnormalized data are sometimes called non-relational or NoSQL databases. In the relational model, unnormalized relations can be considered the starting point for a process of normalization.