Weak entity

In a relational database, a weak entity is an entity that cannot be uniquely identified by its attributes alone; therefore, it must use a foreign key in conjunction with its attributes to create a primary key. The foreign key is typically a primary key of an entity it is related to.

The foreign key is an attribute of the identifying (or owner, parent, or dominant) entity set. Each element in the weak entity set must have a relationship with exactly one element in the owner entity set,[1] and therefore the relationship cannot be a many-to-many relationship.

Two entities can be associated without either being classified as weak, even if one depends on the other, as long as each has its own unique attribute.[1] For instance, a person entity can be linked to a car entity without either of them being considered a weak entity.

In entity relationship diagrams (ER diagrams), a weak entity set is indicated by a bold (or double-lined) rectangle (the entity) connected by a bold (or double-lined) arrow to a bold (or double-lined) diamond (the relationship). This type of relationship is called an identifying relationship. In IDEF1X notation, the weak entity is drawn as a box with rounded corners rather than the square-cornered box used for base tables. An identifying relationship is one in which the parent's primary key migrates to the child weak entity and becomes part of that entity's primary key.

In general (though not necessarily), the primary key of a weak entity consists only of its inherited primary key and a sequence number. There are two types of weak entities: associative entities and subtype entities. The latter represents a crucial type of normalization, in which the subtype entities inherit attributes from the super-type entity based on the value of the discriminator.

In IDEF1X, a government standard for capturing requirements, possible sub-type relationships are:

A classic example of a weak entity without a sub-type relationship is the "header/detail" pair of records found in many real-world situations such as claims, orders and invoices, where the header captures information common across all forms and the detail captures information specific to individual items.
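A detail record of this kind is usually keyed by the header's key plus a sequence number. The following is a minimal sketch of how such a key could be assigned, using Python's built-in sqlite3 module. The table name Detail, its columns, and the helper names are hypothetical and chosen only for illustration.

```python
import sqlite3


def make_detail_table():
    """Create an in-memory database with one hypothetical detail table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE Detail (
            header_id INTEGER NOT NULL,  -- key inherited from the header
            seq_no    INTEGER NOT NULL,  -- sequence number within one header
            note      TEXT,
            PRIMARY KEY (header_id, seq_no)
        )
        """
    )
    return conn


def add_detail(conn, header_id, note):
    """Insert a detail row under the next free sequence number for header_id."""
    with conn:  # commit on success, roll back on error
        (next_seq,) = conn.execute(
            "SELECT COALESCE(MAX(seq_no), 0) + 1 FROM Detail WHERE header_id = ?",
            (header_id,),
        ).fetchone()
        conn.execute(
            "INSERT INTO Detail VALUES (?, ?, ?)", (header_id, next_seq, note)
        )
    return next_seq
```

The sequence number restarts at 1 for every header, so it is unique only in combination with the inherited key. A system with concurrent writers would need its own locking or retry strategy, which this sketch does not attempt.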

The standard example of a complete subtype relationship is the party entity. Given the discriminator PARTY TYPE (which could be individual, partnership, C Corporation, Sub Chapter S Association, Association, Governmental Unit, Quasi-governmental agency) the two subtype entities are PERSON, which contains individual-specific information such as first and last name and date of birth, and ORGANIZATION, which would contain such attributes as the legal name, and organizational hierarchies such as cost centers.

When sub-type relationships are rendered in a database, the super-type becomes what is referred to as a base table. The sub-types are considered derived tables, which correspond to weak entities. Referential integrity is enforced via cascading updates and deletes.
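As a rough illustration of the base-table and derived-table arrangement, the sketch below builds the party example with Python's built-in sqlite3 module. The column names, the free-text party_type discriminator, and the helper functions are assumptions made for this example, not part of any standard.

```python
import sqlite3

# Hypothetical schema: Party is the base table, Person and Organization
# are derived tables whose primary key is the inherited party_id.
SUBTYPE_SCHEMA = """
CREATE TABLE Party (
    party_id   INTEGER PRIMARY KEY,
    party_type TEXT NOT NULL              -- discriminator
);
CREATE TABLE Person (
    party_id      INTEGER PRIMARY KEY
                  REFERENCES Party(party_id)
                  ON UPDATE CASCADE ON DELETE CASCADE,
    first_name    TEXT NOT NULL,
    last_name     TEXT NOT NULL,
    date_of_birth TEXT
);
CREATE TABLE Organization (
    party_id   INTEGER PRIMARY KEY
               REFERENCES Party(party_id)
               ON UPDATE CASCADE ON DELETE CASCADE,
    legal_name TEXT NOT NULL
);
"""

_TABLES = ("Party", "Person", "Organization")


def build_party_db():
    """Return an in-memory database with foreign-key enforcement enabled."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SUBTYPE_SCHEMA)
    return conn


def count_rows(conn, table):
    """Count the rows of one of the three known tables."""
    if table not in _TABLES:
        raise ValueError(f"unknown table: {table}")
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
```

With this schema, deleting a row from Party also removes the matching Person or Organization row, and a subtype row cannot be inserted unless its Party row already exists.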

Example

Consider a database that records customer orders, where an order is for one or more of the items that the enterprise sells. The database would contain a table identifying customers by a customer number (primary key); another identifying the products that can be sold by a product number (primary key); and it would contain a pair of tables describing orders.

[Figure: ER diagram of the weak entity example]

One of the tables could be called Orders and it would have an order number (primary key) to identify this order uniquely, and would contain a customer number (foreign key) to identify who the products are being sold to, plus other information such as the date and time when the order was placed, how it will be paid for, where it is to be shipped to, and so on.

The other table could be called OrderItem; it would be identified by a compound key consisting of both the order number (foreign key) and an item line number; with other non-primary key attributes such as the product number (foreign key) that was ordered, the quantity, the price, any discount, any special options, and so on. There may be zero, one or many OrderItem entries corresponding to an Order entry, but no OrderItem entry can exist unless the corresponding Order entry exists. (The zero OrderItem case normally only applies transiently, when the order is first entered and before the first ordered item has been recorded.)
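The pair of tables described above could be declared roughly as follows. This is a sketch using Python's built-in sqlite3 module. The table names Orders and OrderItem come from the text, while the column names and the helper functions open_db and try_insert_item are illustrative assumptions.

```python
import sqlite3

# Hypothetical column names; only the table roles come from the text.
SCHEMA = """
CREATE TABLE Customer (
    customer_no INTEGER PRIMARY KEY
);
CREATE TABLE Product (
    product_no INTEGER PRIMARY KEY
);
CREATE TABLE Orders (
    order_no    INTEGER PRIMARY KEY,
    customer_no INTEGER NOT NULL REFERENCES Customer(customer_no),
    placed_at   TEXT NOT NULL
);
-- Weak entity: identified by the owner's key plus an item line number.
CREATE TABLE OrderItem (
    order_no   INTEGER NOT NULL REFERENCES Orders(order_no),
    line_no    INTEGER NOT NULL,
    product_no INTEGER NOT NULL REFERENCES Product(product_no),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_no, line_no)
);
"""


def open_db():
    """Return an in-memory database with the schema above."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def try_insert_item(conn, order_no, line_no, product_no, quantity):
    """Return True if the row was stored, False if a constraint rejected it."""
    try:
        with conn:  # commit on success, roll back on error
            conn.execute(
                "INSERT INTO OrderItem VALUES (?, ?, ?, ?)",
                (order_no, line_no, product_no, quantity),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

An OrderItem row is rejected when its order number does not match an existing Orders row, and also when the same (order_no, line_no) pair is used twice, which is the behaviour the compound key is meant to guarantee.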

The OrderItem table stores weak entities precisely because an OrderItem has no meaning independent of the Order. Some might argue that an OrderItem does have some meaning on its own; it records that at some time not identified by the record, somebody not identified by the record ordered a certain quantity of a certain product. This information might be of some use on its own, but it is of limited use. For example, as soon as you want to find seasonal or geographical trends in the sales of the item, you need information from the related Order record.
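The dependence on the owning Order shows up as soon as a query needs the order date. The sketch below, again using Python's built-in sqlite3 module, totals the quantity sold of one product per month. The reduced schema, the ISO-8601 text format of placed_at, and the function names are assumptions made for this example.

```python
import sqlite3


def build_sales_db():
    """Return an in-memory database with a reduced Orders/OrderItem schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Orders (
            order_no  INTEGER PRIMARY KEY,
            placed_at TEXT NOT NULL      -- ISO-8601 date such as '2024-01-15'
        );
        CREATE TABLE OrderItem (
            order_no   INTEGER NOT NULL REFERENCES Orders(order_no),
            line_no    INTEGER NOT NULL,
            product_no INTEGER NOT NULL,
            quantity   INTEGER NOT NULL,
            PRIMARY KEY (order_no, line_no)
        );
        """
    )
    return conn


def quantity_by_month(conn, product_no):
    """Return (month, total quantity) pairs for one product, oldest first."""
    return conn.execute(
        """
        SELECT strftime('%Y-%m', o.placed_at) AS month, SUM(i.quantity)
        FROM OrderItem AS i
        JOIN Orders AS o ON o.order_no = i.order_no  -- the date lives on the owner
        WHERE i.product_no = ?
        GROUP BY month
        ORDER BY month
        """,
        (product_no,),
    ).fetchall()
```

The OrderItem rows alone carry no date, so the monthly grouping is only possible through the join to Orders.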

An order would not exist without a product and a person to create it, so it could be argued that an order is itself a weak entity and that the products ordered are a multivalued attribute of the order.

References

  1. Elmasri, Ramez; Navathe, Sham (2016). Fundamentals of Database Systems (Seventh ed.). Hoboken, NJ: Pearson Education. p. 79. ISBN 978-0-13-397077-7. OCLC 913842106.