Semi-structured model

Last updated

The semi-structured model is a database model where there is no separation between the data and the schema, and the amount of structure used depends on the purpose.

Database model generic structure of a database type, for instance relational

A database model is a type of data model that determines the logical structure of a database and fundamentally determines in which manner data can be stored, organized and manipulated. The most popular example of a database model is the relational model, which uses a table-based format.

Data (computing) quantities, characters, or symbols on which operations are performed by a computer

Data is any sequence of one or more symbols given meaning by specific act(s) of interpretation.

A Schema is a pictorial representation of the relationship between the database tables in the database that is created. The database schema of a database system is its structure described in a formal language supported by the database management system (DBMS). The term "schema" refers to the organization of data as a blueprint of how the database is constructed. The formal definition of a database schema is a set of formulas (sentences) called integrity constraints imposed on a database. These integrity constraints ensure compatibility between parts of the schema. All constraints are expressible in the same language. A database can be considered a structure in realization of the database language. The states of a created conceptual schema are transformed into an explicit mapping, the database schema. This describes how real-world entities are modeled in the database.

The advantages of this model are the following:

The primary trade-off being made in using a semi-structured database model is that queries cannot be made as efficiently as in a more constrained structure, such as in the relational model. Typically the records in a semi-structured database are stored with unique IDs that are referenced with pointers to their location on disk. This makes navigational or path-based queries quite efficient, but for doing searches over many records (as is typical in SQL), it is not as efficient because it has to seek around the disk following pointers.

Relational model database model

The relational model (RM) for database management is an approach to managing data using a structure and language consistent with first-order predicate logic, first described in 1969 by English computer scientist Edgar F. Codd, where all data is represented in terms of tuples, grouped into relations. A database organized in terms of the relational model is a relational database.

SQL is a domain-specific language used in programming and designed for managing data held in a relational database management system (RDBMS), or for stream processing in a relational data stream management system (RDSMS). It is particularly useful in handling structured data where there are relations between different entities/variables of the data. SQL offers two main advantages over older read/write APIs like ISAM or VSAM: first, it introduced the concept of accessing many records with one single command; and second, it eliminates the need to specify how to reach a record, e.g. with or without an index.

The Object Exchange Model (OEM) is one standard to express semi-structured data, another way is XML.

The Object Exchange Model (OEM) is a model for exchanging semi-structured data between object-oriented databases. It serves as the basic data model in numerous projects of the Stanford University Database Group, including Tsimmis, Lore, and C3.

XML Markup language developed by the W3C for encoding of data

Extensible Markup Language (XML) is a markup language that defines a set of rules for encoding documents in a format that is both human-readable and machine-readable. The W3C's XML 1.0 Specification and several other related specifications—all of them free open standards—define XML.

See also

Related Research Articles

Data model abstract model for organizing data; abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world entities

A data model is an abstract model that organizes elements of data and standardizes how they relate to one another and to properties of the real world entities. For instance, a data model may specify that the data element representing a car be composed of a number of other elements which, in turn, represent the color and size of the car and define its owner.

WinFS

WinFS was the code name for a canceled data storage and management system project based on relational databases, developed by Microsoft and first demonstrated in 2003 as an advanced storage subsystem for the Microsoft Windows operating system, designed for persistence and management of structured, semi-structured and unstructured data.

A navigational database is a type of database in which records or objects are found primarily by following references from other objects. The term was popularized by the title of Charles Bachman's 1973 Turing Award paper, The Programmer as Navigator. This paper emphasized the fact that the new disk-based database systems allowed the programmer to choose arbitrary navigational routes following relationships from record to record, contrasting this with the constraints of earlier magnetic-tape and punched card systems where data access was strictly sequential.

Data modeling (in software engineering) process of creating a data model for an information system by applying certain formal techniques

Data modeling in software engineering is the process of creating a data model for an information system by applying certain formal techniques.

In computing, the star schema is the simplest style of data mart schema and is the approach most widely used to develop data warehouses and dimensional data marts. The star schema consists of one or more fact tables referencing any number of dimension tables. The star schema is an important special case of the snowflake schema, and is more effective for handling simpler queries.

Database design is the organisation of data according to a database model. The designer determines what data must be stored and how the data elements interrelate. With this information, they can begin to fit the data to the database model.

GXL is designed to be a standard exchange format for graphs. GXL is an extensible markup language (XML) sublanguage and the syntax is given by an XML document type definition (DTD). This exchange format offers an adaptable and flexible means to support interoperability between graph-based tools.

A database index is a data structure that improves the speed of data retrieval operations on a database table at the cost of additional writes and storage space to maintain the index data structure. Indexes are used to quickly locate data without having to search every row in a database table every time a database table is accessed. Indexes can be created using one or more columns of a database table, providing the basis for both rapid random lookups and efficient access of ordered records.

Extensible Storage Engine (ESE), also known as JET Blue, is an ISAM data storage technology from Microsoft. ESE is the core of Microsoft Exchange Server, Active Directory, and Windows Search. It's also used by a number of Windows components including Windows Update client and Help and Support Center. Its purpose is to allow applications to store and retrieve data via indexed and sequential access.

The object-relational impedance mismatch is a set of conceptual and technical difficulties that are often encountered when a relational database management system (RDBMS) is being served by an application program written in an object-oriented programming language or style, particularly because objects or class definitions must be mapped to database tables defined by a relational schema.

Entity–attribute–value model (EAV) is a data model to encode, in a space-efficient manner, entities where the number of attributes that can be used to describe them is potentially vast, but the number that will actually apply to a given entity is relatively modest. Such entities correspond to the mathematical notion of a sparse matrix.

EXPRESS (data modeling language)

EXPRESS is a standard data modeling language for product data. EXPRESS is formalized in the ISO Standard for the Exchange of Product model STEP, and standardized as ISO 10303-11.

Entity Framework (EF) is an open source object-relational mapping (ORM) framework for ADO.NET. It was a part of .NET Framework, but since Entity framework version 6 it is separated from .NET framework.

A triplestore or RDF store is a purpose-built database for the storage and retrieval of triples through semantic queries. A triple is a data entity composed of subject-predicate-object, like "Bob is 35" or "Bob knows Fred".

A correlation database is a database management system (DBMS) that is data-model-independent and designed to efficiently handle unplanned, ad hoc queries in an analytical system environment.

Semi-structured data is a form of structured data that does not conform with the formal structure of data models associated with relational databases or other forms of data tables, but nonetheless contains tags or other markers to separate semantic elements and enforce hierarchies of records and fields within the data. Therefore, it is also known as self-describing structure.

Sedna is an open-source database management system that provides native storage for XML data. The distinctive design decisions employed in Sedna are (i) schema-based clustering storage strategy for XML data and (ii) memory management based on layered address space.

The following outline is provided as an overview of and topical guide to databases: