Centralized database

Last updated March 26, 2024

A centralized database (sometimes abbreviated CDB) is a database that is located, stored, and maintained in a single location. This location is most often a central computer or database system, for example a desktop or server computer, or a mainframe. In most cases, a centralized database is used by an organization (e.g. a business) or an institution (e.g. a university). Users access a centralized database through a computer network that connects them to the central computer, which in turn maintains the database itself.^[1]^[2]

Historical context

The need for databases arose in the 1960s with the invention of direct-access storage, which allowed users to access records directly. Previously, computer systems were tape-based, meaning records could only be accessed sequentially.^[3] Organizations quickly adopted databases for storage and retrieval of data. The traditional approach to storing data was to use a centralized database, which users would query from various points over a network.^[1]

One example is the Australian Department of Defence, which centralized its databases in the mid-1970s.^[3]

Advantages

Centralized databases have several advantages over other types of databases. Some of them are listed below:

Data integrity is maximized and data redundancy is minimized, because storing all the data in one place means that a given set of data has only one primary record. This helps keep the data as accurate and consistent as possible and enhances data reliability.^[4]
The central host computer can be more easily protected from unauthorized access.^[4]
Data portability and database administration are generally easier.
Data kept in the same location is easier to change, reorganize, mirror, or analyze.
Transactions can more easily comply with the ACID properties (atomicity, consistency, isolation, durability).^[5]
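The last point can be illustrated with a minimal sketch. It assumes a single SQLite database opened through Python's standard sqlite3 module as a stand-in for a centralized database; the accounts table, the names, and the balances are hypothetical and chosen only for illustration. Because all the data sits behind one engine, a failed step can roll back the whole transaction without coordinating with any other site.

```python
import sqlite3


def make_db():
    """Create a single in-memory database (hypothetical schema, for illustration only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE accounts ("
        "name TEXT PRIMARY KEY, "
        "balance INTEGER NOT NULL CHECK (balance >= 0))"
    )
    conn.executemany(
        "INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)]
    )
    conn.commit()
    return conn


def transfer(conn, src, dst, amount):
    """Move amount from src to dst as one atomic transaction.

    Returns True if the transfer committed, False if it was rolled back.
    """
    try:
        # The connection context manager commits on success and rolls back
        # if an exception escapes the block.
        with conn:
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?",
                (amount, dst),
            )
            # Violates the CHECK constraint if src cannot cover the amount,
            # which also undoes the credit made just above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return all balances as a dict keyed by account name."""
    return dict(conn.execute("SELECT name, balance FROM accounts ORDER BY name"))
```

In this sketch, calling transfer(conn, "alice", "bob", 500) on a fresh database returns False and leaves both balances unchanged, even though the credit to the destination account was issued before the failing debit. A real deployment would use a server-based DBMS rather than an in-memory file, but the atomicity guarantee is the same idea.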

Disadvantages

Centralized databases also have a number of limitations, such as those described below:

Access speed is limited by network speed.^[4]
The central computer is a single point of failure: if it experiences downtime, users cannot access any data.
If there is no fault-tolerant setup and a hardware failure occurs, all the data in the database can be lost.
If someone gains unauthorized access to the central computer, all of the data can be compromised.
Scaling is difficult, as the central computer would need to be replaced to scale up.^[6]

Centralized databases vs. Distributed databases

The underlying idea of a centralized database is that it should be able to receive, process, and complete, by itself, every request that the system must handle. There is only one database file, kept at a single location on a given network.

A distributed database, however, is a database in which the information is stored across multiple physical locations.^[7] Distributed databases are divided into two groups: homogeneous and heterogeneous. A distributed database relies on replication and duplication among its constituent databases to keep its records up to date. It is composed of multiple database files, all controlled by a central DBMS.

The main differences between centralized and distributed databases arise from their basic characteristics. Differences include but are not limited to:

Centralized databases store data on a single computer system tied to one physical or geographical location. Distributed databases, however, rely on a central DBMS that manages all of their storage devices remotely, since the devices do not need to be kept in the same physical or geographical location.
As outlined above, centralized databases are easier to keep up to date than distributed databases, because distributed databases require additional (often manual) work to keep the stored data current, to avoid data redundancy, and to improve overall performance.^[8]
If data is lost in a centralized system, recovering it is much harder. If data is lost in a distributed system, recovering it is usually easier, because a copy of the data typically exists at a different location of the database.
Designing a centralized database is generally much less complex than designing a distributed database, as distributed database systems are based on a hierarchical structure.
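The recovery difference described above can be sketched with a toy model. This is an illustration under stated assumptions, not how any particular DBMS works: plain Python dicts stand in for storage sites, the class and method names (CentralizedStore, ReplicatedStore, lose_site, repair_site) are hypothetical, and the distributed variant assumes full replication, meaning every write is copied to every site.

```python
class CentralizedStore:
    """All records live at one site (a single dict stands in for the database)."""

    def __init__(self):
        self.site = {}

    def put(self, key, value):
        self.site[key] = value

    def get(self, key):
        return self.site.get(key)

    def lose_site(self):
        # Simulate a hardware failure with no backup: everything is gone.
        self.site.clear()


class ReplicatedStore:
    """Every write is copied to every site (full replication is assumed)."""

    def __init__(self, n_sites=3):
        self.sites = [{} for _ in range(n_sites)]

    def put(self, key, value):
        for site in self.sites:
            site[key] = value

    def get(self, key):
        # Read from the first site that still holds the key.
        for site in self.sites:
            if key in site:
                return site[key]
        return None

    def lose_site(self, index):
        # Simulate the failure of one site only.
        self.sites[index].clear()

    def repair_site(self, index):
        """Rebuild a failed site from any surviving non-empty copy."""
        for i, site in enumerate(self.sites):
            if i != index and site:
                self.sites[index] = dict(site)
                return True
        return False
```

After lose_site, the centralized store can no longer answer reads, while the replicated store still serves the record from a surviving site and can rebuild the failed one with repair_site. The sketch deliberately ignores the cost side of the comparison: keeping the copies consistent is the extra work that makes distributed databases harder to maintain.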

References

[1] Turban, Efraim; Pollard, Carol; Wood, Gregory R. (2021). Information technology for management: driving digital transformation to increase local and global performance, growth and sustainability (Twelfth ed.). p. 71. ISBN 978-1-119-70290-0. OCLC 1333952841.
[2] Neelankavil, James P. (2007). International business research. Armonk, N.Y.: M.E. Sharpe. pp. 96–97. ISBN 978-1-317-42545-8. OCLC 647515744.
[3] Lake, Peter; Crowther, Paul (2013). Concise guide to databases: a practical introduction. ISBN 978-1-4471-5601-7. OCLC 868889675.
[4] Sumathi, S.; Esakkirajan, S. (2007). Fundamentals of relational database management systems. Berlin: Springer. ISBN 978-3-540-48399-1. OCLC 184984668.
[5] Iacob, Nicoleta Magdalena; Moise, Mirela Liliana (December 2015). "Centralized vs. Distributed Databases. Case Study" (PDF). Academic Journal of Economic Studies. 1 (4). Archived from the original (PDF) on 2022-11-23. Retrieved 2022-11-23.
[6] Silberschatz, Abraham; Korth, Henry F.; Sudarshan, S. (2011). Database system concepts (Sixth ed.). New York: McGraw-Hill. ISBN 978-0-07-352332-3. OCLC 436031093.
[7] "Wikispaces".
[8] "Q. What are differences in Centralized and Distributed Database Systems? List the relative advantages of data distribution? - Solved Assignments". Archived from the original on 2023-05-02. Retrieved 2014-10-29.

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.