Zope Object Database

Last updated
Zope Object Database
Developer(s) Zope Foundation
Stable release
5.8.1 [1] / July 18, 2023;9 days ago (2023-07-18)
Repository github.com/zopefoundation/ZODB
Written in Python
Operating system Cross-platform
Type Object database
License Zope Public License
Website www.zodb.org

The Zope Object Database (ZODB) is an object-oriented database for transparently and persistently storing Python objects. It is included as part of the Zope web application server, but can also be used independently of Zope.

Contents

Features of the ZODB include: transactions, history/undo, transparently pluggable storage, built-in caching, multiversion concurrency control (MVCC), and scalability across a network (using ZEO ).

History

Implementation

Basics

ZODB stores Python objects using an extended version of Python's built-in object persistence (pickle). A ZODB database has a single root object (normally a dictionary), which is the only object directly made accessible by the database. All other objects stored in the database are reached through the root object. Objects referenced by an object stored in the database are automatically stored in the database as well.

ZODB supports concurrent transactions using MVCC and tracks changes to objects on a per-object basis. Only changed objects are committed. Transactions are non-destructive by default and can be reverted.

Example

For example, say we have a car described using 3 classes Car, Wheel and Screw. In Python, this could be represented the following way:

classCar:[...]classWheel:[...]classScrew:[...]myCar=Car()myCar.wheel1=Wheel()myCar.wheel2=Wheel()forwheelin(myCar.wheel1,myCar.wheel2):wheel.screws=[Screw(),Screw()]

If the variable mycar is the root of persistence, then:

zodb['mycar']=myCar

This puts all of the object instances (car, wheel, screws etc.) into storage, which can be retrieved later. If another program gets a connection to the database through the mycar object, performing:

car=zodb['myCar']

And retrieves all the objects, the pointer to the car being held in the car variable. Then at some later stage, this object can be altered with a Python code like:

car.wheel3=Wheel()car.wheel3.screws=[Screw()]

The storage is altered to reflect the change of data (after a commit is ordered).

There is no declaration of the data structure in either Python or ZODB, so new fields can be freely added to any existing object.

Storage unit

For persistence to take place, the Python Car class must be derived from the persistence. Persistent class — this class both holds the data necessary for the persistence machinery to work, such as the internal object id, state of the object, and so on, but also defines the boundary of the persistence in the following sense: every object whose class derives from Persistent is the atomic unit of storage (the whole object is copied to the storage when a field is modified).

In the example above, if Car is the only class deriving from Persistent, when wheel3 is added to car, all of the objects must be written to the storage. In contrast, if Wheel also derives from Persistent, then when carzz.wheel3 = Wheel is performed, a new record is written to the storage to hold the new value of the Car, but the existing Wheel are kept, and the new record for the Car points to the already existing Wheel record inside the storage.

The ZODB machinery doesn't chase modification down through the graph of pointers. In the example above, carzz.wheel3 = something is a modification automatically tracked down by the ZODB machinery, because carzz is of (Persistent) class Car. The ZODB machinery does this by marking the record as dirty. However, if there is a list, any change inside the list isn't noticed by the ZODB machinery, and the programmer must help by manually adding carzz._p_changed = 1, notifying ZODB that the record actually changed. Thus, to a certain extent the programmer must be aware of the working of the persistence machinery.

Atomicity

The storage unit (that is, an object whose class derives from Persistent) is also the atomicity unit. In the example above, if Cars is the only Persistent class, a thread modifies a Wheel (the Car record must be notified), and another thread modifies another Wheel inside another transaction, the second commit will fail. If Wheel is also Persistent, both Wheels can be modified independently by two different threads in two different transactions.

Class persistence

The class persistence—writing the class of a particular object into the storage—is obtained by writing a kind of "fully qualified" name of the class into each record on the disk. In Python, the name of the class involves the hierarchy of directory the source file of the class resides in. A consequence is that the source file of persisting object cannot be moved. If it is, the ZODB machinery is unable to locate the class of an object when retrieving it from the storage, resulting into a broken object.

ZEO

Zope Enterprise Objects (ZEO) is a ZODB storage implementation that allows multiple client processes to persist objects to a single ZEO server. This allows transparent scaling.

Pluggable storages

Failover technologies

Related Research Articles

In software engineering, multitier architecture is a client–server architecture in which presentation, application processing and data management functions are physically separated. The most widespread use of multitier architecture is the three-tier architecture.

A relational database is a database based on the relational model of data, as proposed by E. F. Codd in 1970. A system used to maintain relational databases is a relational database management system (RDBMS). Many relational database systems are equipped with the option of using SQL for querying and updating the database.

In computing, serialization is the process of translating a data structure or object state into a format that can be stored or transmitted and reconstructed later. When the resulting series of bits is reread according to the serialization format, it can be used to create a semantically identical clone of the original object. For many complex objects, such as those that make extensive use of references, this process is not straightforward. Serialization of object-oriented objects does not include any of their associated methods with which they were previously linked.

Zope is a family of free and open-source web application servers written in Python, and their associated online community. Zope stands for "Z Object Publishing Environment", and was the first system using the now common object publishing methodology for the Web. Zope has been called a Python killer app, an application that helped put Python in the spotlight.

Java Data Objects (JDO) is a specification of Java object persistence. One of its features is a transparency of the persistence services to the domain model. JDO persistent objects are ordinary Java programming language classes (POJOs); there is no requirement for them to implement certain interfaces or extend from special classes. JDO 1.0 was developed under the Java Community Process as JSR 12. JDO 2.0 was developed under JSR 243 and was released on May 10, 2006. JDO 2.1 was completed in Feb 2008, developed by the Apache JDO project. JDO 2.2 was released in October 2008. JDO 3.0 was released in April 2010.

<span class="mw-page-title-main">Physical schema</span> Representation of a data design

A physical data model is a representation of a data design as implemented, or intended to be implemented, in a database management system. In the lifecycle of a project it typically derives from a logical data model, though it may be reverse-engineered from a given database implementation. A complete physical data model will include all the database artifacts required to create relationships between tables or to achieve performance goals, such as indexes, constraint definitions, linking tables, partitioned tables or clusters. Analysts can usually use a physical data model to calculate storage estimates; it may include specific storage allocation details for a given database system.

<span class="mw-page-title-main">MonetDB</span>

MonetDB is an open-source column-oriented relational database management system (RDBMS) originally developed at the Centrum Wiskunde & Informatica (CWI) in the Netherlands. It is designed to provide high performance on complex queries against large databases, such as combining tables with hundreds of columns and millions of rows. MonetDB has been applied in high-performance applications for online analytical processing, data mining, geographic information system (GIS), Resource Description Framework (RDF), text retrieval and sequence alignment processing.

Multi-master replication is a method of database replication which allows data to be stored by a group of computers, and updated by any member of the group. All members are responsive to client data queries. The multi-master replication system is responsible for propagating the data modifications made by each member to the rest of the group and resolving any conflicts that might arise between concurrent changes made by different members.

A web framework (WF) or web application framework (WAF) is a software framework that is designed to support the development of web applications including web services, web resources, and web APIs. Web frameworks provide a standard way to build and deploy web applications on the World Wide Web. Web frameworks aim to automate the overhead associated with common activities performed in web development. For example, many web frameworks provide libraries for database access, templating frameworks, and session management, and they often promote code reuse. Although they often target development of dynamic web sites, they are also applicable to static websites.

Gadfly is a relational database management system written in Python. Gadfly is a collection of Python modules that provides relational database functionality entirely implemented in Python. It supports a subset of the standard RDBMS Structured Query Language (SQL).

<span class="mw-page-title-main">RDFLib</span> Python library to serialize, parse and process RDF data

RDFLib is a Python library for working with RDF, a simple yet powerful language for representing information. This library contains parsers/serializers for almost all of the known RDF serializations, such as RDF/XML, Turtle, N-Triples, & JSON-LD, many of which are now supported in their updated form. The library also contains both in-memory and persistent Graph back-ends for storing RDF information and numerous convenience functions for declaring graph namespaces, lodging SPARQL queries and so on. It is in continuous development with the most recent stable release, rdflib 6.1.1 having been released on 20 December 2021. It was originally created by Daniel Krech with the first release in November, 2002.

<span class="mw-page-title-main">Virtuoso Universal Server</span> Computer software

Virtuoso Universal Server is a middleware and database engine hybrid that combines the functionality of a traditional relational database management system (RDBMS), object–relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server and file server functionality in a single system. Rather than have dedicated servers for each of the aforementioned functionality realms, Virtuoso is a "universal server"; it enables a single multithreaded server process that implements multiple protocols. The free and open source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software with Kingsley Uyi Idehen and Orri Erling as the chief software architects.

In computer science, persistence refers to the characteristic of state of a system that outlives the process that created it. This is achieved in practice by storing the state as data in computer data storage. Programs have to transfer data to and from storage devices and have to provide mappings from the native programming-language data structures to the storage device data structures.

Objectivity/DB is a commercial object database produced by Objectivity, Inc. It allows applications to make standard C++, C#, Java, or Python objects persistent without having to convert the data objects into the rows and columns used by a relational database management system (RDBMS). Objectivity/DB supports the most popular object oriented languages plus SQL/ODBC and XML. It runs on Linux, Macintosh, UNIX and Windows platforms. All of the languages and platforms interoperate, with the Objectivity/DB kernel taking care of compiler and hardware platform differences.

An embedded database system is a database management system (DBMS) which is tightly integrated with an application software; it is embedded in the application. It is a broad technology category that includes:

ObjectDB is an object database for Java. It can be used in client-server mode and in embedded mode.

The following is provided as an overview of and topical guide to databases:

The following outline is provided as an overview of and topical guide to MySQL:

Block-level storage is a concept in cloud-hosted data persistence where cloud services emulate the behaviour of a traditional block device, such as a physical hard drive.

References

  1. "Releases · zopefoundation/ZODB". GitHub .