Database administration and automation

Last updated

Database administration is the function of managing and maintaining database management systems (DBMS) software. Mainstream DBMS software such as Oracle, IBM DB2 and Microsoft SQL Server need ongoing management. As such, corporations that use DBMS software often hire specialized information technology personnel called Database Administrators or DBAs.

Microsoft SQL Server is a relational database management system developed by Microsoft. As a database server, it is a software product with the primary function of storing and retrieving data as requested by other software applications—which may run either on the same computer or on another computer across a network.

Information technology (IT) is the use of computers to store, retrieve, transmit, and manipulate data, or information, often in the context of a business or other enterprise. IT is considered to be a subset of information and communications technology (ICT). An information technology system is generally an information system, a communications system or, more specifically speaking, a computer system – including all hardware, software and peripheral equipment – operated by a limited group of users.

Database administrators (DBAs) use specialized software to store and organize data.

Responsibilities

Database design is the organisation of data according to a database model. The designer determines what data must be stored and how the data elements interrelate. With this information, they can begin to fit the data to the database model.

Database tuning describes a group of activities used to optimize and homogenize the performance of a database. It usually overlaps with query tuning, but refers to design of the database files, selection of the database management system (DBMS) application, and configuration of the database's environment.

Types

There are three types of DBAs:

  1. Systems DBAs (also referred to as physical DBAs, operations DBAs or production Support DBAs): focus on the physical aspects of database administration such as DBMS installation, configuration, patching, upgrades, backups, restores, refreshes, performance optimization, maintenance and disaster recovery.
  2. Development DBAs: focus on the logical and development aspects of database administration such as data model design and maintenance, DDL (data definition language) generation, SQL writing and tuning, coding stored procedures, collaborating with developers to help choose the most appropriate DBMS feature/functionality and other pre-production activities.
  3. Application DBAs: usually found in organizations that have purchased 3rd party application software such as ERP (enterprise resource planning) and CRM (customer relationship management) systems. Examples of such application software includes Oracle Applications, Siebel and PeopleSoft (both now part of Oracle Corp.) and SAP. Application DBAs straddle the fence between the DBMS and the application software and are responsible for ensuring that the application is fully optimized for the database and vice versa. They usually manage all the application components that interact with the database and carry out activities such as application installation and patching, application upgrades, database cloning, building and running data cleanup routines, data load process management, etc.

While individuals usually specialize in one type of database administration, in smaller organizations, it is not uncommon to find a single individual or group performing more than one type of database administration.

Nature of database administration

The degree to which the administration of a database is automated dictates the skills and personnel required to manage databases. On one end of the spectrum, a system with minimal automation will require significant experienced resources to manage; perhaps 5-10 databases per DBA. Alternatively an organization might choose to automate a significant amount of the work that could be done manually therefore reducing the skills required to perform tasks. As automation increases, the personnel needs of the organization splits into highly skilled workers to create and manage the automation and a group of lower skilled "line" DBAs who simply execute the automation.

A skilled worker is any worker who has special skill, training, knowledge, and ability in their work. A skilled worker may have attended a college, university or technical school. Or, a skilled worker may have learned their skills on the job. Examples of skilled labor include engineers, software development, paramedics, police officers, soldiers, physicians, crane operators, truck drivers, machinist, drafters, plumbers, craftsmen, cooks and accountants. These workers can be either blue-collar or white-collar workers, with varied levels of training or education.

Database administration work is complex, repetitive, time-consuming and requires significant training. Since databases hold valuable and mission-critical data, companies usually look for candidates with multiple years of experience. Database administration often requires DBAs to put in work during off-hours (for example, for planned after hours downtime, in the event of a database-related outage or if performance has been severely degraded). DBAs are commonly well compensated for the long hours.

One key skill required and often overlooked when selecting a DBA is database recovery (a part of disaster recovery). It is not a case of “if” but a case of “when” a database suffers a failure, ranging from a simple failure to a full catastrophic failure. The failure may be data corruption, media failure, or user induced errors. In either situation the DBA must have the skills to recover the database to a given point in time to prevent a loss of data.

Database administration tools

Often, the DBMS software comes with certain tools to help DBAs manage the DBMS. Such tools are called native tools. For example, Microsoft SQL Server comes with SQL Server Management Studio and Oracle has tools such as SQL*Plus and Oracle Enterprise Manager/Grid Control. In addition, 3rd parties such as BMC, Quest Software, Embarcadero Technologies, and SQL Maestro Group offer GUI tools to monitor the DBMS and help DBAs carry out certain functions inside the database more easily.

Quest Software Privately held software company based in California

Quest Software, also known as Quest, is a software company headquartered in Aliso Viejo, California with 53 offices in 24 countries. Founded in 1987.

Embarcadero Technologies, Inc. is an American computer software company that develops, manufactures, licenses, and supports products and services related to software through several product divisions. It was founded in 1993, went public in 2000, and private in 2007, and became a division of Idera Software in 2015.

Another kind of database software exists to manage the provisioning of new databases and the management of existing databases and their related resources. The process of creating a new database can consist of hundreds or thousands of unique steps from satisfying prerequisites to configuring backups where each step must be successful before the next can start. A human cannot be expected to complete this procedure in the same exact way time after time - exactly the goal when multiple databases exist. As the number of DBAs grows, without automation the number of unique configurations frequently grows to be costly/difficult to support. All of these complicated procedures can be modeled by the best DBAs into database automation software and executed by the standard DBAs. Software has been created specifically to improve the reliability and repeatability of these procedures such as Stratavia's Data Palette and GridApp Systems Clarity.

Stratavia, formerly known as ExtraQuest, was a software company that specialized in enterprise Database and Data Center Automation, and private cloud computing enablement. Stratavia was founded by Venkat Devraj and Rainier Luistro in 2001.

GridApp Systems, Inc. was a database automation software company. It was purchased by BMC Software in December, 2010.

The impact of IT and cloud automation

Automated Database operations has grown since 2009, following Amazon Web Services introduction of AWS RDS, providing automated and managed database as a service. Microsoft Azure launched a similar automated database as a service in 2010, with SQL Azure, providing automated backups, with geo-replication and high availability. The introduction of docker (software) containers has enhanced support for fast delivery of containerized database instances, and both Amazon Web Serices and Microsoft Azure have enhanced automated support for containers in their respective services.

Amazon Web Services subsidiary of Amazon that provides on-demand cloud computing platforms on a metered pay-as-you-go basis

Amazon Web Services (AWS) is a subsidiary of Amazon that provides on-demand cloud computing platforms to individuals, companies and governments, on a metered pay-as-you-go basis. In aggregate, these cloud computing web services provide a set of primitive, abstract technical infrastructure and distributed computing building blocks and tools. One of these services is Amazon Elastic Compute Cloud, which allows users to have at their disposal a virtual cluster of computers, available all the time, through the Internet. AWS's version of virtual computers emulate most of the attributes of a real computer including hardware ; a choice of operating systems; networking; and pre-loaded application software such as web servers, databases, CRM, etc.

Microsoft Azure is a cloud computing service created by Microsoft for building, testing, deploying, and managing applications and services through Microsoft-managed data centers. It provides software as a service (SaaS), platform as a service (PaaS) and infrastructure as a service (IaaS) and supports many different programming languages, tools and frameworks, including both Microsoft-specific and third-party software and systems.

Docker is a computer program that performs operating-system-level virtualization. It was first released in 2013 and is developed by Docker, Inc.

Third party support for database container images has grown, including MongoDB, Postgres, MySQL from Oracle, and Microsoft SQL Server from Microsoft, and from independent port's of docker (software) from Windocks [1]

Newer technologies such as Stratavia's Data Palette suite and GridApp Systems Clarity have begun to increase the automation of databases causing the reduction of database related tasks. However at best this only reduces the amount of mundane, repetitive activities and does not eliminate the need for DBAs. The intention of DBA automation is to enable DBAs to focus on more proactive activities around database architecture, deployment, performance and service level management.

Every database requires a database owner account that can perform all schema management operations. This account is specific to the database and cannot log into Data Director. You can add database owner accounts after database creation. Data Director users must log in with their database-specific credentials to view the database, its entities, and its data or to perform database management tasks. Database administrators and application developers can manage databases only if they have appropriate permissions and roles granted to them by the organization administrator. The permissions and roles must be granted on the database group or on the database, and they only apply within the organization in which they are granted.

Learning database administration

There are several education institutes that offer professional courses, including late-night programs, to allow candidates to learn database administration. Also, DBMS vendors such as Oracle, Microsoft and IBM offer certification programs to help companies to hire qualified DBA practitioners. College degree in Computer Science or related field is helpful but not necessarily a prerequisite.

See also

Related Research Articles

MySQL SQL database engine software

MySQL is an open-source relational database management system (RDBMS). Its name is a combination of "My", the name of co-founder Michael Widenius's daughter, and "SQL", the abbreviation for Structured Query Language.

A distributed database is a database in which not all storage devices are attached to a common processor. It may be stored in multiple computers, located in the same physical location; or may be dispersed over a network of interconnected computers. Unlike parallel systems, in which the processors are tightly coupled and constitute a single database system, a distributed database system consists of loosely coupled sites that share no physical components.

SAP ASE , originally known as Sybase SQL Server, and also commonly known as Sybase DB or Sybase ASE, is a relational model database server developed by Sybase Corporation, which later became part of SAP AG. ASE is predominantly used on the Unix platform, but is also available for Microsoft Windows.

Oracle Database is a proprietary multi-model database management system produced and marketed by Oracle Corporation.

A data dictionary, or metadata repository, as defined in the IBM Dictionary of Computing, is a "centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format". Oracle defines it as a collection of tables with metadata. The term can have one of several closely related meanings pertaining to databases and database management systems (DBMS):

Dataphor is an open-source truly-relational database management system (RDBMS) and its accompanying user interface technologies, which together are designed to provide highly declarative software application development. The Dataphor Server has its own storage engine or it can be a virtual, or federated, DBMS, meaning that it can utilize other database engines for storage.

The object-relational impedance mismatch is a set of conceptual and technical difficulties that are often encountered when a relational database management system (RDBMS) is being served by an application program written in an object-oriented programming language or style, particularly because objects or class definitions must be mapped to database tables defined by a relational schema.

ArcSDE is a server-software sub-system that aims to enable the usage of Relational Database Management Systems for spatial data. The spatial data may then be used as part of a geodatabase.

IBM Spectrum Protect is a data protection platform that gives enterprises a single point of control and administration for backup and recovery. It is the flagship product in the IBM Spectrum Protect family.

Toad (software) database management software

Toad is a database management toolset from Quest Software that database developers, database administrators, and data analysts use to manage both relational and non-relational databases using SQL. There are Toad products for developers and DBAs, which run on Oracle, SQL Server, IBM DB2, SAP and MySQL, as well as, a Toad product for data preparation, which supports most data platforms. Toad solutions enable data professionals to automate processes, minimize risks, and cut project delivery timelines.

Database security concerns the use of a broad range of information security controls to protect databases against compromises of their confidentiality, integrity and availability. It involves various types or categories of controls, such as technical, procedural/administrative and physical. Database security is a specialist topic within the broader realms of computer security, information security and risk management.

Virtuoso Universal Server

Virtuoso Universal Server is a middleware and database engine hybrid that combines the functionality of a traditional Relational database management system (RDBMS), Object-relational database (ORDBMS), virtual database, RDF, XML, free-text, web application server and file server functionality in a single system. Rather than have dedicated servers for each of the aforementioned functionality realms, Virtuoso is a "universal server"; it enables a single multithreaded server process that implements multiple protocols. The free and open source edition of Virtuoso Universal Server is also known as OpenLink Virtuoso. The software has been developed by OpenLink Software with Kingsley Uyi Idehen and Orri Erling as the chief software architects.

EnterpriseDB company

EnterpriseDB (EDB), a privately held company based in Massachusetts, provides software and services based on the open-source database PostgreSQL. EDB develops and integrates performance, security, and manageability enhancements into PostgreSQL to support enterprise-class workloads for its database, EDB Postgres Advanced Server. EDB has also developed database compatibility for Oracle to facilitate the migration of workloads from Oracle to EDB Postgres and to support the operation of many Oracle workloads on EDB Postgres.

An embedded database system is a database management system (DBMS) which is tightly integrated with an application software that requires access to stored data, such that the database system is "hidden" from the application’s end-user and requires little or no ongoing maintenance. It is actually a broad technology category that includes

Vinzant Software is a privately held company that is based in Hobart, IN. Vinzant Software develops and markets enterprise job scheduling solutions for most platforms including Windows, Unix, Linux, AS400 and MPE/ix. It was founded in 1988 by David Vinzant and has solely focused on job scheduling since 1995.

Pervasive PSQL is an ACID-compliant database management system (DBMS) developed by Pervasive Software. It is optimized for embedding in applications and used in several different types of packaged software applications offered by independent software vendors (ISVs) and original equipment manufacturers (OEMs). It is available for software as a service (SaaS) deployment due to a file-based architecture enabling partitioning of data for multitenancy needs.

ALTIBASE is a hybrid database, relational open source database management system manufactured by Altibase Corporation. The software comes with a hybrid architecture which allows it to access both memory-resident and disk-resident tables using single interface. It supports both synchronous and asynchronous replication and offers real-time ACID compliance. Support is also offered for a variety of SQL standards and programming languages. Other important capabilities include data import and export, data encryption for security, multiple data access command sets, materialized view and temporary tables, and others.

The following outline is provided as an overview of and topical guide to MySQL:

References

This article is based on material taken from the Free On-line Dictionary of Computing prior to 1 November 2008 and incorporated under the "relicensing" terms of the GFDL, version 1.3 or later.