Automatic Storage Management

Automatic Storage Management (ASM) is a feature provided by Oracle Corporation within the Oracle Database from release Oracle 10g (revision 1) onwards. ASM aims to simplify the management of database datafiles, control files and log files. To do so, it provides tools to manage file systems and volumes directly inside the database, allowing database administrators (DBAs) to control volumes and disks with familiar SQL statements in standard Oracle environments. Thus DBAs do not need extra skills in specific file systems or volume managers (which usually operate at the level of the operating system).

Features

Architecture overview

ASM creates extents out of datafiles, log files, system files, control files and other database structures. The system then spreads these extents across all disks in a "diskgroup". One can think of a diskgroup in ASM as a Logical Volume Manager volume group, with an ASM file corresponding to a logical volume. In addition to the existing Oracle background processes, ASM introduces two new ones: OSMB and RBAL. OSMB opens and creates disks in a diskgroup. RBAL moves data between disks in a diskgroup.
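The idea of spreading a file's extents across every disk in a diskgroup can be illustrated with a toy model. The sketch below assumes a plain round-robin assignment, which is a simplification chosen for illustration and not a description of ASM's actual allocation algorithm; the function name `spread_extents` and the disk names are hypothetical.

```python
from collections import defaultdict


def spread_extents(num_extents, disks):
    """Assign extent numbers to disks in round-robin order (toy model only)."""
    if not disks:
        raise ValueError("a disk group needs at least one disk")
    layout = defaultdict(list)
    for extent in range(num_extents):
        # Extent i goes to disk i mod N, so every disk gets a near-equal share.
        layout[disks[extent % len(disks)]].append(extent)
    return dict(layout)


# Hypothetical diskgroup of three disks holding one file of ten extents.
layout = spread_extents(10, ["disk0", "disk1", "disk2"])
for disk, extents in layout.items():
    print(disk, extents)
```

In this model no disk holds more than one extent above any other, which is the balancing property the paragraph describes.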

Implementation and usage

Automatic Storage Management (ASM) simplifies administration of Oracle-related files by allowing the administrator to reference disk groups, which ASM manages, rather than individual disks and files. ASM extends the Oracle Managed Files (OMF) functionality [1] and adds striping and mirroring to provide balanced and secure storage. DBAs can use the ASM functionality in combination with existing raw and cooked file systems, along with OMF and manually managed files.

An ASM instance controls the ASM functionality. It is not a full database instance: it provides just the memory structures, and as such is very small and lightweight.

The main components of ASM are disk groups, each of which comprises several physical disks controlled as a single unit. The physical disks are known as ASM disks, while the files that reside on the disks are known as ASM files. ASM controls the locations and names of the files, but the DBA can define user-friendly aliases and directory structures for ease of reference.

The level of redundancy and the granularity of the striping can be controlled using templates. Oracle Corporation provides default templates for each file-type stored by ASM, but additional templates can be defined as needed.
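The relationship between default templates and additional, DBA-defined templates can be sketched as a small lookup. Everything in this sketch is an assumption made for illustration: the `Template` class, the `resolve_template` helper, the file-type keys and the attribute values are hypothetical and are not Oracle's shipped defaults.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Template:
    redundancy: str  # level of redundancy applied to the file
    striping: str    # granularity of the striping


# Hypothetical per-file-type defaults (illustrative values, not Oracle's).
DEFAULT_TEMPLATES = {
    "datafile": Template(redundancy="mirror", striping="coarse"),
    "controlfile": Template(redundancy="high", striping="fine"),
}


def resolve_template(file_type, template_name=None, custom_templates=None):
    """Use the named custom template if one is given, else the type's default."""
    if template_name is not None:
        return (custom_templates or {})[template_name]
    return DEFAULT_TEMPLATES[file_type]


# A DBA-defined template overrides the default for one particular file.
custom = {"scratch": Template(redundancy="unprotected", striping="coarse")}
print(resolve_template("datafile"))
print(resolve_template("datafile", "scratch", custom))
```

The point of the sketch is only the precedence rule: a file gets the default template for its type unless an additional template is named explicitly.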

Failure groups are defined within a disk group to support the required level of redundancy. For two-way mirroring, a disk group might contain two failure groups, in which case individual files are written to two locations.
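A minimal sketch of the two-location rule follows. It assumes a simple rotation over failure groups, which is not ASM's real placement logic; the function `place_mirrored_extent` and the group and disk names are hypothetical. The property it demonstrates is that the copies of one extent never share a failure group.

```python
def place_mirrored_extent(extent, failure_groups, copies=2):
    """Return one (group, disk) pair per copy, each in a different failure group."""
    names = sorted(failure_groups)
    if copies > len(names):
        raise ValueError("need at least as many failure groups as copies")
    placement = []
    for i in range(copies):
        # Consecutive offsets modulo the group count never repeat a group
        # as long as copies <= number of groups.
        group = names[(extent + i) % len(names)]
        disks = failure_groups[group]
        placement.append((group, disks[extent % len(disks)]))
    return placement


# Hypothetical disk group with two failure groups of two disks each.
groups = {"fg_a": ["a1", "a2"], "fg_b": ["b1", "b2"]}
placements = {e: place_mirrored_extent(e, groups) for e in range(4)}
for extent, copies in placements.items():
    print(extent, copies)
```

Losing every disk in one failure group still leaves one copy of each extent in the other, which is the reason failure groups are defined in the first place.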

Oracle ASM Dynamic Volume Manager provides the foundation for the ASM Cluster File System (ACFS). [2]

In summary, ASM provides the following functionality:

Redundancy

One can configure ASM diskgroups to have no redundancy (external), two-way mirroring (normal), or three-way mirroring (high). For normal and high redundancy, good practice suggests placing failure groups on different controllers, for both performance and fault tolerance.
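The effect of each redundancy level on usable space can be estimated with simple arithmetic: each extent is stored once, twice, or three times. The sketch below is a rough approximation under that assumption only; it ignores metadata overhead and any free space kept in reserve for re-mirroring after a disk failure, and the helper name `usable_capacity_gb` is hypothetical.

```python
# Number of copies ASM keeps of each extent at each redundancy level.
COPIES = {"external": 1, "normal": 2, "high": 3}


def usable_capacity_gb(raw_gb, redundancy):
    """Approximate usable space: raw capacity divided by the number of copies."""
    return raw_gb / COPIES[redundancy]


# Hypothetical diskgroup with 1200 GB of raw disk space.
for level in COPIES:
    print(level, usable_capacity_gb(1200, level))
```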

References

  1. "Database Administrator's Guide".
  2. Gopalakrishnan, K. (10 August 2011). Oracle Database 11g Oracle Real Application Clusters Handbook. Oracle Press (2 ed.). McGraw Hill Professional (published 2011). ISBN 9780071752626. Retrieved 2015-01-05. "Oracle ASM Dynamic Volume Manager is the foundation for ASM Cluster File System (ACFS). ACFS is a general-purpose cluster file system and supports non-Oracle applications."