Autonomic computing

Autonomic computing (AC) refers to the self-managing characteristics of distributed computing resources, which adapt to unpredictable changes while hiding intrinsic complexity from operators and users. The initiative, started by IBM in 2001, ultimately aimed to develop computer systems capable of self-management, to overcome the rapidly growing complexity of computing systems management, and to reduce the barrier that complexity poses to further growth. [1]

Description

The AC system concept is designed to make adaptive decisions, using high-level policies. It constantly checks and optimizes its status and automatically adapts itself to changing conditions. An autonomic computing framework is composed of autonomic components (AC) interacting with each other. An AC can be modeled in terms of two main control schemes (local and global) with sensors (for self-monitoring), effectors (for self-adjustment), knowledge, and a planner/adapter for exploiting policies based on self- and environment awareness. This architecture is sometimes referred to as Monitor-Analyze-Plan-Execute (MAPE).
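
As a rough illustration of the MAPE cycle described above, the following Python sketch wires sensors, effectors, a policy, and a knowledge store into one manager. The class and method names (AutonomicManager, run_cycle, and so on) are illustrative assumptions, not part of any IBM reference architecture.

```python
# Illustrative MAPE-K sketch; class and method names are assumptions,
# not an IBM reference implementation.

class AutonomicManager:
    def __init__(self, sensors, effectors, policy):
        self.sensors = sensors      # dict: metric name -> zero-arg read function
        self.effectors = effectors  # dict: metric name -> zero-arg adjust function
        self.policy = policy        # dict: metric name -> (low, high) desired range
        self.knowledge = {}         # shared knowledge base (the "K" in MAPE-K)

    def monitor(self):
        """Sample every sensor (self-monitoring)."""
        return {name: read() for name, read in self.sensors.items()}

    def analyze(self, symptoms):
        """Flag metrics whose values violate the high-level policy."""
        return [name for name, value in symptoms.items()
                if not (self.policy[name][0] <= value <= self.policy[name][1])]

    def plan(self, violations):
        """Select an adjustment action for each violated metric."""
        return [self.effectors[name] for name in violations if name in self.effectors]

    def execute(self, actions):
        """Apply the planned adjustments (self-adjustment)."""
        for adjust in actions:
            adjust()

    def run_cycle(self):
        symptoms = self.monitor()
        self.knowledge.update(symptoms)   # retain observations for later cycles
        self.execute(self.plan(self.analyze(symptoms)))


# Example: keep CPU load below 80% by spawning an extra worker.
manager = AutonomicManager(
    sensors={"cpu_load": lambda: 0.93},
    effectors={"cpu_load": lambda: print("spawning extra worker")},
    policy={"cpu_load": (0.0, 0.8)},
)
manager.run_cycle()
```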

Driven by this vision, a variety of architectural frameworks based on "self-regulating" autonomic components have recently been proposed. A very similar trend has recently characterized significant research in the area of multi-agent systems. However, most of these approaches are typically conceived with centralized or cluster-based server architectures in mind, and they mostly address the need to reduce management costs rather than to enable complex software systems or provide innovative services. Some autonomic systems involve mobile agents interacting via loosely coupled communication mechanisms. [2]

Autonomy-oriented computation is a paradigm proposed by Jiming Liu in 2001 that uses artificial systems imitating social animals' collective behaviours to solve difficult computational problems. For example, ant colony optimization could be studied in this paradigm. [3]

Problem of growing complexity

Forecasts suggest that the number of computing devices in use will grow at 38% per year [4] and that the average complexity of each device is increasing. [4] Currently, this volume and complexity are managed by highly skilled humans, but the demand for skilled IT personnel is already outstripping supply, with labour costs exceeding equipment costs by a ratio of up to 18:1. [5] Computing systems have brought great benefits of speed and automation, but there is now an overwhelming economic need to automate their maintenance.

In a 2003 IEEE Computer article, Kephart and Chess [1] warn that the dream of interconnectivity of computing systems and devices could become the "nightmare of pervasive computing" in which architects are unable to anticipate, design and maintain the complexity of interactions. They state the essence of autonomic computing is system self-management, freeing administrators from low-level task management while delivering better system behavior.

A general problem of modern distributed computing systems is that their complexity, and in particular the complexity of their management, is becoming a significant limiting factor in their further development. Large companies and institutions are employing large-scale computer networks for communication and computation. The distributed applications running on these computer networks are diverse and deal with many tasks, ranging from internal control processes to presenting web content to customer support.

Additionally, mobile computing is pervading these networks at an increasing speed: employees need to communicate with their companies while they are not in their office. They do so by using laptops, personal digital assistants, or mobile phones with diverse forms of wireless technologies to access their companies' data.

This creates enormous complexity in the overall computer network, which is hard for human operators to control manually. Manual control is time-consuming, expensive, and error-prone. The manual effort needed to control a growing networked computer system tends to increase very quickly.

80% of such problems in infrastructure happen at the client-specific application and database layer.[citation needed] Most 'autonomic' service providers[who?] guarantee only up to the basic plumbing layer (power, hardware, operating system, network and basic database parameters).

Characteristics of autonomic systems

A possible solution could be to enable modern, networked computing systems to manage themselves without direct human intervention. The Autonomic Computing Initiative (ACI) aims at providing the foundation for autonomic systems. It is inspired by the autonomic nervous system of the human body. [6] This nervous system controls important bodily functions (e.g. respiration, heart rate, and blood pressure) without any conscious intervention.

In a self-managing autonomic system, the human operator takes on a new role: instead of controlling the system directly, he/she defines general policies and rules that guide the self-management process. For this process, IBM defined the following four types of property referred to as self-star (also called self-*, self-x, or auto-*) properties. [7]

  1. Self-configuration: Automatic configuration of components;
  2. Self-healing: Automatic discovery and correction of faults; [8]
  3. Self-optimization: Automatic monitoring and control of resources to ensure the optimal functioning with respect to the defined requirements;
  4. Self-protection: Proactive identification and protection from arbitrary attacks.

Others such as Poslad [7] and Nami and Sharifi [9] have expanded on the set of self-star as follows:

  1. Self-regulation: A system that operates to maintain some parameter, e.g., quality of service, within a preset range without external control;
  2. Self-learning: Systems use machine learning techniques such as unsupervised learning, which does not require external control;
  3. Self-awareness (also called Self-inspection and Self-decision): System must know itself. It must know the extent of its own resources and the resources it links to. A system must be aware of its internal components and external links in order to control and manage them;
  4. Self-organization: System structure driven by physics-type models without explicit pressure or involvement from outside the system;
  5. Self-creation (also called Self-assembly, Self-replication): System driven by ecological and social type models without explicit pressure or involvement from outside the system. A system's members are self-motivated and self-driven, generating complexity and order in a creative response to a continuously changing strategic demand;
  6. Self-management (also called self-governance): A system that manages itself without external intervention. What is being managed can vary depending on the system and application. Self-management also refers to a set of self-star processes, such as autonomic computing, rather than a single self-star process;
  7. Self-description (also called self-explanation or Self-representation): A system explains itself. It is capable of being understood (by humans) without further explanation.

IBM has set forth eight conditions that define an autonomic system: [10] [11]

The system must

  1. know itself in terms of what resources it has access to, what its capabilities and limitations are and how and why it is connected to other systems;
  2. be able to automatically configure and reconfigure itself depending on the changing computing environment;
  3. be able to optimize its performance to ensure the most efficient computing process;
  4. be able to work around encountered problems by either repairing itself or routing functions away from the trouble;
  5. detect, identify and protect itself against various types of attacks to maintain overall system security and integrity;
  6. adapt to its environment as it changes, interacting with neighboring systems and establishing communication protocols;
  7. rely on open standards and cannot exist in a proprietary environment;
  8. anticipate the demand on its resources while staying transparent to users.

Even though the purpose and thus the behaviour of autonomic systems vary from system to system, every autonomic system should be able to exhibit a minimum set of properties to achieve its purpose:

  1. Automatic: This essentially means being able to self-control its internal functions and operations. As such, an autonomic system must be self-contained and able to start up and operate without any manual intervention or external help. The knowledge required to bootstrap the system (Know-how) must be inherent to the system.
  2. Adaptive: An autonomic system must be able to change its operation (i.e., its configuration, state and functions). This will allow the system to cope with temporal and spatial changes in its operational context either long term (environment customisation/optimisation) or short term (exceptional conditions such as malicious attacks, faults, etc.).
  3. Aware: An autonomic system must be able to monitor (sense) its operational context as well as its internal state in order to be able to assess if its current operation serves its purpose. Awareness will control adaptation of its operational behaviour in response to context or state changes.

Evolutionary levels

IBM defined five evolutionary levels, or the autonomic deployment model, for the deployment of autonomic systems: level 1 is the basic level, representing the current situation in which systems are essentially managed manually. Levels 2–4 introduce increasingly automated management functions, while level 5 represents the ultimate goal of autonomic, self-managing systems. [12]

Design patterns

The design complexity of autonomic systems can be simplified by utilizing design patterns such as the model–view–controller (MVC) pattern to improve the separation of concerns by encapsulating functional concerns. [13]
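
For instance, a minimal sketch of how MVC might separate concerns in a self-managing component could look as follows; this is a generic illustration under assumed class names, not the specific architecture of Curry and Grace [13].

```python
# Generic, illustrative MVC split for a self-managing component;
# the class names are assumptions, not taken from reference [13].

class ResourceModel:
    """Model: the state of the managed resource."""
    def __init__(self):
        self.replicas = 1
        self.load = 0.0


class MonitoringView:
    """View: a read-only representation of the model used for monitoring."""
    def __init__(self, model):
        self.model = model

    def snapshot(self):
        return {"replicas": self.model.replicas, "load": self.model.load}


class AdaptationController:
    """Controller: the adaptation logic, kept separate from state and reporting."""
    def __init__(self, model, max_load=0.8):
        self.model = model
        self.max_load = max_load

    def adapt(self):
        if self.model.load > self.max_load:
            self.model.replicas += 1   # scale out when the resource is overloaded
```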

Control loops

A basic concept applied in autonomic systems is the closed control loop. This well-known concept stems from process control theory. Essentially, a closed control loop in a self-managing system monitors some resource (a software or hardware component) and autonomously tries to keep its parameters within a desired range.

According to IBM, hundreds or even thousands of these control loops are expected to work in a large-scale self-managing computer system.
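
A minimal sketch of such a loop, assuming a hypothetical managed resource that exposes queue_length() and set_workers(), might keep a work queue within a desired range by adjusting the number of workers:

```python
import time

# Minimal closed control loop; the resource interface (queue_length,
# set_workers) and the thresholds are illustrative assumptions.

def control_loop(resource, low=10, high=100, period_s=5.0, max_workers=32):
    workers = 1
    while True:
        backlog = resource.queue_length()        # monitor the resource
        if backlog > high and workers < max_workers:
            workers += 1                         # scale up when the backlog grows
        elif backlog < low and workers > 1:
            workers -= 1                         # scale down when the backlog shrinks
        resource.set_workers(workers)            # actuate the adjustment
        time.sleep(period_s)                     # wait for the next cycle
```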

Conceptual model

[Figure: conceptual model of an autonomic system]

A fundamental building block of an autonomic system is the sensing capability (Sensors Si), which enables the system to observe its external operational context. Inherent to an autonomic system is the knowledge of the Purpose (intention) and the Know-how to operate itself (e.g., bootstrapping, configuration knowledge, interpretation of sensory data, etc.) without external intervention. The actual operation of the autonomic system is dictated by the Logic, which is responsible for making the right decisions to serve its Purpose and is influenced by the observation of the operational context (based on sensor input).

This model highlights the fact that the operation of an autonomic system is purpose-driven. This includes its mission (e.g., the service it is supposed to offer), the policies (e.g., those that define the basic behaviour), and the "survival instinct". If seen as a control system, this would be encoded as a feedback error function or, in a heuristically assisted system, as an algorithm combined with a set of heuristics bounding its operational space.
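
As an illustration only, the feedback-error view mentioned above might be sketched as a setpoint error that the Logic drives towards zero, with a heuristic bound limiting how far any single adjustment can move the system; the function names and constants are assumptions.

```python
# Sketch of the feedback-error view of the conceptual model;
# names and constants are illustrative assumptions.

def feedback_error(setpoint, measurement):
    """Error the Logic tries to drive towards zero to serve its Purpose."""
    return setpoint - measurement


def bounded_adjustment(error, gain=0.5, max_step=10.0):
    """Proportional response, clamped by a heuristic bound on the step size."""
    step = gain * error
    return max(-max_step, min(max_step, step))
```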

See also

Autonomic networking
Bio-inspired computing
Self-management (computer science)
Self-tuning

References

  1. Kephart, J.O.; Chess, D.M. (2003), "The vision of autonomic computing", Computer, 36: 41–52, CiteSeerX 10.1.1.70.613, doi:10.1109/MC.2003.1160055
  2. Padovitz, Amir; Arkady Zaslavsky; Seng W. Loke (2003). "Awareness and agility for autonomic distributed systems: Platform-independent and publish-subscribe event-based communication for mobile agents". 14th International Workshop on Database and Expert Systems Applications, 2003. Proceedings. pp. 669–673. doi:10.1109/DEXA.2003.1232098. ISBN   978-0-7695-1993-7. S2CID   15846232.
  3. Jin, Xiaolong; Liu, Jiming (2004), "From Individual Based Modeling to Autonomy Oriented Computation", Agents and Computational Autonomy, Lecture Notes in Computer Science, vol. 2969, p. 151, doi:10.1007/978-3-540-25928-2_13, ISBN   978-3-540-22477-8
  4. Horn, Paul. "Autonomic Computing: IBM's Perspective on the State of Information Technology" (PDF). Archived from the original (PDF) on September 16, 2011.
  5. 'Trends in technology', survey, Berkeley University of California, USA, March 2002
  6. "What is Ubiquitous Computing (Pervasive Computing)?".
  7. Poslad, Stefan (2009). Autonomous Systems and Artificial Life. In: Ubiquitous Computing: Smart Devices, Smart Environments and Smart Interaction. Wiley. pp. 317–341. ISBN 978-0-470-03560-3. Archived from the original on 2014-12-10. Retrieved 2015-03-17.
  8. S-Cube Network. "Self-Healing System".
  9. Nami, M.R.; Sharifi, M. (2007). "A survey of autonomic computing systems". Intelligent Information Processing III. Third International Conference on Autonomic and Autonomous Systems (ICAS'07). IFIP International Federation for Information Processing. Vol. 228. pp. 26–30. doi: 10.1007/978-0-387-44641-7_11 . ISBN   978-0-387-44639-4. S2CID   6974127.
  10. "IBM Research | Autonomic Computing | Overview | The 8 Elements". Archived from the original on 2004-08-12. Retrieved 2021-12-27.
  11. "What is Autonomic Computing? Webopedia Definition". 22 June 2004.
  12. "IBM Unveils New Autonomic Computing Deployment Model". IBM . 2002-10-21.
  13. Curry, Edward; Grace, Paul (2008), "Flexible Self-Management Using the Model–View–Controller Pattern", IEEE Software, 25 (3): 84, doi:10.1109/MS.2008.60, S2CID   583784