Temporal isolation among virtual machines

Temporal isolation or performance isolation among virtual machines (VMs) refers to the capability of isolating the temporal behavior (or limiting the temporal interferences) of multiple VMs from one another, despite the fact that they run on the same physical host and share a set of physical resources such as processors, memory, and disks.

Introduction to the problem

One of the key advantages of using virtualization in server consolidation is the possibility to seamlessly "pack" multiple under-utilized systems into a single physical host, thus achieving better overall utilization of the available hardware resources. In fact, an entire operating system (OS), along with the applications running inside it, can be run in a virtual machine (VM).

However, when multiple VMs run concurrently on the same physical host, they share the available physical resources, including CPU(s), network adapter(s), disk(s) and memory. This adds a degree of unpredictability to the performance that each individual VM may exhibit, as compared to what is expected. For example, a VM with a temporary compute-intensive peak might disturb the other running VMs, causing a significant and undesirable temporary drop in their performance. As computing shifts towards cloud computing paradigms, in which resources (computing, storage, networking) may be rented remotely in virtualized form under precise service-level agreements, it is highly desirable that the performance of the virtualized resources be as stable and predictable as possible.

Possible solutions

Multiple techniques may be used to address the aforementioned problem. They aim to achieve some degree of temporal isolation across the concurrently running VMs at the various critical levels of scheduling: CPU scheduling, network scheduling and disk scheduling.

For the CPU, proper scheduling techniques at the hypervisor level can be used to limit the amount of computing power each VM may impose on a shared physical CPU or core. For example, on the Xen hypervisor, the BVT, credit-based and S-EDF schedulers have been proposed for controlling how computing power is distributed among competing VMs. [1] To obtain stable performance from virtualized applications, it is necessary to use scheduler configurations that are not work-conserving. On the KVM hypervisor, EDF-based scheduling strategies [2] have also been proposed for maintaining stable and predictable performance of virtualized applications. [3] [4] Finally, on a multi-core or multi-processor physical host, each VM may be deployed on a separate processor or core, so as to temporally isolate the performance of the various VMs.
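
As a minimal illustration of the last approach, the Python sketch below pins every thread of a QEMU/KVM process (vCPU threads as well as I/O threads) onto a dedicated set of host cores, using the standard Linux affinity interface exposed by os.sched_setaffinity; the process ID and core numbers in the usage line are purely hypothetical.

    import os

    def pin_vm_to_cores(qemu_pid, cores):
        """Pin all threads of a QEMU/KVM process to a dedicated set of cores.

        On Linux, CPU affinity is a per-thread attribute, so the function walks
        /proc/<pid>/task (one entry per thread) and sets the affinity mask of
        each thread individually.
        """
        task_dir = "/proc/%d/task" % qemu_pid
        for tid in os.listdir(task_dir):
            os.sched_setaffinity(int(tid), cores)

    # Hypothetical usage: reserve cores 2 and 3 for the VM whose QEMU process
    # has PID 12345 (requires sufficient privileges on the host).
    pin_vm_to_cores(12345, {2, 3})

Note that fully isolating the reserved cores would additionally require keeping other host activity (including interrupt handling) away from them, for example by means of cpusets or the isolcpus kernel boot parameter.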

For the network, traffic shaping techniques may be used to limit the amount of traffic that each VM can impose on the host. Also, multiple network adapters may be installed on the same physical host, with the virtualization layer configured so that each VM is granted exclusive access to one of them; for example, this is possible with the driver domains of the Xen hypervisor. Multi-queue network adapters support multiple VMs at the hardware level, with separate packet queues associated with the different hosted VMs (by means of the MAC addresses of the VMs); examples include Intel's Virtual Machine Device Queue (VMDq) devices. [5] Finally, real-time scheduling of the CPU may also be used to enhance the temporal isolation of network traffic generated by multiple VMs deployed on the same CPU. [6]
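
As an illustrative sketch of host-side traffic shaping, the Python fragment below attaches a token-bucket filter to the host-side tap interface backing a VM's virtual NIC by invoking the Linux tc tool; the interface name and the rate, burst and latency values are assumptions chosen for the example, not figures prescribed by the works cited above.

    import subprocess

    def shape_vm_interface(tap_if, rate="100mbit", burst="32kbit", latency="400ms"):
        """Limit the bandwidth a VM can impose on the host by attaching a
        token-bucket filter (tbf) queueing discipline to its tap interface."""
        subprocess.run(
            ["tc", "qdisc", "replace", "dev", tap_if, "root",
             "tbf", "rate", rate, "burst", burst, "latency", latency],
            check=True,
        )

    # Hypothetical usage: cap the VM attached to tap0 at 100 Mbit/s.
    shape_vm_interface("tap0")

Such shaping only bounds the traffic flowing through the shaped interface; the CPU time the host spends processing that traffic raises the accounting issue discussed next.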

When real-time scheduling of the CPU is used to control the amount of computing power reserved for each VM, one challenging problem is the proper accounting of the CPU time consumed by system-wide activities. For example, in the case of Xen, the services provided by Dom0 and by the driver domains are shared among the multiple VMs accessing them. Similarly, in the case of the KVM hypervisor, the workload imposed on the host OS for serving the network traffic of each individual guest OS may not be easy to attribute, because it mainly involves kernel-level device drivers and the networking infrastructure of the host OS. Techniques for mitigating these problems have been proposed for the Xen case. [7]

Along the lines of adaptive reservations, feedback-control strategies may be applied to dynamically adapt the amount of resources reserved for each virtual machine, so as to keep the performance of the virtualized application(s) stable. [8] Following the same trend of adaptiveness, whenever a virtualized system fails to meet the expected performance level (either because of unforeseen interference from other concurrently running VMs, or because of a deployment strategy that simply picked a machine with insufficient hardware resources), the affected virtual machines may be live-migrated, while they are running, to a more capable (or less loaded) physical host.
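
A minimal sketch of such a feedback loop is given below, assuming a hypothetical actuator (commented out as apply_cpu_reservation) that could resize the CPU share reserved for a VM, and a toy performance model in place of real measurements; it illustrates only the general control structure, not any specific scheme from the cited literature.

    def adapt_reservation(budget, target_ms, measured_ms,
                          gain=0.1, budget_min=0.05, budget_max=1.0):
        """One step of a proportional controller: enlarge the reserved CPU share
        (fraction of one core) when the observed response time exceeds the
        target, shrink it when there is slack, and clamp it to sane bounds."""
        error = (measured_ms - target_ms) / target_ms
        budget += gain * error * budget
        return max(budget_min, min(budget_max, budget))

    # Toy closed loop: the "sensor" below pretends that response time is
    # inversely proportional to the reserved share, so the loop converges
    # towards the share at which the 100 ms target is met (0.4 of a core).
    budget = 0.2
    for _ in range(30):
        measured_ms = 40.0 / budget             # hypothetical measurement
        budget = adapt_reservation(budget, target_ms=100.0, measured_ms=measured_ms)
        # apply_cpu_reservation(vm_id, budget)  # hypothetical actuator hook
    print("converged share: %.2f of a core" % budget)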

References

  1. Ludmila Cherkasova, Diwaker Gupta, Amin Vahdat, "Comparison of the Three CPU Schedulers in Xen" (PDF), ACM SIGMETRICS Performance Evaluation Review, Vol. 35, No. 2, September 2007, retrieved 30 June 2010.
  2. Fabio Checconi, Tommaso Cucinotta, Dario Faggioli, Giuseppe Lipari, "Hierarchical Multiprocessor CPU Reservations for the Linux Kernel", Proceedings of the 5th International Workshop on Operating Systems Platforms for Embedded Real-Time Applications (OSPERT 2009), Dublin, Ireland, June 2009.
  3. Tommaso Cucinotta, Gaetano Anastasi, Luca Abeni, "Respecting temporal constraints in virtualised services", Proceedings of the 2nd IEEE International Workshop on Real-Time Service-Oriented Architecture and Applications (RTSOAA 2009), Seattle, Washington, July 2009.
  4. Tommaso Cucinotta, Gaetano Anastasi, Luca Abeni, "Real-Time Virtual Machines", Proceedings of the 29th IEEE Real-Time Systems Symposium (RTSS 2008), Work-in-Progress Session, Barcelona, Spain, December 2008.
  5. Shefali Chinni, Radhakrishna Hiremane, "Virtual Machine Device Queues", Intel Virtualization Technology White Paper, 2007.
  6. Tommaso Cucinotta, Dhaval Giani, Dario Faggioli, Fabio Checconi, "Providing Performance Guarantees to Virtual Machines using Real-Time Scheduling", Proceedings of the 5th Workshop on Virtualization and High-Performance Cloud Computing (VHPC 2010), Ischia (Naples), Italy, August 2010.
  7. Diwaker Gupta, Lucy Cherkasova, Robert Gardner, Amin Vahdat, "Enforcing Performance Isolation Across Virtual Machines in Xen", Proceedings of the 7th International Middleware Conference (Middleware 2006), Lecture Notes in Computer Science, Vol. 4290, pp. 342-362, Melbourne, Australia, November 2006.
  8. Ripal Nathuji, Aman Kansal, Alireza Ghaffarkhah, "Q-Clouds: Managing Performance Interference Effects for QoS-Aware Clouds", Proceedings of the 5th European Conference on Computer Systems (EuroSys 2010), Paris, France, April 2010.