Virtual machine

Last updated

In computing, a virtual machine (VM) is an emulation of a computer system. Virtual machines are based on computer architectures and provide functionality of a physical computer. Their implementations may involve specialized hardware, software, or a combination.

Emulator system that emulates a real system such that the behavior closely resembles the behavior of the real system

In computing, an emulator is hardware or software that enables one computer system to behave like another computer system. An emulator typically enables the host system to run software or use peripheral devices designed for the guest system. Emulation refers to the ability of a computer program in an electronic device to emulate another program or device. Many printers, for example, are designed to emulate Hewlett-Packard LaserJet printers because so much software is written for HP printers. If a non-HP printer emulates an HP printer, any software written for a real HP printer will also run in the non-HP printer emulation and produce equivalent printing. Since at least the 1990s, many video game enthusiasts have used emulators to play classic arcade games from the 1980s using the games' original 1980s machine code and data, which is interpreted by a current-era system.

Contents

There are different kinds of virtual machines, each with different functions:

Full virtualization

In computer science, virtualization is a modern technique developed in late 1990s and is different from simulation and emulation. Virtualization employs techniques used to create instances of an environment, as opposed to simulation, which models the environment; or emulation, which replicates the target environment such as certain kind of virtual machine environment. Full virtualization requires that every salient feature of the hardware be reflected into one of several virtual machines – including the full instruction set, input/output operations, interrupts, memory access, and whatever other elements are used by the software that runs on the bare machine, and that is intended to run in a virtual machine. In such an environment, any software capable of execution on the raw hardware can be run in the virtual machine and, in particular, any operating systems. The obvious test of full virtualization is whether an operating system intended for stand-alone use can successfully run inside a virtual machine.

Operating system Software that manages computer hardware resources

An operating system (OS) is system software that manages computer hardware, software resources, and provides common services for computer programs.

A hypervisor or virtual machine monitor (VMM) is computer software, firmware or hardware that creates and runs virtual machines. A computer on which a hypervisor runs one or more virtual machines is called a host machine, and each virtual machine is called a guest machine. The hypervisor presents the guest operating systems with a virtual operating platform and manages the execution of the guest operating systems. Multiple instances of a variety of operating systems may share the virtualized hardware resources: for example, Linux, Windows, and macOS instances can all run on a single physical x86 machine. This contrasts with operating-system-level virtualization, where all instances must share a single kernel, though the guest operating systems can differ in user space, such as different Linux distributions with the same kernel.

Some virtual machines, such as QEMU, are designed to also emulate different architectures and allow execution of software applications and operating systems written for another CPU or architecture. Operating-system-level virtualization allows the resources of a computer to be partitioned via the kernel. The terms are not universally interchangeable.

QEMU Free virtualization and emulation software

QEMU is a free and open-source emulator that performs hardware virtualization.

Definitions

A "virtual machine" was originally defined by Popek and Goldberg as "an efficient, isolated duplicate of a real computer machine." [1] Current use includes virtual machines that have no direct correspondence to any real hardware. [2] The physical, "real-world" hardware running the VM is generally referred to as the 'host', and the virtual machine emulated on that machine is generally referred to as the 'guest'. A host can emulate several guests, each of which can emulate different operating systems and hardware platforms.

The Popek and Goldberg virtualization requirements are a set of conditions sufficient for a computer architecture to support system virtualization efficiently. They were introduced by Gerald J. Popek and Robert P. Goldberg in their 1974 article "Formal Requirements for Virtualizable Third Generation Architectures". Even though the requirements are derived under simplifying assumptions, they still represent a convenient way of determining whether a computer architecture supports efficient virtualization and provide guidelines for the design of virtualized computer architectures.

System virtual machines

The desire to run multiple operating systems was the initial motive for virtual machines, so as to allow time-sharing among several single-tasking operating systems. In some respects, a system virtual machine can be considered a generalization of the concept of virtual memory that historically preceded it. IBM's CP/CMS, the first systems to allow full virtualization, implemented time sharing by providing each user with a single-user operating system, the Conversational Monitor System (CMS). Unlike virtual memory, a system virtual machine entitled the user to write privileged instructions in their code. This approach had certain advantages, such as adding input/output devices not allowed by the standard system. [2]

Virtual memory Operating System level memory management technique

In computing, virtual memory is a memory management technique that provides an "idealized abstraction of the storage resources that are actually available on a given machine" which "creates the illusion to users of a very large (main) memory."

CP/CMS is a discontinued time-sharing operating system of the late 60s and early 70s, known for its excellent performance and advanced features. It had three distinct versions:

The Conversational Monitor System is a simple interactive single-user operating system. CMS was originally developed as part of IBM's CP/CMS operating system, which went into production use in 1967. CMS is part of IBM's VM family, which runs on IBM mainframe computers. VM was first announced in 1972, and is still in use today as z/VM.

As technology evolves virtual memory for purposes of virtualization, new systems of memory overcommitment may be applied to manage memory sharing among multiple virtual machines on one computer operating system. It may be possible to share memory pages that have identical contents among multiple virtual machines that run on the same physical machine, what may result in mapping them to the same physical page by a technique termed kernel same-page merging (KSM). This is especially useful for read-only pages, such as those holding code segments, which is the case for multiple virtual machines running the same or similar software, software libraries, web servers, middleware components, etc. The guest operating systems do not need to be compliant with the host hardware, thus making it possible to run different operating systems on the same computer (e.g., Windows, Linux, or prior versions of an operating system) to support future software. [3]

Memory overcommitment is a concept in computing that covers the assignment of more memory to virtual computing devices than the physical machine they are hosted on actually has. This is possible because virtual machines do not necessarily use as much memory at any one point as they are assigned, creating a buffer. If four virtual machines each have 1 GB of memory on a physical machine with 4 GB of memory, but those virtual machines are only using 500 MB, it is possible to create additional virtual machines that take advantage of the 500 MB each existing machine is leaving free. Memory swapping is then used to handle spikes in memory usage. The disadvantage of this approach is that memory swap files are slower to read from than 'actual' memory, which can lead to performance drops.

In computing, kernel same-page merging is a kernel feature that makes it possible for a hypervisor system to share identical memory pages amongst different processes or virtualized guests. While not directly linked, Kernel-based Virtual Machine (KVM) can use KSM to merge memory pages occupied by virtual machines.

Microsoft Windows is a group of several graphical operating system families, all of which are developed, marketed and sold by Microsoft. Each family caters to a certain sector of the computing industry. Active Microsoft Windows families include Windows NT and Windows IoT; these may encompass subfamilies, e.g. Windows Server or Windows Embedded Compact. Defunct Microsoft Windows families include Windows 9x, Windows Mobile and Windows Phone.

The use of virtual machines to support separate guest operating systems is popular in regard to embedded systems. A typical use would be to run a real-time operating system simultaneously with a preferred complex operating system, such as Linux or Windows. Another use would be for novel and unproven software still in the developmental stage, so it runs inside a sandbox. Virtual machines have other advantages for operating system development and may include improved debugging access and faster reboots. [4]

Multiple VMs running their own guest operating system are frequently engaged for server consolidation. [5]

Process virtual machines

A process VM, sometimes called an application virtual machine, or Managed Runtime Environment (MRE), runs as a normal application inside a host OS and supports a single process. It is created when that process is started and destroyed when it exits. Its purpose is to provide a platform-independent programming environment that abstracts away details of the underlying hardware or operating system and allows a program to execute in the same way on any platform.

A process VM provides a high-level abstraction  that of a high-level programming language (compared to the low-level ISA abstraction of the system VM). Process VMs are implemented using an interpreter; performance comparable to compiled programming languages can be achieved by the use of just-in-time compilation.[ citation needed ]

This type of VM has become popular with the Java programming language, which is implemented using the Java virtual machine. Other examples include the Parrot virtual machine and the .NET Framework, which runs on a VM called the Common Language Runtime. All of them can serve as an abstraction layer for any computer language.

A special case of process VMs are systems that abstract over the communication mechanisms of a (potentially heterogeneous) computer cluster. Such a VM does not consist of a single process, but one process per physical machine in the cluster. They are designed to ease the task of programming concurrent applications by letting the programmer focus on algorithms rather than the communication mechanisms provided by the interconnect and the OS. They do not hide the fact that communication takes place, and as such do not attempt to present the cluster as a single machine.[ citation needed ]

Unlike other process VMs, these systems do not provide a specific programming language, but are embedded in an existing language; typically such a system provides bindings for several languages (e.g., C and Fortran).[ citation needed ] Examples are Parallel Virtual Machine (PVM) and Message Passing Interface (MPI). They are not strictly virtual machines because the applications running on top still have access to all OS services and are therefore not confined to the system model.

History

Both system virtual machines and process virtual machines date to the 1960s and continue to be areas of active development.

System virtual machines grew out of time-sharing, as notably implemented in the Compatible Time-Sharing System (CTSS). Time-sharing allowed multiple users to use a computer concurrently: each program appeared to have full access to the machine, but only one program was executed at the time, with the system switching between programs in time slices, saving and restoring state each time. This evolved into virtual machines, notably via IBM's research systems: the M44/44X, which used partial virtualization, and the CP-40 and SIMMON, which used full virtualization, and were early examples of hypervisors. The first widely available virtual machine architecture was the CP-67/CMS (see History of CP/CMS for details). An important distinction was between using multiple virtual machines on one host system for time-sharing, as in M44/44X and CP-40, and using one virtual machine on a host system for prototyping, as in SIMMON. Emulators, with hardware emulation of earlier systems for compatibility, date back to the IBM System/360 in 1963, [6] [7] while the software emulation (then-called "simulation") predates it.

Process virtual machines arose originally as abstract platforms for an intermediate language used as the intermediate representation of a program by a compiler; early examples date to around 1966. An early 1966 example was the O-code machine, a virtual machine that executes O-code (object code) emitted by the front end of the BCPL compiler. This abstraction allowed the compiler to be easily ported to a new architecture by implementing a new back end that took the existing O-code and compiled it to machine code for the underlying physical machine. The Euler language used a similar design, with the intermediate language named P (portable). [8] This was popularized around 1970 by Pascal, notably in the Pascal-P system (1973) and Pascal-S compiler (1975), in which it was termed p-code and the resulting machine as a p-code machine. This has been influential, and virtual machines in this sense have been often generally called p-code machines. In addition to being an intermediate language, Pascal p-code was also executed directly by an interpreter implementing the virtual machine, notably in UCSD Pascal (1978); this influenced later interpreters, notably the Java virtual machine (JVM). Another early example was SNOBOL4 (1967), which was written in the SNOBOL Implementation Language (SIL), an assembly language for a virtual machine, which was then targeted to physical machines by transpiling to their native assembler via a macro assembler. [9] Macros have since fallen out of favor, however, so this approach has been less influential. Process virtual machines were a popular approach to implementing early microcomputer software, including Tiny BASIC and adventure games, from one-off implementations such as Pyramid 2000 to a general-purpose engine like Infocom's z-machine, which Graham Nelson argues is "possibly the most portable virtual machine ever created". [10]

Significant advances occurred in the implementation of Smalltalk-80, [11] particularly the Deutsch/Schiffmann implementation [12] which pushed just-in-time (JIT) compilation forward as an implementation approach that uses process virtual machine. [13] Later notable Smalltalk VMs were VisualWorks, the Squeak Virtual Machine, [14] and Strongtalk. [15] A related language that produced a lot of virtual machine innovation was the Self programming language, [16] which pioneered adaptive optimization [17] and generational garbage collection. These techniques proved commercially successful in 1999 in the HotSpot Java virtual machine. [18] Other innovations include having a register-based virtual machine, to better match the underlying hardware, rather than a stack-based virtual machine, which is a closer match for the programming language; in 1995, this was pioneered by the Dis virtual machine for the Limbo language. OpenJ9 is an alternative for HotSpot JVM in OpenJDK and is an open source eclipse project claiming better startup and less resource consumption compared to HotSpot.

Full virtualization

Logical diagram of full virtualization Hardware Virtualization (copy).svg
Logical diagram of full virtualization

In full virtualization, the virtual machine simulates enough hardware to allow an unmodified "guest" OS (one designed for the same instruction set) to be run in isolation. This approach was pioneered in 1966 with the IBM CP-40 and CP-67, predecessors of the VM family.

Examples outside the mainframe field include Parallels Workstation, Parallels Desktop for Mac, VirtualBox, Virtual Iron, Oracle VM, Virtual PC, Virtual Server, Hyper-V, VMware Workstation, VMware Server (discontinued, formerly called GSX Server), VMware ESXi, QEMU, Adeos, Mac-on-Linux, Win4BSD, Win4Lin Pro, and Egenera vBlade technology.

Hardware-assisted virtualization

In hardware-assisted virtualization, the hardware provides architectural support that facilitates building a virtual machine monitor and allows guest OSes to be run in isolation. [19] Hardware-assisted virtualization was first introduced on the IBM System/370 in 1972,[ citation needed ] for use with VM/370, the first virtual machine operating system offered by IBM as an official product.

In 2005 and 2006, Intel and AMD provided additional hardware to support virtualization. Sun Microsystems (now Oracle Corporation) added similar features in their UltraSPARC T-Series processors in 2005. Examples of virtualization platforms adapted to such hardware include KVM, VMware Workstation, VMware Fusion, Hyper-V, Windows Virtual PC, Xen, Parallels Desktop for Mac, Oracle VM Server for SPARC, VirtualBox and Parallels Workstation.

In 2006, first-generation 32- and 64-bit x86 hardware support was found to rarely offer performance advantages over software virtualization. [20]

Operating-system-level virtualization

In operating-system-level virtualization, a physical server is virtualized at the operating system level, enabling multiple isolated and secure virtualized servers to run on a single physical server. The "guest" operating system environments share the same running instance of the operating system as the host system. Thus, the same operating system kernel is also used to implement the "guest" environments, and applications running in a given "guest" environment view it as a stand-alone system. The pioneer implementation was FreeBSD jails; other examples include Docker, Solaris Containers, OpenVZ, Linux-VServer, LXC, AIX Workload Partitions, Parallels Virtuozzo Containers, and iCore Virtual Accounts.

See also

Related Research Articles

UCSD Pascal Pascal programming language system

UCSD Pascal is a Pascal programming language system that runs on the UCSD p-System, a portable, highly machine-independent operating system. UCSD Pascal was first released in 1977. It was developed at the University of California, San Diego (UCSD).

Computer operating systems (OSes) provide a set of functions needed and used by most application programs on a computer, and the links needed to control and synchronize computer hardware. On the first computers, with no operating system, every program needed the full hardware specification to run correctly and perform standard tasks, and its own drivers for peripheral devices like printers and punched paper card readers. The growing complexity of hardware and application programs eventually made operating systems a necessity for everyday use.

A computing platform or digital platform is the environment in which a piece of software is executed. It may be the hardware or the operating system (OS), even a web browser and associated application programming interfaces, or other underlying software, as long as the program code is executed with it. Computing platforms have different abstraction levels, including a computer architecture, an OS, or runtime libraries. A computing platform is the stage on which computer programs can run.

VM (operating system) Family of IBM operating systems

VM is a family of IBM virtual machine operating systems used on IBM mainframes System/370, System/390, zSeries, System z and compatible systems, including the Hercules emulator for personal computers.

HotSpot, released as Java HotSpot Performance Engine, is a Java virtual machine for desktop and server computers, maintained and distributed by Oracle Corporation. It features improved performance via methods such as just-in-time compilation and adaptive optimization.

In computing, hardware-assisted virtualization is a platform virtualization approach that enables efficient full virtualization using help from hardware capabilities, primarily from the host processors. Full virtualization is used to simulate a complete hardware environment, or virtual machine, in which an unmodified guest operating system effectively executes in complete isolation. Hardware-assisted virtualization was added to x86 processors in 2005 and 2006 (respectively).

The IBM M44/44X was an experimental computer system from the mid-1960s, designed and operated at IBM's Thomas J. Watson Research Center at Yorktown Heights, New York. It was based on an IBM 7044, and simulated multiple 7044 virtual machines, using both hardware and software. Key team members were Dave Sayre and Rob Nelson. This was a groundbreaking machine, used to explore paging, the virtual machine concept, and computer performance measurement. It was purely a research system, and was cited in 1981 by Peter Denning as an outstanding example of experimental computer science.

Hardware virtualization is the virtualization of computers as complete hardware platforms, certain logical abstractions of their componentry, or only the functionality required to run various operating systems. Virtualization hides the physical characteristics of a computing platform from the users, presenting instead an abstract computing platform. At its origins, the software that controlled virtualization was called a "control program", but the terms "hypervisor" or "virtual machine monitor" became preferred over time.

VMmark is a freeware virtual machine benchmark software suite from VMware, Inc. The suite measures the performance of virtualized servers while running under load on a set of physical hardware. VMmark was independently developed by VMware.

A virtual firewall (VF) is a network firewall service or appliance running entirely within a virtualized environment and which provides the usual packet filtering and monitoring provided via a physical network firewall. The VF can be realized as a traditional software firewall on a guest virtual machine already running, a purpose-built virtual security appliance designed with virtual network security in mind, a virtual switch with additional security capabilities, or a managed kernel process running within the host hypervisor.

Charon is the brand name of a group of software products able to emulate several CPU architectures. The emulators available under this brand mostly cover the Digital Equipment DEC hardware platforms PDP-11, VAX, and AlphaServer, which support many of the legacy operating systems, including Tru64 and OpenVMS. The product range also includes virtualization solutions for HP 3000 using MPE/iX and SPARC. Charon software products have been developed by the Swiss software company Stromasys SA, which has its headquarters in Cointrin, near Geneva.

VM-aware storage (VAS) is computer data storage designed specifically for managing storage for virtual machines (VMs) within a data center. The goal is to provide storage that is simpler to use with functionality better suited for VMs compared with general-purpose storage. VM-aware storage allows storage to be managed as an integrated part of managing VMs rather than as logical unit numbers (LUNs) or volumes that are separately configured and managed.

In computing, a system virtual machine is a virtual machine provides a complete system platform which supports the execution of a complete operating system (OS). These usually emulate an existing architecture, and are built with the purpose of either providing a platform to run programs where the real hardware is not available for use, or of having multiple instances of virtual machines leading to more efficient use of computing resources, both in terms of energy consumption and cost effectiveness, or both. A VM was originally defined by Popek and Goldberg as "an efficient, isolated duplicate of a real machine".

Unikernel specialised, single address space machine images

A unikernel is a specialised, single address space machine image constructed by using library operating systems. A developer selects, from a modular stack, the minimal set of libraries which correspond to the OS constructs required for their application to run. These libraries are then compiled with the application and configuration code to build sealed, fixed-purpose images (unikernels) which run directly on a hypervisor or hardware without an intervening OS such as Linux or Windows.

References

  1. Popek, Gerald J.; Goldberg, Robert P. (1974). "Formal requirements for virtualizable third generation architectures" (PDF). Communications of the ACM . 17 (7): 412–421. doi:10.1145/361011.361073.
  2. 1 2 Smith, James E.; Nair, Ravi (2005). "The Architecture of Virtual Machines". Computer . 38 (5): 32–38, 395–396. doi:10.1109/MC.2005.173.
  3. Oliphant, Patrick. "Virtual Machines". VirtualComputing. Archived from the original on 2016-07-29. Retrieved 2015-09-23. Some people use that capability to set up a separate virtual machine running Windows on a Mac, giving them access to the full range of applications available for both platforms.
  4. "Super Fast Server Reboots – Another reason Virtualization rocks". vmwarez.com. 2006-05-09. Archived from the original on 2006-06-14. Retrieved 2013-06-14.
  5. "Server Consolidation and Containment With Virtual Infrastructure" (PDF). VMware. 2007. Archived (PDF) from the original on 2013-12-28. Retrieved 2015-09-29.
  6. Pugh, Emerson W. (1995). Building IBM: Shaping an Industry and Its Technology. MIT. p. 274. ISBN   978-0-262-16147-3.
  7. Pugh, Emerson W.; et al. (1991). IBM's 360 and Early 370 Systems. MIT. pp. 160–161. ISBN   978-0-262-16123-7.
  8. Wirth, Niklaus Emil; Weber, Helmut (1966). EULER: a generalization of ALGOL, and its formal definition: Part II, Communications of the Association for Computing Machinery. 9. New York: ACM. pp. 89–99.
  9. Griswold, Ralph E. The Macro Implementation of SNOBOL4. San Francisco, CA: W. H. Freeman and Company, 1972 ( ISBN   0-7167-0447-1), Chapter 1.
  10. Nelson, Graham A. "About Interpreters". Inform website. Archived from the original on 2009-12-03. Retrieved 2009-11-07.
  11. Goldberg, Adele; Robson, David (1983). Smalltalk-80: The Language and its Implementation . Addison-Wesley Series in Computer Science. Addison-Wesley. ISBN   978-0-201-11371-6.
  12. Deutsch, L. Peter; Schiffman, Allan M. (1984). "Efficient implementation of the Smalltalk-80 system". POPL. Salt Lake City, Utah: ACM. doi:10.1145/800017.800542. ISBN   0-89791-125-3.
  13. Aycock, John (2003). "A brief history of just-in-time". ACM Comput. Surv. 35 (2): 97–113. doi:10.1145/857076.857077.
  14. Ingalls, Jr., Daniel "Dan" Henry Holmes; Kaehler, Ted; Maloney, John; Wallace, Scott; Kay, Alan Curtis (1997). "Back to the future: the story of Squeak, a practical Smalltalk written in itself". OOPSLA '97: Proceedings of the 12th ACM SIGPLAN conference on Object-oriented programming, systems, languages, and applications. New York, NY, USA: ACM Press. pp. 318–326. doi:10.1145/263698.263754. ISBN   0-89791-908-4.
  15. Bracha, Gilad; Griswold, David (1993). "Strongtalk: Typechecking Smalltalk in a Production Environment". Proceedings of the Eighth Annual Conference on Object-oriented Programming Systems, Languages, and Applications. OOPSLA '93. New York, NY, USA: ACM. pp. 215–230. doi:10.1145/165854.165893. ISBN   978-0-89791-587-8.
  16. Ungar, David Michael; Smith, Randall B. (December 1987). "Self: The power of simplicity". ACM SIGPLAN Notices. 22 (12): 227–242. doi:10.1145/38807.38828. ISSN   0362-1340.
  17. Hölzle, Urs; Ungar, David Michael (1994). "Optimizing dynamically-dispatched calls with run-time type feedback". PLDI. Orlando, Florida, United States: ACM. pp. 326–336. doi:10.1145/178243.178478. ISBN   0-89791-662-X.
  18. Paleczny, Michael; Vick, Christopher; Click, Cliff (2001). "The Java HotSpot server compiler". Proceedings of the Java Virtual Machine Research and Technology Symposium on Java Virtual Machine Research and Technology Symposium. 1. Monterey, California: USENIX Association.
  19. Uhlig, Rich; Neiger, Gil; Rodgers, Dion; Santoni, Amy L.; Martins, Fernando C. M.; Anderson, Andrew V.; Bennett, Steven M.; Kägi, Alain; Leung, Felix H.; Smith, Larry (May 2005). "Intel virtualization technology". Computer. 38 (5): 48–56. doi:10.1109/MC.2005.163.
  20. Adams, Keith; Agesen, Ole (2006-10-21). A Comparison of Software and Hardware Techniques for x86 Virtualization (PDF). ASPLOS’06 21–25 October 2006. San Jose, California, USA. Archived (PDF) from the original on 2010-08-20. Surprisingly, we find that the first-generation hardware support rarely offers performance advantages over existing software techniques. We ascribe this situation to high VMM/guest transition costs and a rigid programming model that leaves little room for software flexibility in managing either the frequency or cost of these transitions.

Further reading