OS-level virtualization is an operating system (OS) virtualization paradigm in which the kernel allows the existence of multiple isolated user space instances, including containers (LXC, Solaris Containers, AIX WPARs, HP-UX SRP Containers, Docker, Podman), zones (Solaris Containers), virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels (DragonFly BSD), and jails (FreeBSD jail and chroot). [1] Such instances may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources (connected devices, files and folders, network shares, CPU power, quantifiable hardware capabilities) of that computer. Programs running inside a container can only see the container's contents and devices assigned to the container.
On Unix-like operating systems, this feature can be seen as an advanced implementation of the standard chroot mechanism, which changes the apparent root folder for the current running process and its children. In addition to isolation mechanisms, the kernel often provides resource-management features to limit the impact of one container's activities on other containers. Linux containers are all based on the virtualization, isolation, and resource management mechanisms provided by the Linux kernel, notably Linux namespaces and cgroups. [2]
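As a rough illustration of the namespace mechanism mentioned above, the following Python sketch lists the namespaces the current process belongs to by reading the symbolic links that Linux exposes under /proc/self/ns. The function name list_namespaces is made up for this example, and on systems without that directory the function simply returns an empty mapping.

```python
import os


def list_namespaces(pid="self"):
    """Return {namespace_type: link_target} for a process, read from /proc.

    On Linux each entry in /proc/<pid>/ns is a symbolic link whose target
    looks like 'mnt:[4026531840]'. Processes that are in the same namespace
    see the same target for that entry.
    """
    ns_dir = f"/proc/{pid}/ns"
    namespaces = {}
    try:
        entries = sorted(os.listdir(ns_dir))
    except OSError:
        # Not Linux, no such process, or /proc is not mounted.
        return namespaces
    for name in entries:
        try:
            namespaces[name] = os.readlink(os.path.join(ns_dir, name))
        except OSError:
            # Some entries may be unreadable without extra privileges.
            continue
    return namespaces


if __name__ == "__main__":
    for name, target in list_namespaces().items():
        print(f"{name:20s} {target}")
```

Running the script on a host and again inside a container would typically show different targets for the isolated namespace types. The sketch only inspects namespaces and does not create them.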
Although the word container most commonly refers to OS-level virtualization, it is sometimes used to refer to fuller virtual machines operating in varying degrees of concert with the host OS, such as Microsoft's Hyper-V containers. For an overview of virtualization since 1960, see Timeline of virtualization technologies.
On ordinary operating systems for personal computers, a computer program can see (even though it might not be able to access) all the system's resources. They include connected devices, files and folders, network shares, CPU power, and other quantifiable hardware capabilities.
The operating system may be able to allow or deny access to such resources based on which program requests them and the user account in whose context it runs. The operating system may also hide those resources, so that when the computer program enumerates them, they do not appear in the enumeration results. Nevertheless, from a programming point of view, the computer program has interacted with those resources and the operating system has managed an act of interaction.
With operating-system-level virtualization, or containerization, it is possible to run programs within containers, to which only parts of these resources are allocated. A program expecting to see the whole computer, once run inside a container, can only see the allocated resources and believes them to be all that is available. Several containers can be created on each operating system, to each of which a subset of the computer's resources is allocated. Each container may contain any number of computer programs. These programs may run concurrently or separately, and may even interact with one another.
Containerization has similarities to application virtualization: in the latter, only one computer program is placed in an isolated container and the isolation applies only to the file system.
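The allocation idea in this paragraph can be pictured with a toy model. The Python sketch below is not how any real kernel implements containers. The Host and Container classes and the resource names are invented for illustration, and the only point is that a program enumerating resources from inside a container gets back just the subset allocated to it.

```python
class Host:
    """Toy host that owns every resource, keyed by name."""

    def __init__(self, resources):
        self.resources = dict(resources)

    def create_container(self, name, allocated):
        """Create a container that is given only the named resources."""
        missing = set(allocated) - set(self.resources)
        if missing:
            raise KeyError(f"host has no such resources: {sorted(missing)}")
        return Container(name, {key: self.resources[key] for key in allocated})


class Container:
    """Toy container: programs inside can enumerate only what was allocated."""

    def __init__(self, name, visible):
        self.name = name
        self._visible = visible

    def enumerate_resources(self):
        """What a program running inside the container would see."""
        return sorted(self._visible)


# Hypothetical resources of one machine.
host = Host({
    "cpu0": "core",
    "cpu1": "core",
    "/srv/web": "directory",
    "/srv/db": "directory",
    "eth0": "network interface",
})
web = host.create_container("web", ["cpu0", "/srv/web", "eth0"])
db = host.create_container("db", ["cpu1", "/srv/db"])

print(web.enumerate_resources())
print(db.enumerate_resources())
```

Each container's listing omits the resources given to the other, which is the sense in which a containerized program "believes" its allocation to be the whole machine.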
Operating-system-level virtualization is commonly used in virtual hosting environments, where it is useful for securely allocating finite hardware resources among a large number of mutually distrusting users. System administrators may also use it to consolidate server hardware by moving services from separate hosts into containers on a single server.
Other typical scenarios include separating several programs into separate containers for improved security, hardware independence, and added resource management features. [3] The improved security provided by the use of a chroot mechanism, however, is not perfect. [4] Operating-system-level virtualization implementations capable of live migration can also be used for dynamic load balancing of containers between nodes in a cluster.
Operating-system-level virtualization usually imposes less overhead than full virtualization because programs in OS-level virtual partitions use the operating system's normal system call interface and do not need to be subjected to emulation or be run in an intermediate virtual machine, as is the case with full virtualization (such as VMware ESXi, QEMU, or Hyper-V) and paravirtualization (such as Xen or User-mode Linux). This form of virtualization also does not require hardware support for efficient performance.
Operating-system-level virtualization is not as flexible as other virtualization approaches since it cannot host a guest operating system different from the host one, or a different guest kernel. For example, with Linux, different distributions are fine, but other operating systems such as Windows cannot be hosted. Operating systems using variable input systematics are subject to limitations within the virtualized architecture. Adaptation methods including cloud-server relay analytics maintain the OS-level virtual environment within these applications. [5]
Solaris partially overcomes the limitation described above with its branded zones feature, which provides the ability to run an environment within a container that emulates an older Solaris 8 or 9 version in a Solaris 10 host. Linux branded zones (referred to as "lx" branded zones) are also available on x86-based Solaris systems, providing a complete Linux user space and support for the execution of Linux applications; additionally, Solaris provides utilities needed to install Red Hat Enterprise Linux 3.x or CentOS 3.x Linux distributions inside "lx" zones. [6] [7] However, in 2010 Linux branded zones were removed from Solaris; in 2014 they were reintroduced in Illumos, which is the open source Solaris fork, supporting 32-bit Linux kernels. [8]
Some implementations provide file-level copy-on-write (CoW) mechanisms. (Most commonly, a standard file system is shared between partitions, and those partitions that change the files automatically create their own copies.) This is easier to back up, more space-efficient and simpler to cache than the block-level copy-on-write schemes common on whole-system virtualizers. Whole-system virtualizers, however, can work with non-native file systems and create and roll back snapshots of the entire system state.
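The file-level copy-on-write scheme described above can be sketched in a few lines of Python. This is a simplified in-memory model rather than the behavior of any particular implementation, and the SharedBase and Partition classes are hypothetical names. Files are read from a shared tree until a partition writes to them, at which point that partition gets its own copy.

```python
class SharedBase:
    """Read-only file tree shared by all partitions: path -> contents."""

    def __init__(self, files):
        self.files = dict(files)


class Partition:
    """A partition that keeps a private copy of each file it changes."""

    def __init__(self, base):
        self._base = base
        self._private = {}  # only the files this partition has changed

    def read(self, path):
        # Prefer the private copy; fall back to the shared file.
        if path in self._private:
            return self._private[path]
        return self._base.files[path]

    def write(self, path, contents):
        # The shared file is never modified; the change is stored in the
        # partition's own copy.
        self._private[path] = contents

    def private_files(self):
        """Files that differ from the shared base, e.g. for a backup."""
        return sorted(self._private)


base = SharedBase({"/etc/motd": "welcome", "/bin/sh": "<shell binary>"})
a = Partition(base)
b = Partition(base)
a.write("/etc/motd", "hello from a")

print(a.read("/etc/motd"))
print(b.read("/etc/motd"))
print(a.private_files())
```

Because each partition stores only the files it has changed, a backup in this model needs to capture just the output of private_files(), which hints at why the file-level approach is described as easier to back up and more space-efficient.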
Mechanism | Operating system | License | Actively developed since or between | File system isolation | Copy on write | Disk quotas | I/O rate limiting | Memory limits | CPU quotas | Network isolation | Nested virtualization | Partition checkpointing and live migration | Root privilege isolation |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
chroot | Most UNIX-like operating systems | Varies by operating system | 1982 | Partial [a] | No | No | No | No | No | No | Yes | No | No |
Docker | Linux, [10] Windows x64, [11] macOS [12] | Apache License 2.0 | 2013 | Yes | Yes | Not directly | Yes (since 1.10) | Yes | Yes | Yes | Yes | Only in experimental mode with CRIU | Yes (since 1.10) |
Linux-VServer (security context) | Linux | GNU GPLv2 | 2001 | Yes | Yes | Yes | Yes [b] | Yes | Yes | Partial [c] | ? | No | Partial [d] |
lmctfy | Linux | Apache License 2.0 | 2013–2015 | Yes | Yes | Yes | Yes [b] | Yes | Yes | Partial [c] | ? | No | Partial [d] |
LXC | Linux | GNU GPLv2 | 2008 | Yes [14] | Yes | Partial [e] | Partial [f] | Yes | Yes | Yes | Yes | Yes | Yes [14] |
Singularity | Linux | BSD Licence | 2015 [15] | Yes [16] | Yes | Yes | No | No | No | No | No | No | Yes [17] |
OpenVZ | Linux | GNU GPLv2 | 2005 | Yes | Yes [18] | Yes | Yes [g] | Yes | Yes | Yes [h] | Partial [i] | Yes | Yes [j] |
Virtuozzo | Linux, Windows | Trialware | 2000 [22] | Yes | Yes | Yes | Yes [k] | Yes | Yes | Yes [h] | Partial [l] | Yes | Yes |
Solaris Containers (Zones) | illumos (OpenSolaris), Solaris | CDDL, Proprietary | 2004 | Yes | Yes (ZFS) | Yes | Partial [m] | Yes | Yes | Yes [n] [25] [26] | Partial [o] | Partial [p] [q] | Yes [r] |
FreeBSD jail | FreeBSD, DragonFly BSD | BSD License | 2000 [28] | Yes | Yes (ZFS) | Yes [s] | Yes | Yes [29] | Yes | Yes [30] | Yes | Partial [31] [32] | Yes [33] |
vkernel | DragonFly BSD | BSD Licence | 2006 [34] | Yes [35] | Yes [35] | — | ? | Yes [36] | Yes [36] | Yes [37] | ? | ? | Yes |
sysjail | OpenBSD, NetBSD | BSD License | 2006–2009 | Yes | No | No | No | No | No | Yes | No | No | ? |
WPARs | AIX | Commercial proprietary software | 2007 | Yes | No | Yes | Yes | Yes | Yes | Yes [t] | No | Yes [39] | ? |
iCore Virtual Accounts | Windows XP | Freeware | 2008 | Yes | No | Yes | No | No | No | No | ? | No | ? |
Sandboxie | Windows | GNU GPLv3 | 2004 | Yes | Yes | Partial | No | No | No | Partial | No | No | Yes |
systemd-nspawn | Linux | GNU LGPLv2.1+ | 2010 | Yes | Yes | Yes [40] [41] | Yes [40] [41] | Yes [40] [41] | Yes [40] [41] | Yes | ? | ? | Yes |
Turbo | Windows | Freemium | 2012 | Yes | No | No | No | No | No | Yes | No | No | Yes |
rkt (rocket) | Linux | Apache License 2.0 | 2014 [42] –2018 | Yes | Yes | Yes | Yes | Yes | Yes | Yes | ? | ? | Yes |
Linux containers not listed above include:
chroot is an operation on Unix and Unix-like operating systems that changes the apparent root directory for the current running process and its children. A program that is run in such a modified environment cannot name files outside the designated directory tree. The term "chroot" may refer to the chroot(2) system call or the chroot(8) wrapper program. The modified environment is called a chroot jail.
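The naming rule behind a chroot jail can be illustrated without actually changing the root. Python exposes the real call as os.chroot, which normally requires superuser privileges, so the sketch below only models the name resolution. The helper resolve_in_jail and the directory /srv/jail are hypothetical, the model is purely lexical, and symbolic links are ignored.

```python
import posixpath


def resolve_in_jail(jail_root, path, cwd="/"):
    """Map a path as named inside a chroot jail to the host path it denotes.

    Purely lexical model: symbolic links are ignored and nothing is
    checked against a real file system.
    """
    # Relative names are interpreted from the jail's current directory;
    # normpath collapses ".." components, and ".." at "/" stays at "/".
    inside = posixpath.normpath(posixpath.join("/", cwd, path))
    # normpath may preserve a leading "//"; reduce it to a single slash.
    inside = "/" + inside.lstrip("/")
    # Every name, however it was written, lands below the jail root.
    return posixpath.normpath(jail_root.rstrip("/") + inside)


print(resolve_in_jail("/srv/jail", "/etc/passwd"))
print(resolve_in_jail("/srv/jail", "../../etc/passwd"))
print(resolve_in_jail("/srv/jail", "../../../../bin/sh", cwd="/home/user"))
```

All three calls yield paths under /srv/jail, since climbing with ".." stops at the apparent root. This is why a program in the jail cannot name files outside the designated tree.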
IPFilter is an open-source software package that provides firewall services and network address translation (NAT) for many Unix-like operating systems. The author and software maintainer is Darren Reed. IPFilter supports both IPv4 and IPv6 protocols, and is a stateful firewall.
These tables provide a comparison of operating systems, of computer devices, as listing general and technical information for a number of widely used and currently available PC or handheld operating systems. The article "Usage share of operating systems" provides a broader, and more general, comparison of operating systems that includes servers, mainframes and supercomputers.
DTrace is a comprehensive dynamic tracing framework originally created by Sun Microsystems for troubleshooting kernel and application problems on production systems in real time. Originally developed for Solaris, it has since been released under the free Common Development and Distribution License (CDDL) in OpenSolaris and its descendant illumos, and has been ported to several other Unix-like systems.
Solaris Containers is an implementation of operating system-level virtualization technology for x86 and SPARC systems, first released publicly in February 2004 in build 51 beta of Solaris 10, and subsequently in the first full release of Solaris 10 in 2005. It is present in illumos distributions, such as OpenIndiana, SmartOS, Tribblix and OmniOS, and in the official Oracle Solaris 11 release.
OpenVZ is an operating-system-level virtualization technology for Linux. It allows a physical server to run multiple isolated operating system instances, called containers, virtual private servers (VPSs), or virtual environments (VEs). OpenVZ is similar to Solaris Containers and LXC.
In computing, virtualization is the use of a computer to simulate another computer. The following is a chronological list of virtualization technologies.
Kernel-based Virtual Machine (KVM) is a free and open-source virtualization module in the Linux kernel that allows the kernel to function as a hypervisor. It was merged into the mainline Linux kernel in version 2.6.20, which was released on February 5, 2007. KVM requires a processor with hardware virtualization extensions, such as Intel VT or AMD-V. KVM has also been ported to other operating systems such as FreeBSD and illumos in the form of loadable kernel modules.
chattr is the command in Linux that allows a user to set certain attributes of a file. lsattr is the command that displays the attributes of a file.
In computing, virtualization (v12n) is a series of technologies that allows dividing of physical computing resources into a series of virtual machines, operating systems, processes or containers.
libvirt is an open-source API, daemon and management tool for managing platform virtualization. It can be used to manage KVM, Xen, VMware ESXi, QEMU and other virtualization technologies. These APIs are widely used in the orchestration layer of hypervisors in the development of a cloud-based solution.
Illumos is a partly free and open-source Unix operating system. It has been developed since 2010 and is based on OpenSolaris, following Oracle's discontinuation of that product. It comprises a kernel, device drivers, system libraries, and utility software for system administration. The core now serves as the base for many different open-source Illumos distributions, much as the Linux kernel is used in different Linux distributions.
Linux Containers (LXC) is an operating system-level virtualization method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel.
SmartOS is a free and open-source SVR4 hypervisor based on the UNIX operating system that combines OpenSolaris technology with bhyve and KVM virtualization. Its core kernel contributes to the illumos project. It features several technologies: Crossbow, DTrace, bhyve, KVM, ZFS, and Zones. Unlike other illumos distributions, SmartOS employs NetBSD pkgsrc package management. SmartOS is designed to be particularly suitable for building clouds and generating appliances. It was originally developed for and by Joyent, which announced in April 2022 that it had sold its business of supporting and developing Triton Datacenter and SmartOS to MNX Solutions.
Docker is a set of platform as a service (PaaS) products that use OS-level virtualization to deliver software in packages called containers. The service has both free and premium tiers. The software that hosts the containers is called Docker Engine. It was first released in 2013 and is developed by Docker, Inc.
OpenZFS is an open-source implementation of the ZFS file system and volume manager initially developed by Sun Microsystems for the Solaris operating system, and is now maintained by the OpenZFS Project. Similar to the original ZFS, the implementation supports features like data compression, data deduplication, copy-on-write clones, snapshots, RAID-Z, and virtual devices that can create filesystems that span multiple disks.
A system virtual machine is a virtual machine (VM) that provides a complete system platform and supports the execution of a complete operating system (OS). These usually emulate an existing architecture, and are built either to provide a platform for running programs where the real hardware is not available, or to run multiple virtual machine instances that make more efficient use of computing resources in terms of both energy consumption and cost, or for both purposes. A VM was originally defined by Popek and Goldberg as "an efficient, isolated duplicate of a real machine".
Virtuozzo is a software company that develops virtualization and cloud management software for cloud computing providers, managed services providers and internet hosting service providers. The company's software enables service providers to offer Infrastructure as a service, Container-as-a-Service, Platform as a service, Kubernetes-as-a-Service, WordPress-as-a-Service and other solutions.
A virtual kernel architecture (vkernel) is an operating system virtualisation paradigm where kernel code can be compiled to run in the user space, for example, to ease debugging of various kernel-level components, in addition to general-purpose virtualisation and compartmentalisation of system resources. It is used by DragonFly BSD in its vkernel implementation since DragonFly 1.7, having been first revealed in September 2006, and first released in the stable branch with DragonFly 1.8 in January 2007.
There are many other OS-level virtualization systems such as: Linux OpenVZ, Linux-VServer, FreeBSD Jails, AIX Workload Partitions (WPARs), HP-UX Containers (SRP), Solaris Containers, among others.
LXC now has support for user namespaces. [...] LXC is no longer running as root so even if an attacker manages to escape the container, he'd find himself having the privileges of a regular user on the host.
Jails were first introduced in FreeBSD 4.0 in 2000
treats the disk image as copy-on-write.