Original author(s) | Matthew Dillon |
---|---|
Developer(s) | DragonFly BSD |
Initial release | DragonFly BSD § 1.8 (30 January 2007 ) [1] [2] |
Repository | sys/vkernel.h , dev/virtual/vkernel/ , vm/vm_vmspace.c , … |
Written in | C |
Operating system | DragonFly BSD |
Type | OS-level virtualisation, virtualised userspace kernel |
Licence | BSD Licence |
Website | vkernel(7) |
A virtual kernel architecture (vkernel) is an operating system virtualisation paradigm where kernel code can be compiled to run in the user space, for example, to ease debugging of various kernel-level components, [3] [4] [5] in addition to general-purpose virtualisation and compartmentalisation of system resources. It is used by DragonFly BSD in its vkernel implementation since DragonFly 1.7, [2] having been first revealed in September 2006 , [3] [6] and first released in the stable branch with DragonFly 1.8 in January 2007 . [1] [7] [8] [9]
The long-term goal, in addition to easing kernel development, is to make it easier to support internet-connected computer clusters without compromising local security. [3] [4]
Similar concepts exist in other operating systems as well; in Linux, a similar virtualisation concept is known as user-mode Linux; [10] [7] whereas in NetBSD since the summer of 2007, it has been the initial focus of the rump kernel infrastructure. [11]
The virtual kernel concept is nearly the exact opposite of the unikernel concept — with vkernel, kernel components get to run in userspace to ease kernel development and debugging, supported by a regular operating system kernel; whereas with a unikernel, userspace-level components get to run directly in kernel space for extra performance, supported by baremetal hardware or a hardware virtualisation stack. However, both vkernels and unikernels can be used for similar tasks as well, for example, to self-contain software to a virtualised environment with low overhead. In fact, NetBSD's rump kernel, originally having a focus of running kernel components in userspace, has since shifted into the unikernel space as well (going after the anykernel moniker for supporting both paradigms).
The vkernel concept is different from a FreeBSD jail in that a jail is only meant for resource isolation, and cannot be used to develop and test new kernel functionality in the userland, because each jail is sharing the same kernel. [7] (DragonFly, however, still has FreeBSD jail support as well. [7] )
In DragonFly, the vkernel can be thought of as a first-class computer architecture, comparable to i386 or amd64, and, according to Matthew Dillon circa 2007, can be used as a starting point for porting DragonFly BSD to new architectures. [12]
DragonFly's vkernel is supported by the host kernel through new system calls that help manage virtual memory address space (vmspace) — vmspace_create()
et al., [3] [9] [13] as well as extensions to several existing system calls like mmap
's madvise
— mcontrol
. [9] [14] [15]
Processor affinity, or CPU pinning or "cache affinity", enables the binding and unbinding of a process or a thread to a central processing unit (CPU) or a range of CPUs, so that the process or thread will execute only on the designated CPU or CPUs rather than any CPU. This can be viewed as a modification of the native central queue scheduling algorithm in a symmetric multiprocessing operating system. Each item in the queue has a tag indicating its kin processor. At the time of resource allocation, each task is allocated to its kin processor in preference to others.
DragonFly BSD is a free and open-source Unix-like operating system forked from FreeBSD 4.8. Matthew Dillon, an Amiga developer in the late 1980s and early 1990s and FreeBSD developer between 1994 and 2003, began working on DragonFly BSD in June 2003 and announced it on the FreeBSD mailing lists on 16 July 2003.
Light Weight Kernel Threads (LWKT) is a computer science term and from DragonFly BSD in particular. LWKTs differ from normal kernel threads in that they can preempt normal kernel threads. According to Matt Dillon, DragonFlyBSD creator:
The LWKT scheduler is responsible for actually running a thread. It uses a fixed priority scheme, but the fixed priorities are differentiating major subsystems, not user processes. For example, hardware interrupt threads have the highest priority, followed by software interrupts, kernel-only threads, then finally user threads. A user thread either runs at user-kernel priority, or a user thread runs at user priority.
DragonFly does preempt, it just does it very carefully and only under particular circumstances. An LWKT interrupt thread can preempt most other threads, for example. This mimics what FreeBSD-4.x already did with its spl/run-interrupt-in-context-of-current-process mechanism. What DragonFly does *NOT* do is allow a non-interrupt kernel thread to preempt another non-interrupt kernel thread.
The mainframe z/OS Operating system supports a similar mechanism, called SRB.
SRB's represent requests to execute a system service routine. SRB's are typically created when one address space detects an event that affects a different address space; they provide one of several mechanisms for asynchronous inter-address space communication for programs running on z/OS.
An SRB is similar to a Process Control Block (PCB), in that it identifies a unit of work to the system. Unlike a PCB, an SRB cannot "own" storage areas. In a multiprocessor environment, the SRB routine, after being scheduled, can be dispatched on another processor and can run concurrently with the scheduling program. The scheduling program can continue to do other processing in parallel with the SRB routine. Only programs running in kernel mode can create an SRB.
The Windows Operating System knows a similar light weight thread mechanism named "fibers". Fibers are scheduled by an application program. The port of the CICS Transaction Server to the Windows platform uses fibers, somewhat analogous to the use of "enclaves" under z/OS.
In UNIX, "kernel threads" have two threads, one is the core thread, one is the user thread.
Matthew Dillon is an American software engineer known for Amiga software, contributions to FreeBSD and for starting and leading the DragonFly BSD project since 2003.
sysfs is a pseudo file system provided by the Linux kernel that exports information about various kernel subsystems, hardware devices, and associated device drivers from the kernel's device model to user space through virtual files. In addition to providing information about various devices and kernel subsystems, exported virtual files are also used for their configuration.
The device mapper is a framework provided by the Linux kernel for mapping physical block devices onto higher-level virtual block devices. It forms the foundation of the logical volume manager (LVM), software RAIDs and dm-crypt disk encryption, and offers additional features such as file system snapshots.
sysctl is a software mechanism in some Unix-like operating systems that reads and modifies the attributes of the system kernel such as its version number, maximum limits, and security settings. It is available both as a system call for compiled programs, and an administrator command for interactive use and scripting. Linux additionally exposes sysctl as a virtual file system.
OS-level virtualization is an operating system (OS) virtualization paradigm in which the kernel allows the existence of multiple isolated user space instances, called containers, zones, virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels, or jails. Such instances may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources of that computer. However, programs running inside of a container can only see the container's contents and devices assigned to the container.
These tables compare free software / open-source operating systems. Where not all of the versions support a feature, the first version which supports it is listed.
In the context of free and open-source software, proprietary software only available as a binary executable is referred to as a blob or binary blob. The term usually refers to a device driver module loaded into the kernel of an open-source operating system, and is sometimes also applied to code running outside the kernel, such as system firmware images, microcode updates, or userland programs. The term blob was first used in database management systems to describe a collection of binary data stored as a single entity.
The Berkeley Packet Filter is a network tap and packet filter which permits computer network packets to be captured and filtered at the operating system level. It provides a raw interface to data link layers, permitting raw link-layer packets to be sent and received, and allows a userspace process to supply a filter program that specifies which packets it wants to receive. For example, a tcpdump process may want to receive only packets that initiate a TCP connection. BPF returns only packets that pass the filter that the process supplies. This avoids copying unwanted packets from the operating system kernel to the process, greatly improving performance. The filter program is in the form of instructions for a virtual machine, which are interpreted, or compiled into machine code by a just-in-time (JIT) mechanism and executed, in the kernel.
A kernel debugger is a debugger present in some operating system kernels to ease debugging and kernel development by the kernel developers. A kernel debugger might be a stub implementing low-level operations, with a full-blown debugger such as GNU Debugger (gdb), running on another machine, sending commands to the stub over a serial line or a network connection, or it might provide a command line that can be used directly on the machine being debugged.
In operating systems, a giant lock, also known as a big-lock or kernel-lock, is a lock that may be used in the kernel to provide concurrency control required by symmetric multiprocessing (SMP) systems.
The Linux kernel is a free and open source, UNIX-like kernel that is used in many computer systems worldwide. The kernel was created by Linus Torvalds in 1991 and was soon adopted as the kernel for the GNU operating system (OS) which was created to be a free replacement for Unix. Since the late 1990s, it has been included in many operating system distributions, many of which are called Linux. One such Linux kernel operating system is Android which is used in many mobile and embedded devices.
"Everything is a file" is an idea that Unix, and its derivatives, handle input/output to and from resources such as documents, hard-drives, modems, keyboards, printers and even some inter-process and network communications as simple streams of bytes exposed through the filesystem name space. Exceptions include semaphores, processes and threads.
NetBSD is a free and open-source Unix-like operating system based on the Berkeley Software Distribution (BSD). It was the first open-source BSD descendant officially released after 386BSD was forked. It continues to be actively developed and is available for many platforms, including servers, desktops, handheld devices, and embedded systems.
The NetBSD rump kernel is the first implementation of the "anykernel" concept where drivers either can be compiled into or run in the monolithic kernel or in user space on top of a light-weight kernel. The NetBSD drivers can be used on top of the rump kernel on a wide range of POSIX operating systems, such as the Hurd, Linux, NetBSD, DragonFly BSD, Solaris kernels and even Cygwin, along with the file system utilities built with the rump libraries. The rump kernels can also run without POSIX directly on top of the Xen hypervisor, an L4 microkernel using the Genode OS Framework or even on "OS-less" bare metal.
ftrace is a tracing framework for the Linux kernel. Although its original name, Function Tracer, came from ftrace's ability to record information related to various function calls performed while the kernel is running, ftrace's tracing capabilities cover a much broader range of kernel's internal operations.
kdump is a feature of the Linux kernel that creates crash dumps in the event of a kernel crash. When triggered, kdump exports a memory image that can be analyzed for the purposes of debugging and determining the cause of a crash. The dumped image of main memory, exported as an Executable and Linkable Format (ELF) object, can be accessed either directly through /proc/vmcore during the handling of a kernel crash, or it can be automatically saved to a locally accessible file system, to a raw device, or to a remote system accessible over network.
UML runs the Linux kernel as a userspace process, analogous to DragonFly BSD's virtual kernels … allow for a quick recompile-restart cycle and can be debugged easily with standard gdb.
Introduce RUMPs - Runnable Userspace Meta-Programs