Vkernel

vkernel
Original author(s): Matthew Dillon
Developer(s): DragonFly BSD
Initial release: DragonFly BSD 1.8 (30 January 2007) [1] [2]
Repository: sys/vkernel.h, dev/virtual/vkernel/, vm/vm_vmspace.c, …
Written in: C
Operating system: DragonFly BSD
Type: OS-level virtualisation, virtualised userspace kernel
Licence: BSD Licence
Website: vkernel(7)

A virtual kernel architecture (vkernel) is an operating system virtualisation paradigm in which kernel code can be compiled to run in user space, for example to ease debugging of various kernel-level components, [3] [4] [5] in addition to general-purpose virtualisation and compartmentalisation of system resources. DragonFly BSD has used it in its vkernel implementation since DragonFly 1.7, [2] having first revealed it in September 2006 [3] [6] and first released it in the stable branch with DragonFly 1.8 in January 2007. [1] [7] [8] [9] The long-term goal, in addition to easing kernel development, is to make it easier to support internet-connected computer clusters without compromising local security. [3] [4] Similar concepts exist in other operating systems: in Linux, a similar virtualisation concept is known as user-mode Linux, [10] [7] and in NetBSD it has been the initial focus of the rump kernel infrastructure since the summer of 2007. [11]

The virtual kernel concept is nearly the exact opposite of the unikernel concept. With a vkernel, kernel components run in userspace to ease kernel development and debugging, supported by a regular operating system kernel; with a unikernel, userspace-level components run directly in kernel space for extra performance, supported by bare-metal hardware or a hardware virtualisation stack. Both vkernels and unikernels can nevertheless be used for similar tasks, for example to confine software to a virtualised environment with low overhead. In fact, NetBSD's rump kernel, which originally focused on running kernel components in userspace, has since moved into the unikernel space as well, adopting the anykernel moniker to reflect its support for both paradigms.

The vkernel concept differs from a FreeBSD jail in that a jail is only meant for resource isolation and cannot be used to develop and test new kernel functionality in userland, because all jails share the same kernel. [7] (DragonFly nevertheless still supports FreeBSD jails as well. [7])

In DragonFly, the vkernel can be thought of as a first-class computer architecture, comparable to i386 or amd64, and, according to Matthew Dillon circa 2007, can be used as a starting point for porting DragonFly BSD to new architectures. [12]

DragonFly's vkernel is supported by the host kernel through new system calls that help manage virtual memory address spaces (vmspace), namely vmspace_create() et al., [3] [9] [13] as well as extensions to several existing system calls, such as mmap's madvise, which is extended as mcontrol. [9] [14] [15]
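
To give a rough feel for the kind of existing interface that the host kernel extends, the following minimal sketch maps a region of memory from user space and passes advisory hints about it to the kernel. It is an illustration only, not DragonFly's implementation: it uses Python's standard mmap module (Python 3.8 or later on a Unix-like host is assumed) and only the widely available madvise call. The DragonFly-specific mcontrol and vmspace_create() calls are not used here, and the helper name advise_region is hypothetical.

```python
import mmap

PAGE = mmap.PAGESIZE  # page size of the host, in bytes


def advise_region(npages=4):
    """Map anonymous memory, give the kernel usage hints, return (size, first bytes).

    advise_region is a hypothetical helper written for this illustration.
    """
    region = mmap.mmap(-1, npages * PAGE)  # anonymous mapping, not backed by a file
    try:
        region[:5] = b"hello"  # touch the first page so it is actually populated
        # madvise() and the MADV_* constants exist only where the platform
        # provides the underlying system call, so check before using them.
        if hasattr(region, "madvise") and hasattr(mmap, "MADV_SEQUENTIAL"):
            region.madvise(mmap.MADV_SEQUENTIAL)       # hint for the whole mapping
            region.madvise(mmap.MADV_NORMAL, 0, PAGE)  # reset the hint for page 0 only
        return len(region), bytes(region[:5])
    finally:
        region.close()


size, head = advise_region()
```

The hints are purely advisory: the contents of the mapping are unchanged by the calls, which is why the sketch can read back the same bytes it wrote.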

References

  1. "Release Notes for DragonFly 1.8.0". DragonFly BSD. 2007-01-30. Retrieved 2019-04-08.
  2. Sascha Wildner (2007-01-08). "vkernel, vcd, vkd, vke - virtual kernel architecture". DragonFly Miscellaneous Information Manual. DragonFly BSD.
  3. Matthew Dillon (2006-09-02). "Cache coherency, clustering, and Kernel virtualization". kernel@ (Mailing list). DragonFly BSD.
  4. Jeremy C. Reed, ed. (2007-02-10). "Answers from Matt Dillon about DragonFly's virtual kernel". BSD Newsletter.com. Reed Media.net. Archived from the original on 2007-02-24.
  5. Daniel Lorch (2009-06-10). "Porting the HAMMER File System to Linux" (PDF). "UML runs the Linux kernel as a userspace process, analogous to DragonFly BSD's virtual kernels … allow for a quick recompile-restart cycle and can be debugged easily with standard gdb."
  6. Matthew Dillon (2006). "sys/vkernel.h". BSD Cross Reference. DragonFly BSD.
  7. David Chisnall (2007-06-15). "DragonFly BSD: UNIX for Clusters?". InformIT. Prentice Hall Professional. Retrieved 2019-03-06.
  8. Aggelos Economopoulos (2010-03-29). "A peek at the DragonFly Virtual Kernel (part 1)". LWN.net. Retrieved 2019-04-08.
  9. Aggelos Economopoulos (2010-04-16). "A peek at the DragonFly Virtual Kernel (part 2)". LWN.net. Retrieved 2019-04-08.
  10. Koen Vervloesem (2010-04-21). "DragonFly BSD 2.6: towards a free clustering operating system". LWN.net. Retrieved 2019-03-07.
  11. Antti Kantee (2007-08-05). "CVS commit: src/sys/rump". source-changes@ (Mailing list). NetBSD. "Introduce RUMPs - Runnable Userspace Meta-Programs".
  12. Jeremy Andrews (2007-08-06). "Interview: Matthew Dillon". KernelTrap. Archived from the original on 2012-02-07. Retrieved 2019-03-03.
  13. Matthew Dillon (2006). "vm/vm_vmspace.c § sys_vmspace_create()". BSD Cross Reference. DragonFly BSD.
  14. "madvise, posix_madvise, mcontrol - give advice about use of memory". DragonFly System Calls Manual. DragonFly BSD.
  15. "kern/syscalls.master". BSD Cross Reference. DragonFly BSD.