Sysjail

sysjail is a defunct user-land virtualiser for systems supporting the systrace library; as of version 1.0 it was limited to OpenBSD, NetBSD and MirOS. Its original design was inspired by FreeBSD jail, a similar facility for FreeBSD that is implemented in the kernel rather than in user land. sysjail was developed and released in 2006 by Kristaps Dzonsons (aka Johnson), a research assistant in game theory at the Stockholm School of Economics, and Maikls Deksters. [1]

sysjail was rewritten from scratch in 2007 to support emulated processes in jails, initially limited to Linux emulation.

The project was officially discontinued on 3 March 2009 due to flaws inherent to syscall wrapper-based security architectures. The restrictions of sysjail could be evaded by exploiting race conditions between the wrapper's security checks and the kernel's execution of the syscalls. [2]
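The race described above can be illustrated with a small, deterministic model. This is only a sketch and not sysjail's code: the names SharedArgs, policy_allows and wrapped_open, the "/jail/" prefix policy and the file paths are all hypothetical, and the concurrent attacker thread is replaced by a callback that runs in the window between the check and the use.

```python
class SharedArgs:
    """Stand-in for syscall arguments kept in memory the caller can still modify."""

    def __init__(self, path):
        self.path = path


def policy_allows(path):
    # Hypothetical toy policy: only paths under /jail/ are permitted.
    return path.startswith("/jail/")


def wrapped_open(args, between_check_and_use=None):
    # Step 1: the user-land wrapper inspects the argument.
    if not policy_allows(args.path):
        return "denied"
    # Window: in a real attack another thread of the jailed process runs here.
    if between_check_and_use is not None:
        between_check_and_use(args)
    # Step 2: the "kernel" reads the argument again and acts on what it finds.
    return f"opened {args.path}"


def attacker(args):
    # Swap the argument after it has passed the check.
    args.path = "/etc/master.passwd"


honest = wrapped_open(SharedArgs("/jail/data.txt"))
blocked = wrapped_open(SharedArgs("/etc/master.passwd"))
raced = wrapped_open(SharedArgs("/jail/data.txt"), between_check_and_use=attacker)

print(honest)
print(blocked)
print(raced)
```

The direct request for the forbidden path is denied, but the raced call passes the check with a harmless path and is then executed with the forbidden one. A check performed inside the kernel on the same copy of the arguments that is later used does not have this window.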

Related Research Articles

Mach is a kernel developed at Carnegie Mellon University by Richard Rashid and Avie Tevanian to support operating system research, primarily distributed and parallel computing. Mach is often mentioned as one of the earliest examples of a microkernel. However, not all versions of Mach are microkernels. Mach's derivatives are the basis of the operating system kernel in GNU Hurd and of Apple's XNU kernel used in macOS, iOS, iPadOS, tvOS, and watchOS.

System call: Mechanism used by an application program to request service from the kernel of the operating system

In computing, a system call is the programmatic way in which a computer program requests a service from the kernel of the operating system on which it is executed. This may include hardware-related services, creation and execution of new processes, and communication with integral kernel services such as process scheduling. System calls provide an essential interface between a process and the operating system.
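As a small illustration of a program requesting kernel services, the following sketch uses Python's standard os module, whose pipe, write, read, close and getpid functions are thin wrappers around the corresponding operating-system calls. The message text and variable names are arbitrary choices for the example.

```python
import os

# Ask the kernel for a pipe: two file descriptors connected to each other.
read_fd, write_fd = os.pipe()

# write() and read() hand the data to the kernel and fetch it back.
written = os.write(write_fd, b"hello kernel")
os.close(write_fd)
data = os.read(read_fd, 64)
os.close(read_fd)

# getpid() asks the kernel for this process's identifier.
pid = os.getpid()

print(written, data, pid > 0)
```

Each call crosses from the program into the kernel and back; the program never touches the pipe's buffer directly.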

Light Weight Kernel Threads (LWKT) is a computer science term used in DragonFlyBSD in particular. LWKTs differ from normal kernel threads in that they can preempt normal kernel threads. According to Matt Dillon, DragonFlyBSD creator:

The LWKT scheduler is responsible for actually running a thread. It uses a fixed priority scheme, but the fixed priorities are differentiating major subsystems, not user processes. For example, hardware interrupt threads have the highest priority, followed by software interrupts, kernel-only threads, then finally user threads. A user thread either runs at user-kernel priority, or a user thread runs at user priority.

DragonFly does preempt, it just does it very carefully and only under particular circumstances. An LWKT interrupt thread can preempt most other threads, for example. This mimics what FreeBSD-4.x already did with its spl/run-interrupt-in-context-of-current-process mechanism. What DragonFly does *NOT* do is allow a non-interrupt kernel thread to preempt another non-interrupt kernel thread.

The z/OS mainframe operating system supports a similar mechanism, called the Service Request Block (SRB).

SRBs represent requests to execute a system service routine. SRBs are typically created when one address space detects an event that affects a different address space; they provide one of several mechanisms for asynchronous inter-address space communication for programs running on z/OS.

An SRB is similar to a Process Control Block (PCB), in that it identifies a unit of work to the system. Unlike a PCB, an SRB cannot "own" storage areas. In a multiprocessor environment, the SRB routine, after being scheduled, can be dispatched on another processor and can run concurrently with the scheduling program. The scheduling program can continue to do other processing in parallel with the SRB routine. Only programs running in kernel mode can create an SRB.

The Windows operating system provides a similar lightweight thread mechanism named "fibers". Fibers are scheduled by an application program. The port of the CICS Transaction Server to the Windows platform uses fibers, somewhat analogous to the use of "enclaves" under z/OS.

In UNIX, a "kernel thread" consists of two threads: a core thread and a user thread.

A chroot on Unix operating systems is an operation that changes the apparent root directory for the current running process and its children. A program that is run in such a modified environment cannot name files outside the designated directory tree. The term "chroot" may refer to the chroot(2) system call or the chroot(8) wrapper program. The modified environment is called a chroot jail.
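The effect of a changed apparent root on path names can be modelled in a few lines. This is a simulation only, under stated assumptions: the function resolve_in_jail and the directory /srv/jail are hypothetical, symbolic links are ignored, and a real chroot is enforced by the kernel (Python exposes the actual call as os.chroot on Unix, which normally requires elevated privileges).

```python
import posixpath


def resolve_in_jail(jail_root, requested):
    """Map a path as named by a jailed process to the corresponding host path."""
    # Interpret the request relative to the apparent root. normpath collapses
    # ".." components, and ".." at "/" stays at "/".
    inside = posixpath.normpath(posixpath.join("/", requested))
    # Graft the result onto the real directory that serves as the jail's root.
    return posixpath.normpath(jail_root + inside)


plain = resolve_in_jail("/srv/jail", "/etc/passwd")
escape_attempt = resolve_in_jail("/srv/jail", "../../etc/passwd")
root_itself = resolve_in_jail("/srv/jail", "/")

print(plain)
print(escape_attempt)
print(root_itself)
```

Both the ordinary request and the attempt to climb out with ".." end up under /srv/jail, which is the sense in which a jailed program cannot name files outside the designated tree.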

Systrace is a computer security utility which limits an application's access to the system by enforcing access policies for system calls. This can mitigate the effects of buffer overflows and other security vulnerabilities. It was developed by Niels Provos and runs on various Unix-like operating systems.

These tables provide a comparison of operating systems for computer devices, listing general and technical information for a number of widely used and currently available PC or handheld operating systems. The article "Usage share of operating systems" provides a broader and more general comparison of operating systems that includes servers, mainframes and supercomputers.

DTrace: Dynamic tracing framework

DTrace is a comprehensive dynamic tracing framework originally created by Sun Microsystems for troubleshooting kernel and application problems on production systems in real time. Originally developed for Solaris, it has since been released under the free Common Development and Distribution License (CDDL) in OpenSolaris and its descendant illumos, and has been ported to several other Unix-like systems.

The jail mechanism is an implementation of FreeBSD's OS-level virtualisation that allows system administrators to partition a FreeBSD-derived computer system into several independent mini-systems called jails, all sharing the same kernel, with very little overhead. It is implemented through a system call, jail(2), as well as a userland utility, jail(8), plus, depending on the system, a number of other utilities. The functionality was committed into FreeBSD in 1999 by Poul-Henning Kamp after some period of production use by a hosting provider, and was first released with FreeBSD 4.0, thus being supported on a number of FreeBSD descendants, including DragonFly BSD, to this day.

seccomp is a computer security facility in the Linux kernel. seccomp allows a process to make a one-way transition into a "secure" state where it cannot make any system calls except exit, sigreturn, read and write to already-open file descriptors. Should it attempt any other system calls, the kernel will terminate the process with SIGKILL or SIGSYS. In this sense, it does not virtualize the system's resources but isolates the process from them entirely.
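The one-way transition and the fixed allow-list can be modelled as follows. This is a toy simulation, not the kernel interface: SimulatedProcess, ProcessKilled and the string syscall names are hypothetical, and raising an exception stands in for the kernel terminating the process.

```python
# The four calls the text lists as still permitted in the "secure" state.
STRICT_ALLOWED = frozenset({"exit", "sigreturn", "read", "write"})


class ProcessKilled(Exception):
    """Stand-in for the kernel terminating the offending process."""


class SimulatedProcess:
    def __init__(self):
        self.strict = False

    def enter_strict_mode(self):
        # One-way transition: the model offers no way to switch it off again.
        self.strict = True

    def syscall(self, name):
        if self.strict and name not in STRICT_ALLOWED:
            raise ProcessKilled(f"{name} is not permitted in strict mode")
        return f"{name}: ok"


proc = SimulatedProcess()
before = proc.syscall("open")      # allowed before the transition
proc.enter_strict_mode()
after_read = proc.syscall("read")  # still allowed afterwards

try:
    proc.syscall("open")
    killed = False
except ProcessKilled:
    killed = True

print(before, after_read, killed)
```

A process would typically open every file descriptor it needs before the transition, since only reads and writes on already-open descriptors remain available afterwards.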

sysctl: Unix-like software that manages kernel attributes

sysctl is a software utility of some Unix-like operating systems that reads and modifies the attributes of the system kernel such as its version number, maximum limits, and security settings. It is available both as a system call for compiled programs, and an administrator command for interactive use and scripting. Linux additionally exposes sysctl as a virtual file system.

OS-level virtualization is an operating system paradigm in which the kernel allows the existence of multiple isolated user space instances. Such instances, called containers, Zones, virtual private servers (OpenVZ), partitions, virtual environments (VEs), virtual kernels, or jails, may look like real computers from the point of view of programs running in them. A computer program running on an ordinary operating system can see all resources of that computer. However, programs running inside of a container can only see the container's contents and devices assigned to the container.

In computing, ioctl is a system call for device-specific input/output operations and other operations which cannot be expressed by regular system calls. It takes a parameter specifying a request code; the effect of a call depends completely on the request code. Request codes are often device-specific. For instance, a CD-ROM device driver which can instruct a physical device to eject a disc would provide an ioctl request code to do so. Device-independent request codes are sometimes used to give userspace access to kernel functions which are only used by core system software or still under development.

The OpenBSD operating system focuses on security and the development of security features. According to author Michael W. Lucas, OpenBSD "is widely regarded as the most secure operating system available anywhere, under any licensing terms."

Linux Security Modules (LSM) is a framework allowing the Linux kernel to support, without bias, a variety of computer security models. LSM is licensed under the terms of the GNU General Public License and has been a standard part of the Linux kernel since Linux 2.6. AppArmor, SELinux, Smack, and TOMOYO Linux are the currently approved security modules in the official kernel.

A proprietary device driver is a closed-source device driver published only in binary code. In the context of free and open-source software, a closed-source device driver is referred to as a blob or binary blob. The term usually refers to a closed-source kernel module loaded into the kernel of an open-source operating system, and is sometimes also applied to code running outside the kernel, such as system firmware images, microcode updates, or userland programs. The term blob was first used in database management systems to describe a collection of binary data stored as a single entity.

FreeBSD: Free Unix-like operating system

FreeBSD is a free and open-source Unix-like operating system descended from the Berkeley Software Distribution (BSD), which was based on Research Unix. The first version of FreeBSD was released in 1993. In 2005, FreeBSD was the most popular open-source BSD operating system, accounting for more than three-quarters of all installed, permissively licensed BSD systems.

ptrace is a system call found in Unix and several Unix-like operating systems. By using ptrace one process can control another, enabling the controller to inspect and manipulate the internal state of its target. ptrace is used by debuggers and other code-analysis tools, mostly as aids to software development.

LXC

Linux Containers (LXC) is an operating-system-level virtualization method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel.

Supervisor Mode Access Prevention (SMAP) is a feature of some CPU implementations such as the Intel Broadwell microarchitecture that allows supervisor mode programs to optionally set user-space memory mappings so that access to those mappings from supervisor mode will cause a trap. This makes it harder for malicious programs to "trick" the kernel into using instructions or data from a user-space program.

A virtual kernel architecture (vkernel) is an operating system virtualisation paradigm where kernel code can be compiled to run in user space, for example to ease debugging of various kernel-level components, in addition to general-purpose virtualisation and compartmentalisation of system resources. It has been used by DragonFly BSD in its vkernel implementation since DragonFly 1.7, having been first revealed in September 2006 and first released in the stable branch with DragonFly 1.8 in January 2007. The long-term goal, in addition to easing kernel development, is to make it easier to support internet-connected computer clusters without compromising local security. Similar concepts exist in other operating systems as well: in Linux, a similar virtualisation concept is known as user-mode Linux, whereas in NetBSD it has been the initial focus of the rump kernel infrastructure since the summer of 2007.

References