Seccomp

Last updated
seccomp
Original author(s) Andrea Arcangeli
Initial releaseMarch 8, 2005;18 years ago (2005-03-08)
Written in C
Operating system Linux
Type Sandboxing
License GNU General Public License
Website code.google.com/archive/p/seccompsandbox/wikis/overview.wiki

seccomp (short for secure computing mode[ citation needed ]) is a computer security facility in the Linux kernel. seccomp allows a process to make a one-way transition into a "secure" state where it cannot make any system calls except exit(), sigreturn(), read() and write() to already-open file descriptors. Should it attempt any other system calls, the kernel will either just log the event or terminate the process with SIGKILL or SIGSYS. [1] [2] In this sense, it does not virtualize the system's resources but isolates the process from them entirely.

Contents

seccomp mode is enabled via the prctl(2) system call using the PR_SET_SECCOMP argument, or (since Linux kernel 3.17 [3] ) via the seccomp(2) system call. [4] seccomp mode used to be enabled by writing to a file, /proc/self/seccomp, but this method was removed in favor of prctl(). [5] In some kernel versions, seccomp disables the RDTSC x86 instruction, which returns the number of elapsed processor cycles since power-on, used for high-precision timing. [6]

seccomp-bpf is an extension to seccomp [7] that allows filtering of system calls using a configurable policy implemented using Berkeley Packet Filter rules. It is used by OpenSSH [8] and vsftpd as well as the Google Chrome/Chromium web browsers on ChromeOS and Linux. [9] (In this regard seccomp-bpf achieves similar functionality, but with more flexibility and higher performance, to the older systrace—which seems to be no longer supported for Linux.)

Some consider seccomp comparable to OpenBSD pledge(2) and FreeBSD capsicum(4)[ citation needed ].

History

seccomp was first devised by Andrea Arcangeli in January 2005 for use in public grid computing and was originally intended as a means of safely running untrusted compute-bound programs. It was merged into the Linux kernel mainline in kernel version 2.6.12, which was released on March 8, 2005. [10]

Software using seccomp or seccomp-bpf

Related Research Articles

<span class="mw-page-title-main">DTrace</span> Dynamic tracing framework for kernel and applications

DTrace is a comprehensive dynamic tracing framework originally created by Sun Microsystems for troubleshooting kernel and application problems on production systems in real time. Originally developed for Solaris, it has since been released under the free Common Development and Distribution License (CDDL) in OpenSolaris and its descendant illumos, and has been ported to several other Unix-like systems.

In computer security, a sandbox is a security mechanism for separating running programs, usually in an effort to mitigate system failures and/or software vulnerabilities from spreading. The isolation metaphor is taken from the idea of children who do not play well together, so each is given his or her own sandbox to play in alone. It is often used to execute untested or untrusted programs or code, possibly from unverified or untrusted third parties, suppliers, users or websites, without risking harm to the host machine or operating system. A sandbox typically provides a tightly controlled set of resources for guest programs to run in, such as storage and memory scratch space. Network access, the ability to inspect the host system, or read from input devices are usually disallowed or heavily restricted.

<span class="mw-page-title-main">QEMU</span> Free virtualization and emulation software

QEMU is a free and open-source emulator. It emulates a computer's processor through dynamic binary translation and provides a set of different hardware and device models for the machine, enabling it to run a variety of guest operating systems. It can interoperate with Kernel-based Virtual Machine (KVM) to run virtual machines at near-native speed. QEMU can also do emulation for user-level processes, allowing applications compiled for one architecture to run on another.

<span class="mw-page-title-main">Kernel-based Virtual Machine</span> Virtualization module in the Linux kernel

Kernel-based Virtual Machine (KVM) is a free and open-source virtualization module in the Linux kernel that allows the kernel to function as a hypervisor. It was merged into the mainline Linux kernel in version 2.6.20, which was released on February 5, 2007. KVM requires a processor with hardware virtualization extensions, such as Intel VT or AMD-V. KVM has also been ported to other operating systems such as FreeBSD and illumos in the form of loadable kernel modules.

The Berkeley Packet Filter is a network tap and packet filter which permits computer network packets to be captured and filtered at the operating system level. It provides a raw interface to data link layers, permitting raw link-layer packets to be sent and received, and allows a userspace process to supply a filter program that specifies which packets it wants to receive. For example, a tcpdump process may want to receive only packets that initiate a TCP connection. BPF returns only packets that pass the filter that the process supplies. This avoids copying unwanted packets from the operating system kernel to the process, greatly improving performance. The filter program is in the form of instructions for a virtual machine, which are interpreted, or compiled into machine code by a just-in-time (JIT) mechanism and executed, in the kernel.

<span class="mw-page-title-main">Google Chrome</span> Web browser developed by Google

Google Chrome is a web browser developed by Google. It was first released in 2008 for Microsoft Windows, built with free software components from Apple WebKit and Mozilla Firefox. Versions were later released for Linux, macOS, iOS, and also for Android, where it is the default browser. The browser is also the main component of ChromeOS, where it serves as the platform for web applications.

Google Native Client (NaCl) is a discontinued sandboxing technology for running either a subset of Intel x86, ARM, or MIPS native code, or a portable executable, in a sandbox. It allows safely running native code from a web browser, independent of the user operating system, allowing web apps to run at near-native speeds, which aligns with Google's plans for ChromeOS. It may also be used for securing browser plugins, and parts of other applications or full applications such as ZeroVM.

<span class="mw-page-title-main">Linux kernel</span> Operating system kernel

The Linux kernel is a free and open-source, monolithic, modular, multitasking, Unix-like operating system kernel. It was originally written in 1991 by Linus Torvalds for his i386-based PC, and it was soon adopted as the kernel for the GNU operating system, which was written to be a free (libre) replacement for Unix.

Pwn2Own is a computer hacking contest held annually at the CanSecWest security conference. First held in April 2007 in Vancouver, the contest is now held twice a year, most recently in March 2023. Contestants are challenged to exploit widely used software and mobile devices with previously unknown vulnerabilities. Winners of the contest receive the device that they exploited and a cash prize. The Pwn2Own contest serves to demonstrate the vulnerability of devices and software in widespread use while also providing a checkpoint on the progress made in security since the previous year.

<span class="mw-page-title-main">LXC</span> Operating system-level virtualization for Linux

Linux Containers (LXC) is an operating-system-level virtualization method for running multiple isolated Linux systems (containers) on a control host using a single Linux kernel.

<span class="mw-page-title-main">Peppermint OS</span> Linux computer operating system

Peppermint OS is a Linux distribution based on Debian and Devuan Stable, and formerly based on Ubuntu. It uses the Xfce desktop environment. It aims to provide a familiar environment for newcomers to Linux, which requires relatively low hardware resources to run.

Second Level Address Translation (SLAT), also known as nested paging, is a hardware-assisted virtualization technology which makes it possible to avoid the overhead associated with software-managed shadow page tables.

<span class="mw-page-title-main">Android-x86</span> Unofficial port of the Android mobile operating system

Android-x86 is an open source project that makes an unofficial porting of the Android mobile operating system developed by the Open Handset Alliance to run on devices powered by x86 processors, rather than RISC-based ARM chips.

perf is a performance analyzing tool in Linux, available from Linux kernel version 2.6.31 in 2009. Userspace controlling utility, named perf, is accessed from the command line and provides a number of subcommands; it is capable of statistical profiling of the entire system.

Namespaces are a feature of the Linux kernel that partition kernel resources such that one set of processes sees one set of resources while another set of processes sees a different set of resources. The feature works by having the same namespace for a set of resources and processes, but those namespaces refer to distinct resources. Resources may exist in multiple spaces. Examples of such resources are process IDs, host-names, user IDs, file names, some names associated with network access, and Inter-process communication.

<span class="mw-page-title-main">Snap (software)</span> Software deployment system for Linux by Canonical

Snap is a software packaging and deployment system developed by Canonical for operating systems that use the Linux kernel and the systemd init system. The packages, called snaps, and the tool for using them, snapd, work across a range of Linux distributions and allow upstream software developers to distribute their applications directly to users. Snaps are self-contained applications running in a sandbox with mediated access to the host system. Snap was originally released for cloud applications but was later ported to also work for Internet of Things devices and desktop applications.

<span class="mw-page-title-main">Flatpak</span> Linux software deployment utility

Flatpak, formerly known as xdg-app, is a utility for software deployment and package management for Linux. It is advertised as offering a sandbox environment in which users can run application software in isolation from the rest of the system.

<span class="mw-page-title-main">Kernel page-table isolation</span>

Kernel page-table isolation is a Linux kernel feature that mitigates the Meltdown security vulnerability and improves kernel hardening against attempts to bypass kernel address space layout randomization (KASLR). It works by better isolating user space and kernel space memory. KPTI was merged into Linux kernel version 4.15, and backported to Linux kernels 4.14.11, 4.9.75, and 4.4.110. Windows and macOS released similar updates. KPTI does not address the related Spectre vulnerability.

io_uring is a Linux kernel system call interface for storage device asynchronous I/O operations addressing performance issues with similar interfaces provided by functions like read /write or aio_read /aio_write etc. for operations on data accessed by file descriptors.

eBPF Safe dynamic programs and tools

eBPF is a technology that can run programs in a privileged context such as the operating system kernel.

References

  1. Corbet, Jonathan (2015-09-02). "A seccomp overview". lwn . Retrieved 2017-10-05.
  2. "Documentation/prctl/seccomp_filter.txt" . Retrieved 2017-10-05.
  3. "Linux kernel 3.17, Section 11. Security". kernelnewbies.org. 2013-10-05. Retrieved 2015-03-31.
  4. "seccomp: add "seccomp" syscall". kernel/git/torvalds/linux.git - Linux kernel source tree. kernel.org. 2014-06-25. Retrieved 2014-08-22.
  5. Arcangeli, Andrea (2007-06-14). "[PATCH 1 of 2] move seccomp from /proc to a prctl" . Retrieved 2013-08-02.
  6. Tinnes, Julien (2009-05-28). "Time-stamp counter disabling oddities in the Linux kernel". cr0 blog. Retrieved 2013-08-02.
  7. Corbet, Jonathan (2012-01-11). "Yet another new approach to seccomp". lwn . Retrieved 2013-08-02.
  8. 1 2 "Openssh 6.0 release notes" . Retrieved 2013-10-14.
  9. Tinnes, Julien (2012-11-19). "A safer playground for your Linux and Chrome OS renderers". The Chromium Blog. Retrieved 2013-08-02.
  10. "[PATCH] seccomp: secure computing support". Linux kernel history. Kernel.org git repositories. 2005-03-08. Archived from the original on 2013-04-15. Retrieved 2013-08-02.
  11. "Seccomp filter in Android O". Android Developers Blog.
  12. "systemd.exec — Execution environment configuration". freedesktop.org. Retrieved 2017-10-14.
  13. Otubo, Eduardo (2017-09-15). "QEMU Sandboxing new model pull request". qemu-devel mailing list archive.
  14. van de Ven, Arjan (2009-02-28). "Re: [stable][PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole". Linux Kernel Mailing List. Retrieved 2013-08-02.
  15. Torvalds, Linus (2009-02-28). "Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole". Linux Kernel Mailing List. Retrieved 2013-08-02.
  16. Gutschke, Markus (2009-05-06). "Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole" . Retrieved 2013-08-02.
  17. Gutschke, Markus (2009-05-06). "Re: [PATCH 2/2] x86-64: seccomp: fix 32/64 syscall hole". Linux Kernel Mailing List. Retrieved 2013-08-02.
  18. "Firejail". Firejail. Retrieved 2016-11-26.
  19. Evans, Chris (2012-07-04). "Chrome 20 on Linux and Flash sandboxing" . Retrieved 2013-08-02.
  20. Tinnes, Julien (2012-09-06). "Introducing Chrome's next-generation Linux sandbox". cr0 blog. Retrieved 2013-08-02.
  21. "Snap security policy". Archived from the original on 2017-02-04. Retrieved 2017-02-03.
  22. Evans, Chris (2012-04-09). "vsftpd-3.0.0 and seccomp filter sandboxing is here!" . Retrieved 2013-08-02.
  23. "MBOX" . Retrieved 2014-05-20.
  24. "LXD an "hypervisor" for containers (based on liblxc)". 4 November 2014. Retrieved 2014-11-08.
  25. "Where We're Going With LXD" . Retrieved 2014-11-08.
  26. Destuynder, Guillaume (2012-09-13). "Firefox Seccomp sandbox". Mozilla Bugzilla. Retrieved 2015-01-13.
  27. Destuynder, Guillaume (2012-09-13). "Firefox Seccomp sandbox". Mozilla Wiki. Retrieved 2015-01-13.
  28. "Tor ChangeLog".
  29. "Lepton image compression: saving 22% losslessly from images at 15MB/s". Dropbox Tech Blog. Retrieved 2016-07-15.
  30. "Kafel: A language and library for specifying syscall filtering policies".
  31. "Subgraph OS". Subgraph. Retrieved 2016-12-18.
  32. "LoganCIJ16: Future of OS". YouTube. Archived from the original on 2021-12-21. Retrieved 2016-12-18.
  33. "The flatpak security model – part 1: The basics" . Retrieved 2017-01-21.
  34. "bubblewrap" . Retrieved 2018-04-14.
  35. "Chromium OS Sandboxing - the Chromium Projects".
  36. "Minijail [LWN.net]". lwn.net. Retrieved 2017-04-11.
  37. "core/trace/use_seccomp". dev.exherbo.org. Retrieved 2021-05-31.
  38. "File application Sandboxing". GitHub .
  39. "Zathura seccomp implementation".
  40. "Gnome tracker seccomp implementation".