Control-flow integrity (CFI) is a general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution (the control flow) of a program.
A computer program commonly changes its control flow to make decisions and use different parts of the code. Such transfers may be direct, in that the target address is written in the code itself, or indirect, in that the target address itself is a variable in memory or a CPU register. In a typical function call, the program performs a direct call, but returns to the caller function using the stack – an indirect backward-edge transfer. When a function pointer is called, such as from a virtual table, we say there is an indirect forward-edge transfer. [1] [2]
Attackers seek to inject code into a program to make use of its privileges or to extract data from its memory space. Before executable code was commonly made read-only, an attacker could arbitrarily change the code as it is run, targeting direct transfers or even do with no transfers at all. After W^X became widespread, an attacker wants to instead redirect execution to a separate, unprotected area containing the code to be run, making use of indirect transfers: one could overwrite the virtual table for a forward-edge attack or change the call stack for a backward-edge attack (return-oriented programming). CFI is designed to protect indirect transfers from going to unintended locations. [1]
Associated techniques include code-pointer separation (CPS), code-pointer integrity (CPI), stack canaries, shadow stacks, and vtable pointer verification. [3] [4] [5]
Related implementations are available in Clang (LLVM in general), [6] Microsoft's Control Flow Guard [7] [8] [9] and Return Flow Guard, [10] Google's Indirect Function-Call Checks [11] and Reuse Attack Protector (RAP). [12] [13]
This section needs expansion. You can help by adding to it. (September 2020) |
LLVM/Clang provides a "CFI" option that works in the forward edge by checking for errors in virtual tables and type casts. It depends on link-time optimization (LTO) to know what functions are supposed to be called in normal cases. [14] There is a separate "shadow call stack" scheme that defends on the backward edge by checking for call stack modifications, available only for aarch64. [15]
Google has shipped Android with the Linux kernel compiled by Clang with link-time optimization (LTO) and CFI since 2018. [16] SCS is available for Linux kernel as an option, including on Android. [17]
This section needs expansion. You can help by adding to it. (January 2021) |
Intel Control-flow Enforcement Technology (CET) detects compromises to control flow integrity with a shadow stack (SS) and indirect branch tracking (IBT). [18] [19]
The shadow stack stores a copy of the return address of each CALL in a specially-protected shadow stack. On a RET, the processor checks if the return address stored in the normal stack and shadow stack are equal. If the addresses are not equal, the processor generates an INT #21 (Control Flow Protection Fault).
Indirect branch tracking detects indirect JMP or CALL instructions with unauthorized targets. It is implemented by adding a new internal state machine in the processor. The behavior of indirect JMP and CALL instructions is changed so that they switch the state machine from IDLE to WAIT_FOR_ENDBRANCH. In the WAIT_FOR_ENDBRANCH state, the next instruction to be executed is required to be the new ENDBRANCH instruction (ENDBR32 in 32-bit mode or ENDBR64 in 64-bit mode), which changes the internal state machine from WAIT_FOR_ENDBRANCH back to IDLE. Thus every authorized target of an indirect JMP or CALL must begin with ENDBRANCH. If the processor is in a WAIT_FOR_ENDBRANCH state (meaning, the previous instruction was an indirect JMP or CALL), and the next instruction is not an ENDBRANCH instruction, the processor generates an INT #21 (Control Flow Protection Fault). On processors not supporting CET indirect branch tracking, ENDBRANCH instructions are interpreted as NOPs and have no effect.
Control Flow Guard (CFG) was first released for Windows 8.1 Update 3 (KB3000850) in November 2014. Developers can add CFG to their programs by adding the /guard:cf
linker flag before program linking in Visual Studio 2015 or newer. [20]
As of Windows 10 Creators Update (Windows 10 version 1703), the Windows kernel is compiled with CFG. [21] The Windows kernel uses Hyper-V to prevent malicious kernel code from overwriting the CFG bitmap. [22]
CFG operates by creating a per-process bitmap, where a set bit indicates that the address is a valid destination. Before performing each indirect function call, the application checks if the destination address is in the bitmap. If the destination address is not in the bitmap, the program terminates. [20] This makes it more difficult for an attacker to exploit a use-after-free by replacing an object's contents and then using an indirect function call to execute a payload. [23]
For all protected indirect function calls, the _guard_check_icall
function is called, which performs the following steps: [24]
There are several generic techniques for bypassing CFG:
eXtended Flow Guard (XFG) has not been officially released yet, but is available in the Windows Insider preview and was publicly presented at Bluehat Shanghai in 2019. [29]
XFG extends CFG by validating function call signatures to ensure that indirect function calls are only to the subset of functions with the same signature. Function call signature validation is implemented by adding instructions to store the target function's hash in register r10 immediately prior to the indirect call and storing the calculated function hash in the memory immediately preceding the target address's code. When the indirect call is made, the XFG validation function compares the value in r10 to the target function's stored hash. [30] [31]
In programming and information security, a buffer overflow or buffer overrun is an anomaly whereby a program writes data to a buffer beyond the buffer's allocated memory, overwriting adjacent memory locations.
The PDP-8 is a family of 12-bit minicomputers that was produced by Digital Equipment Corporation (DEC). It was the first commercially successful minicomputer, with over 50,000 units being sold over the model's lifetime. Its basic design follows the pioneering LINC but has a smaller instruction set, which is an expanded version of the PDP-5 instruction set. Similar machines from DEC are the PDP-12 which is a modernized version of the PDP-8 and LINC concepts, and the PDP-14 industrial controller system.
In computer science, threaded code is a programming technique where the code has a form that essentially consists entirely of calls to subroutines. It is often used in compilers, which may generate code in that form or be implemented in that form themselves. The code may be processed by an interpreter or it may simply be a sequence of machine code call instructions.
x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.
In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for shared libraries, so that the same library code can be loaded at a location in each program's address space where it does not overlap with other memory in use by, for example, other shared libraries. PIC was also used on older computer systems that lacked an MMU, so that the operating system could keep applications away from each other even within the single address space of an MMU-less system.
In computer security, hardening is usually the process of securing a system by reducing its surface of vulnerability, which is larger when a system performs more functions; in principle a single-function system is more secure than a multipurpose one. Reducing available ways of attack typically includes changing default passwords, the removal of unnecessary software, unnecessary usernames or logins, and the disabling or removal of unnecessary services.
Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.
LLVM is a set of compiler and toolchain technologies that can be used to develop a frontend for any programming language and a backend for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes. The name LLVM originally stood for Low Level Virtual Machine, though the project has expanded and the name is no longer officially an initialism.
In computer science, a tail call is a subroutine call performed as the final action of a procedure. If the target of a tail is the same subroutine, the subroutine is said to be tail recursive, which is a special case of direct recursion. Tail recursion is particularly useful, and is often easy to optimize in implementations.
In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components. Code that handles such intercepted function calls, events or messages is called a hook.
The Microsoft Windows family of operating systems employ some specific exception handling mechanisms.
The task state segment (TSS) is a structure on x86-based computers which holds information about a task. It is used by the operating system kernel for task management. Specifically, the following information is stored in the TSS:
This article describes the calling conventions used when programming x86 architecture microprocessors.
In software, a stack buffer overflow or stack buffer overrun occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow. Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.
Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing.
Blocks are a non-standard extension added by Apple Inc. to Clang's implementations of the C, C++, and Objective-C programming languages that uses a lambda expression-like syntax to create closures within these languages. Blocks are supported for programs developed for Mac OS X 10.6+ and iOS 4.0+, although third-party runtimes allow use on Mac OS X 10.5 and iOS 2.2+ and non-Apple systems.
The Bellmac 32 is a microprocessor developed by Bell Labs' processor division in 1980, implemented using CMOS technology and was the first microprocessor that could move 32 bits in one clock cycle. The microprocessor contains 150,000 transistors and improved on the speed of CMOS design by using "domino circuits". It was designed with the C programming language in mind. After its creation, an improved version was produced called the Bellmac 32A, then cancelled along with its successor, the "Hobbit" C-language Reduced Instruction Set Processor (CRISP).
AArch64 or ARM64 is the 64-bit extension of the ARM architecture family.
A code sanitizer is a programming tool that detects bugs in the form of undefined or suspicious behavior by a compiler inserting instrumentation code at runtime. The class of tools was first introduced by Google's AddressSanitizer of 2012, which uses directly mapped shadow memory to detect memory corruption such as buffer overflows or accesses to a dangling pointer (use-after-free).
ROCm is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains: general-purpose computing on graphics processing units (GPGPU), high performance computing (HPC), heterogeneous computing. It offers several programming models: HIP, OpenMP/Message Passing Interface (MPI), OpenCL.