Control-flow integrity

Last updated

Control-flow integrity (CFI) is a general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution (the control flow) of a program.

Contents

Background

A computer program commonly changes its control flow to make decisions and use different parts of the code. Such transfers may be direct, in that the target address is written in the code itself, or indirect, in that the target address itself is a variable in memory or a CPU register. In a typical function call, the program performs a direct call, but returns to the caller function using the stack an indirect backward-edge transfer. When a function pointer is called, such as from a virtual table, we say there is an indirect forward-edge transfer. [1] [2]

Attackers seek to inject code into a program to make use of its privileges or to extract data from its memory space. Before executable code was commonly made read-only, an attacker could arbitrarily change the code as it is run, targeting direct transfers or even do with no transfers at all. After W^X became widespread, an attacker wants to instead redirect execution to a separate, unprotected area containing the code to be run, making use of indirect transfers: one could overwrite the virtual table for a forward-edge attack or change the call stack for a backward-edge attack (return-oriented programming). CFI is designed to protect indirect transfers from going to unintended locations. [1]

Techniques

Associated techniques include code-pointer separation (CPS), code-pointer integrity (CPI), stack canaries, shadow stacks, and vtable pointer verification. [3] [4] [5]

Implementations

Related implementations are available in Clang (LLVM in general), [6] Microsoft's Control Flow Guard [7] [8] [9] and Return Flow Guard, [10] Google's Indirect Function-Call Checks [11] and Reuse Attack Protector (RAP). [12] [13]

LLVM/Clang

LLVM/Clang provides a "CFI" option that works in the forward edge by checking for errors in virtual tables and type casts. It depends on link-time optimization (LTO) to know what functions are supposed to be called in normal cases. [14] There is a separate "shadow call stack" scheme that defends on the backward edge by checking for call stack modifications, available only for aarch64. [15]

Google has shipped Android with the Linux kernel compiled by Clang with link-time optimization (LTO) and CFI since 2018. [16] SCS is available for Linux kernel as an option, including on Android. [17]

Intel Control-flow Enforcement Technology

Intel Control-flow Enforcement Technology (CET) detects compromises to control flow integrity with a shadow stack (SS) and indirect branch tracking (IBT). [18] [19]

The shadow stack stores a copy of the return address of each CALL in a specially-protected shadow stack. On a RET, the processor checks if the return address stored in the normal stack and shadow stack are equal. If the addresses are not equal, the processor generates an INT #21 (Control Flow Protection Fault).

Indirect branch tracking detects indirect JMP or CALL instructions with unauthorized targets. It is implemented by adding a new internal state machine in the processor. The behavior of indirect JMP and CALL instructions is changed so that they switch the state machine from IDLE to WAIT_FOR_ENDBRANCH. In the WAIT_FOR_ENDBRANCH state, the next instruction to be executed is required to be the new ENDBRANCH instruction (ENDBR32 in 32-bit mode or ENDBR64 in 64-bit mode), which changes the internal state machine from WAIT_FOR_ENDBRANCH back to IDLE. Thus every authorized target of an indirect JMP or CALL must begin with ENDBRANCH. If the processor is in a WAIT_FOR_ENDBRANCH state (meaning, the previous instruction was an indirect JMP or CALL), and the next instruction is not an ENDBRANCH instruction, the processor generates an INT #21 (Control Flow Protection Fault). On processors not supporting CET indirect branch tracking, ENDBRANCH instructions are interpreted as NOPs and have no effect.

Microsoft Control Flow Guard

Control Flow Guard (CFG) was first released for Windows 8.1 Update 3 (KB3000850) in November 2014. Developers can add CFG to their programs by adding the /guard:cf linker flag before program linking in Visual Studio 2015 or newer. [20]

As of Windows 10 Creators Update (Windows 10 version 1703), the Windows kernel is compiled with CFG. [21] The Windows kernel uses Hyper-V to prevent malicious kernel code from overwriting the CFG bitmap. [22]

CFG operates by creating a per-process bitmap, where a set bit indicates that the address is a valid destination. Before performing each indirect function call, the application checks if the destination address is in the bitmap. If the destination address is not in the bitmap, the program terminates. [20] This makes it more difficult for an attacker to exploit a use-after-free by replacing an object's contents and then using an indirect function call to execute a payload. [23]

Implementation details

For all protected indirect function calls, the _guard_check_icall function is called, which performs the following steps: [24]

  1. Convert the target address to an offset and bit number in the bitmap.
    1. The highest 3 bytes are the byte offset in the bitmap
    2. The bit offset is a 5-bit value. The first four bits are the 4th through 8th low-order bits of the address.
    3. The 5th bit of the bit offset is set to 0 if the destination address is aligned with 0x10 (last four bits are 0), and 1 if it is not.
  2. Examine the target's address value in the bitmap
    1. If the target address is in the bitmap, return without an error.
    2. If the target address is not in the bitmap, terminate the program.

Bypass techniques

There are several generic techniques for bypassing CFG:

  • Set the destination to code located in a non-CFG module loaded in the same process. [23] [25]
  • Find an indirect call that was not protected by CFG (either CALL or JMP). [23] [25] [26]
  • Use a function call with a different number of arguments than the call is designed for, causing a stack misalignment, and code execution after the function returns (patched in Windows 10). [27]
  • Use a function call with the same number of arguments, but one of pointers passed is treated as an object and writes to a pointer-based offset, allowing overwriting a return address. [28]
  • Overwrite the function call used by the CFG to validate the address (patched in March 2015) [26]
  • Set the CFG bitmap to all 1's, allowing all indirect function calls [26]
  • Use a controlled-write primitive to overwrite an address on the stack (since the stack is not protected by CFG) [26]

Microsoft eXtended Flow Guard

eXtended Flow Guard (XFG) has not been officially released yet, but is available in the Windows Insider preview and was publicly presented at Bluehat Shanghai in 2019. [29]

XFG extends CFG by validating function call signatures to ensure that indirect function calls are only to the subset of functions with the same signature. Function call signature validation is implemented by adding instructions to store the target function's hash in register r10 immediately prior to the indirect call and storing the calculated function hash in the memory immediately preceding the target address's code. When the indirect call is made, the XFG validation function compares the value in r10 to the target function's stored hash. [30] [31]

See also

Related Research Articles

<span class="mw-page-title-main">Buffer overflow</span> Anomaly in computer security and programming

In programming and information security, a buffer overflow or buffer overrun is an anomaly whereby a program writes data to a buffer beyond the buffer's allocated memory, overwriting adjacent memory locations.

<span class="mw-page-title-main">PDP-8</span> Minicomputer product line

The PDP-8 is a family of 12-bit minicomputers that was produced by Digital Equipment Corporation (DEC). It was the first commercially successful minicomputer, with over 50,000 units being sold over the model's lifetime. Its basic design follows the pioneering LINC but has a smaller instruction set, which is an expanded version of the PDP-5 instruction set. Similar machines from DEC are the PDP-12 which is a modernized version of the PDP-8 and LINC concepts, and the PDP-14 industrial controller system.

In computer science, threaded code is a programming technique where the code has a form that essentially consists entirely of calls to subroutines. It is often used in compilers, which may generate code in that form or be implemented in that form themselves. The code may be processed by an interpreter or it may simply be a sequence of machine code call instructions.

x86 assembly language is the name for the family of assembly languages which provide some level of backward compatibility with CPUs back to the Intel 8008 microprocessor, which was launched in April 1972. It is used to produce object code for the x86 class of processors.

In computing, position-independent code (PIC) or position-independent executable (PIE) is a body of machine code that, being placed somewhere in the primary memory, executes properly regardless of its absolute address. PIC is commonly used for shared libraries, so that the same library code can be loaded at a location in each program's address space where it does not overlap with other memory in use by, for example, other shared libraries. PIC was also used on older computer systems that lacked an MMU, so that the operating system could keep applications away from each other even within the single address space of an MMU-less system.

In computer security, hardening is usually the process of securing a system by reducing its surface of vulnerability, which is larger when a system performs more functions; in principle a single-function system is more secure than a multipurpose one. Reducing available ways of attack typically includes changing default passwords, the removal of unnecessary software, unnecessary usernames or logins, and the disabling or removal of unnecessary services.

Buffer overflow protection is any of various techniques used during software development to enhance the security of executable programs by detecting buffer overflows on stack-allocated variables, and preventing them from causing program misbehavior or from becoming serious security vulnerabilities. A stack buffer overflow occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, which could lead to program crashes, incorrect operation, or security issues.

<span class="mw-page-title-main">LLVM</span> Compiler backend for multiple programming languages

LLVM is a set of compiler and toolchain technologies that can be used to develop a frontend for any programming language and a backend for any instruction set architecture. LLVM is designed around a language-independent intermediate representation (IR) that serves as a portable, high-level assembly language that can be optimized with a variety of transformations over multiple passes. The name LLVM originally stood for Low Level Virtual Machine, though the project has expanded and the name is no longer officially an initialism.

In computer science, a tail call is a subroutine call performed as the final action of a procedure. If the target of a tail is the same subroutine, the subroutine is said to be tail recursive, which is a special case of direct recursion. Tail recursion is particularly useful, and is often easy to optimize in implementations.

In computer programming, the term hooking covers a range of techniques used to alter or augment the behaviour of an operating system, of applications, or of other software components by intercepting function calls or messages or events passed between software components. Code that handles such intercepted function calls, events or messages is called a hook.

The Microsoft Windows family of operating systems employ some specific exception handling mechanisms.

The task state segment (TSS) is a structure on x86-based computers which holds information about a task. It is used by the operating system kernel for task management. Specifically, the following information is stored in the TSS:

This article describes the calling conventions used when programming x86 architecture microprocessors.

In software, a stack buffer overflow or stack buffer overrun occurs when a program writes to a memory address on the program's call stack outside of the intended data structure, which is usually a fixed-length buffer. Stack buffer overflow bugs are caused when a program writes more data to a buffer located on the stack than what is actually allocated for that buffer. This almost always results in corruption of adjacent data on the stack, and in cases where the overflow was triggered by mistake, will often cause the program to crash or operate incorrectly. Stack buffer overflow is a type of the more general programming malfunction known as buffer overflow. Overfilling a buffer on the stack is more likely to derail program execution than overfilling a buffer on the heap because the stack contains the return addresses for all active function calls.

Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing.

Blocks are a non-standard extension added by Apple Inc. to Clang's implementations of the C, C++, and Objective-C programming languages that uses a lambda expression-like syntax to create closures within these languages. Blocks are supported for programs developed for Mac OS X 10.6+ and iOS 4.0+, although third-party runtimes allow use on Mac OS X 10.5 and iOS 2.2+ and non-Apple systems.

The Bellmac 32 is a microprocessor developed by Bell Labs' processor division in 1980, implemented using CMOS technology and was the first microprocessor that could move 32 bits in one clock cycle. The microprocessor contains 150,000 transistors and improved on the speed of CMOS design by using "domino circuits". It was designed with the C programming language in mind. After its creation, an improved version was produced called the Bellmac 32A, then cancelled along with its successor, the "Hobbit" C-language Reduced Instruction Set Processor (CRISP).

<span class="mw-page-title-main">AArch64</span> 64-bit extension of the ARM architecture

AArch64 or ARM64 is the 64-bit extension of the ARM architecture family.

A code sanitizer is a programming tool that detects bugs in the form of undefined or suspicious behavior by a compiler inserting instrumentation code at runtime. The class of tools was first introduced by Google's AddressSanitizer of 2012, which uses directly mapped shadow memory to detect memory corruption such as buffer overflows or accesses to a dangling pointer (use-after-free).

<span class="mw-page-title-main">ROCm</span> Parallel computing platform: GPGPU libraries and application programming interface

ROCm is an Advanced Micro Devices (AMD) software stack for graphics processing unit (GPU) programming. ROCm spans several domains: general-purpose computing on graphics processing units (GPGPU), high performance computing (HPC), heterogeneous computing. It offers several programming models: HIP, OpenMP/Message Passing Interface (MPI), OpenCL.

References

  1. 1 2 Payer, Mattias. "Control-Flow Integrity: An Introduction". nebelwelt.net.
  2. Burow, Nathan; Carr, Scott A.; Nash, Joseph; Larsen, Per; Franz, Michael; Brunthaler, Stefan; Payer, Mathias (31 January 2018). "Control-Flow Integrity: Precision, Security, and Performance". ACM Computing Surveys. 50 (1): 1–33. doi: 10.1145/3054924 .
  3. Payer, Mathias; Kuznetsov, Volodymyr. "On differences between the CFI, CPS, and CPI properties". nebelwelt.net. Retrieved 2016-06-01.
  4. "Adobe Flash Bug Discovery Leads To New Attack Mitigation Method". Dark Reading. 10 November 2015. Retrieved 2016-06-01.
  5. Endgame. "Endgame to Present at Black Hat USA 2016". www.prnewswire.com. Retrieved 2016-06-01.
  6. "Control Flow Integrity — Clang 3.9 documentation". clang.llvm.org. Retrieved 2016-06-01.
  7. Pauli, Darren. "Microsoft's malware mitigator refreshed, but even Redmond says it's no longer needed". The Register . Retrieved 2016-06-01.
  8. Mimoso, Michael (2015-09-22). "Bypass Developed for Microsoft Memory Protection, Control Flow Guard". Threatpost | The first stop for security news. Retrieved 2016-06-01.
  9. Smith, Ms. (23 September 2015). "DerbyCon: Former BlueHat prize winner will bypass Control Flow Guard in Windows 10". Network World. Retrieved 2016-06-01.
  10. "Return Flow Guard". Tencent. 2 November 2016. Retrieved 2017-01-19.
  11. Tice, Caroline; Roeder, Tom; Collingbourne, Peter; Checkoway, Stephen; Erlingsson, Úlfar; Lozano, Luis; Pike, Geoff (2014-01-01). Enforcing Forward-Edge Control-Flow Integrity in GCC & LLVM. pp. 941–955. ISBN   9781931971157.
  12. Security, heise. "PaX Team stellt Schutz vor Code Reuse Exploits vor". Security (in German). Retrieved 2016-06-01.
  13. "Frequently Asked Questions About RAP" . Retrieved 2016-06-01.
  14. "Control Flow Integrity — Clang 17.0.0git documentation". clang.llvm.org.
  15. "ShadowCallStack — Clang 17.0.0git documentation". clang.llvm.org.
  16. "Clang LTO Patches Updated for the Linux Kernel - Phoronix".
  17. "ShadowCallStack". Android Open Source Project.
  18. "Control-flow Enforcement Technology Specification" (PDF). Intel Developer Zone. Archived from the original (PDF) on 2017-08-14. Retrieved 2021-01-05.
  19. "R.I.P ROP: CET Internals in Windows 20H1". Winsider Seminars & Solutions Inc. Retrieved 2021-01-05.
  20. 1 2 "Control Flow Guard". MSDN. Retrieved 2017-01-19.
  21. "Analysis of the Shadow Brokers release and mitigation with Windows 10 virtualization-based security". Microsoft Technet. 16 June 2017. Retrieved 2017-06-20.
  22. "Universally Bypassing CFG Through Mutability Abuse" (PDF). Alex Ionescu's Blog. Retrieved 2017-07-07.
  23. 1 2 3 Falcón, Francisco (2015-03-25). "Exploiting CVE-2015-0311, Part II: Bypassing Control Flow Guard on Windows 8.1 Update 3". Core Security. Retrieved 2017-01-19.
  24. "Control Flow Guard" (PDF). Trend Micro. Retrieved 2017-01-19.
  25. 1 2 "Windows 10 Control Flow Guard Internals" (PDF). Power of Community. Retrieved 2017-01-19.
  26. 1 2 3 4 "Bypass Control Flow Guard Comprehensively" (PDF). BlackHat. Retrieved 2017-01-19.
  27. "An interesting detail about Control Flow Guard". Bromium. Retrieved 2017-01-19.
  28. Thomas, Sam (18 August 2016). "Object Oriented Exploitation: New techniques in Windows mitigation bypass". Slideshare. Retrieved 2017-01-19.
  29. "Advancing Windows Security" . Retrieved 2021-05-19.
  30. "EXTENDED FLOW GUARD UNDER THE MICROSCOPE" . Retrieved 2021-05-19.
  31. "Exploit Development: Between a Rock and a (Xtended Flow) Guard Place: Examining XFG". 23 August 2020. Retrieved 2021-05-19.