Control-flow integrity

Control-flow integrity (CFI) is a general term for computer security techniques that prevent a wide variety of malware attacks from redirecting the flow of execution (the control flow) of a program.

Background

A computer program commonly changes its control flow to make decisions and use different parts of the code. Such transfers may be direct, in that the target address is written in the code itself, or indirect, in that the target address is held in a variable in memory or in a CPU register. In a typical function call, the program performs a direct call, but returns to the caller using an address stored on the stack: an indirect backward-edge transfer. When a function pointer is called, such as one from a virtual table, the transfer is an indirect forward-edge transfer. [1] [2]

Attackers seek to inject code into a program to make use of its privileges or to extract data from its memory space. Before executable code was commonly made read-only, an attacker could arbitrarily change the code as it ran, targeting direct transfers or even dispensing with transfers altogether. After W^X became widespread, attackers instead try to redirect execution to a separate, unprotected area containing the code to be run, making use of indirect transfers: one could overwrite the virtual table for a forward-edge attack or change the call stack for a backward-edge attack (return-oriented programming). CFI is designed to protect indirect transfers from going to unintended locations. [1]

Techniques

Associated techniques include code-pointer separation (CPS), code-pointer integrity (CPI), stack canaries, shadow stacks, and vtable pointer verification. [3] [4] [5] These protections can be classified as either coarse-grained or fine-grained, based on how narrowly they restrict the set of permitted targets. A coarse-grained forward-edge CFI implementation could, for example, restrict the set of indirect call targets to any function that may be indirectly called in the program, while a fine-grained one would restrict each indirect call site to functions that have the same type as the function to be called. Similarly, for a backward-edge scheme protecting returns, a coarse-grained implementation would only allow the procedure to return to a function of the same type (of which there could be many, especially for common prototypes), while a fine-grained one would enforce precise return matching (so it can return only to the function that called it).
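The difference between coarse-grained and fine-grained forward-edge checks can be illustrated with a toy model. The sketch below is not how any real compiler implements CFI; the registry, the decorator, and the signature strings are all hypothetical stand-ins for the target sets a compiler would compute.

```python
from typing import Callable, Dict

# Hypothetical registry of every function that may be called indirectly,
# mapped to a string standing in for its type signature.
ADDRESS_TAKEN: Dict[Callable, str] = {}


class CfiViolation(Exception):
    """Raised when an indirect call targets a function outside the allowed set."""


def address_taken(signature: str):
    """Decorator that records a function as a possible indirect-call target."""
    def register(fn):
        ADDRESS_TAKEN[fn] = signature
        return fn
    return register


@address_taken("int(int)")
def double(x):
    return 2 * x


@address_taken("int(int)")
def negate(x):
    return -x


@address_taken("void(str)")
def log(msg):
    return None


def coarse_allowed(target) -> bool:
    # Coarse-grained: any registered function is an acceptable target.
    return target in ADDRESS_TAKEN


def fine_allowed(target, site_signature: str) -> bool:
    # Fine-grained: the target's type must match the call site's type.
    return ADDRESS_TAKEN.get(target) == site_signature


def checked_call(target, site_signature: str, *args, fine: bool = True):
    """Perform an indirect call only if the chosen policy permits the target."""
    ok = fine_allowed(target, site_signature) if fine else coarse_allowed(target)
    if not ok:
        raise CfiViolation(f"blocked call to {getattr(target, '__name__', target)}")
    return target(*args)
```

In this model, redirecting an `int(int)` call site to `log` passes the coarse-grained check but fails the fine-grained one, while `double` and `negate` remain interchangeable under both policies because they share a type.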

Implementations

Related implementations are available in Clang (LLVM in general), [6] Microsoft's Control Flow Guard [7] [8] [9] and Return Flow Guard, [10] Google's Indirect Function-Call Checks [11] and Reuse Attack Protector (RAP). [12] [13]

LLVM/Clang

LLVM/Clang provides a "CFI" option that works on the forward edge by checking for errors in virtual tables and type casts. It depends on link-time optimization (LTO) to know which functions are supposed to be called in normal cases. [14] There is a separate "shadow call stack" scheme that defends the backward edge by checking for call stack modifications, available only for aarch64. [15]

Google has shipped Android with the Linux kernel compiled by Clang with link-time optimization (LTO) and CFI since 2018. [16] The shadow call stack (SCS) is available as an option for the Linux kernel, including on Android. [17]

Intel Control-flow Enforcement Technology

Intel Control-flow Enforcement Technology (CET) detects compromises to control flow integrity with a shadow stack (SS) and indirect branch tracking (IBT). [18] [19]

The kernel must map a region of memory for the shadow stack that user-space programs cannot write to except through special instructions. The shadow stack stores a copy of the return address of each CALL. On a RET, the processor checks whether the return addresses stored on the normal stack and on the shadow stack are equal. If the addresses are not equal, the processor generates an INT #21 (Control Flow Protection Fault).
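The CALL and RET behaviour described above can be modelled in a few lines. This is only a sketch of the comparison logic: the `ToyCpu` class and its method names are invented for illustration, and the exception merely stands in for the processor fault.

```python
class ControlProtectionFault(Exception):
    """Stands in for the fault raised when the two return addresses differ."""


class ToyCpu:
    def __init__(self):
        self.stack = []         # normal stack: writable by the program (and an attacker)
        self.shadow_stack = []  # shadow stack: only call() and ret() modify it here

    def call(self, return_address: int) -> None:
        # CALL pushes the return address onto both stacks.
        self.stack.append(return_address)
        self.shadow_stack.append(return_address)

    def ret(self) -> int:
        # RET pops both copies and faults if they disagree.
        address = self.stack.pop()
        expected = self.shadow_stack.pop()
        if address != expected:
            raise ControlProtectionFault(
                f"return to {address:#x}, shadow stack expected {expected:#x}"
            )
        return address
```

An attacker who overwrites the saved return address on the normal stack (modelled by assigning to `cpu.stack[-1]`) cannot also update the shadow copy, so the next `ret()` raises the fault instead of transferring control.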

Indirect branch tracking detects indirect JMP or CALL instructions to unauthorized targets. It is implemented by adding a new internal state machine in the processor. The behavior of indirect JMP and CALL instructions is changed so that they switch the state machine from IDLE to WAIT_FOR_ENDBRANCH. In the WAIT_FOR_ENDBRANCH state, the next instruction to be executed is required to be the new ENDBRANCH instruction (ENDBR32 in 32-bit mode or ENDBR64 in 64-bit mode), which changes the internal state machine from WAIT_FOR_ENDBRANCH back to IDLE. Thus every authorized target of an indirect JMP or CALL must begin with ENDBRANCH. If the processor is in a WAIT_FOR_ENDBRANCH state (meaning, the previous instruction was an indirect JMP or CALL), and the next instruction is not an ENDBRANCH instruction, the processor generates an INT #21 (Control Flow Protection Fault). On processors not supporting CET indirect branch tracking, ENDBRANCH instructions are interpreted as NOPs and have no effect.
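The two-state machine described above can be sketched as a small simulator over a trace of executed instructions. The mnemonics `JMP_INDIRECT` and `CALL_INDIRECT` are hypothetical labels used only to distinguish indirect branches in this model; the state names and the ENDBR32/ENDBR64 instructions come from the description above.

```python
from enum import Enum


class State(Enum):
    IDLE = 0
    WAIT_FOR_ENDBRANCH = 1


class ControlProtectionFault(Exception):
    """Stands in for the fault raised when ENDBRANCH is missing at a target."""


# Hypothetical labels for indirect branches in the instruction trace.
INDIRECT_BRANCHES = {"JMP_INDIRECT", "CALL_INDIRECT"}
ENDBRANCH = {"ENDBR32", "ENDBR64"}


def run(trace):
    """Feed executed instructions through the tracker and return the final state."""
    state = State.IDLE
    for insn in trace:
        if state is State.WAIT_FOR_ENDBRANCH:
            # The instruction right after an indirect branch must be ENDBRANCH.
            if insn not in ENDBRANCH:
                raise ControlProtectionFault(f"expected ENDBRANCH, got {insn}")
            state = State.IDLE
        elif insn in INDIRECT_BRANCHES:
            state = State.WAIT_FOR_ENDBRANCH
        # In IDLE, every other instruction (including a stray ENDBRANCH)
        # leaves the state unchanged.
    return state
```

A trace such as `["CALL_INDIRECT", "ENDBR64", "RET"]` runs to completion, whereas `["JMP_INDIRECT", "MOV"]` raises the fault because the branch landed on an instruction other than ENDBRANCH.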

Microsoft Control Flow Guard

Control Flow Guard (CFG) was first released for Windows 8.1 Update 3 (KB3000850) in November 2014. Developers can add CFG to their programs by adding the /guard:cf linker flag before program linking in Visual Studio 2015 or newer. [20]

As of Windows 10 Creators Update (Windows 10 version 1703), the Windows kernel is compiled with CFG. [21] The Windows kernel uses Hyper-V to prevent malicious kernel code from overwriting the CFG bitmap. [22]

CFG operates by creating a per-process bitmap, where a set bit indicates that the address is a valid destination. Before performing each indirect function call, the application checks if the destination address is in the bitmap. If the destination address is not in the bitmap, the program terminates. [20] This makes it more difficult for an attacker to exploit a use-after-free by replacing an object's contents and then using an indirect function call to execute a payload. [23]

Implementation details

For all protected indirect function calls, the _guard_check_icall function is called, which performs the following steps: [24]

  1. Convert the target address to an offset and bit number in the bitmap.
    1. The highest 3 bytes are the byte offset in the bitmap.
    2. The bit offset is a 5-bit value. The first four bits are the 4th through 8th low-order bits of the address.
    3. The 5th bit of the bit offset is set to 0 if the destination address is aligned with 0x10 (last four bits are 0), and 1 if it is not.
  2. Examine the target's address value in the bitmap
    1. If the target address is in the bitmap, return without an error.
    2. If the target address is not in the bitmap, terminate the program.
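The steps above can be sketched as a toy model. It rests on several assumptions that are not confirmed by the text: addresses are 32 bits wide, the bitmap is addressed in 32-bit units indexed by the highest 3 bytes, and the 5-bit offset is read as address bits 3 through 7 with its lowest bit forced to 1 for targets not aligned to 0x10. The class and function names are hypothetical, and an exception stands in for process termination.

```python
class CfgViolation(Exception):
    """Stands in for process termination on an invalid indirect-call target."""


def bitmap_position(address: int):
    """Map a 32-bit address to (unit index, bit index) under the stated assumptions."""
    unit_index = (address >> 8) & 0xFFFFFF  # highest 3 bytes of the address
    bit_index = (address >> 3) & 0x1F       # assumed: address bits 3..7
    if address & 0xF:                       # not aligned to 0x10
        bit_index |= 1
    return unit_index, bit_index


class ToyCfgBitmap:
    def __init__(self):
        self.units = {}  # sparse bitmap: unit index -> 32-bit integer

    def mark_valid(self, address: int) -> None:
        unit, bit = bitmap_position(address)
        self.units[unit] = self.units.get(unit, 0) | (1 << bit)

    def guard_check_icall(self, address: int) -> None:
        # Return silently for a valid target, otherwise "terminate".
        unit, bit = bitmap_position(address)
        if not (self.units.get(unit, 0) >> bit) & 1:
            raise CfgViolation(f"invalid indirect call target {address:#010x}")
```

Under this reading, marking `0x00401230` as valid sets bit 6 of unit `0x4012`, while a call to the nearby unaligned address `0x00401234` consults bit 7 and is rejected.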

Bypass techniques

There are several generic techniques for bypassing CFG:

  • Set the destination to code located in a non-CFG module loaded in the same process. [23] [25]
  • Find an indirect call that was not protected by CFG (either CALL or JMP). [23] [25] [26]
  • Use a function call with a different number of arguments than the call is designed for, causing a stack misalignment, and code execution after the function returns (patched in Windows 10). [27]
  • Use a function call with the same number of arguments, but where one of the pointers passed is treated as an object and written to at a pointer-based offset, allowing a return address to be overwritten. [28]
  • Overwrite the function call used by CFG to validate the address (patched in March 2015). [26]
  • Set the CFG bitmap to all 1's, allowing all indirect function calls. [26]
  • Use a controlled-write primitive to overwrite an address on the stack (since the stack is not protected by CFG). [26]

Microsoft eXtended Flow Guard

eXtended Flow Guard (XFG) has not been officially released yet, but is available in the Windows Insider preview and was publicly presented at Bluehat Shanghai in 2019. [29]

XFG extends CFG by validating function call signatures to ensure that indirect function calls are only to the subset of functions with the same signature. Function call signature validation is implemented by adding instructions to store the target function's hash in register r10 immediately prior to the indirect call and storing the calculated function hash in the memory immediately preceding the target address's code. When the indirect call is made, the XFG validation function compares the value in r10 to the target function's stored hash. [30] [31]
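The signature comparison can be modelled as follows. This is a sketch only: the source does not describe how XFG computes its hashes, so the SHA-256-based `signature_hash` is a placeholder, and `ToyFunction`, `xfg_dispatch`, and `indirect_call` are hypothetical names. The variable `r10` models the register that carries the call site's expected hash.

```python
import hashlib


class XfgViolation(Exception):
    """Stands in for the failure raised when the hashes do not match."""


def signature_hash(signature: str) -> int:
    # Placeholder hash; the real XFG hash function is not modelled here.
    digest = hashlib.sha256(signature.encode()).digest()
    return int.from_bytes(digest[:8], "little")


class ToyFunction:
    def __init__(self, signature: str, body):
        # Models the hash stored in memory immediately before the function's code.
        self.stored_hash = signature_hash(signature)
        self.body = body


def xfg_dispatch(target: ToyFunction, r10: int, *args):
    # Compare the call site's hash (in r10) with the target's stored hash.
    if r10 != target.stored_hash:
        raise XfgViolation("function signature mismatch")
    return target.body(*args)


def indirect_call(target: ToyFunction, site_signature: str, *args):
    # The call site loads the hash of the signature it expects into r10.
    r10 = signature_hash(site_signature)
    return xfg_dispatch(target, r10, *args)
```

A call through a site expecting `int(int)` succeeds for any `ToyFunction` created with that signature and fails for one created with, say, `void(char*)`, which is the narrowing of valid targets that XFG adds on top of CFG.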

References

  1. Payer, Mathias. "Control-Flow Integrity: An Introduction". nebelwelt.net.
  2. Burow, Nathan; Carr, Scott A.; Nash, Joseph; Larsen, Per; Franz, Michael; Brunthaler, Stefan; Payer, Mathias (31 January 2018). "Control-Flow Integrity: Precision, Security, and Performance". ACM Computing Surveys. 50 (1): 1–33. doi:10.1145/3054924.
  3. Payer, Mathias; Kuznetsov, Volodymyr. "On differences between the CFI, CPS, and CPI properties". nebelwelt.net. Retrieved 2016-06-01.
  4. "Adobe Flash Bug Discovery Leads To New Attack Mitigation Method". Dark Reading. 10 November 2015. Retrieved 2016-06-01.
  5. Endgame. "Endgame to Present at Black Hat USA 2016". www.prnewswire.com (Press release). Retrieved 2016-06-01.
  6. "Control Flow Integrity — Clang 3.9 documentation". clang.llvm.org. Retrieved 2016-06-01.
  7. Pauli, Darren. "Microsoft's malware mitigator refreshed, but even Redmond says it's no longer needed". The Register. Retrieved 2016-06-01.
  8. Mimoso, Michael (2015-09-22). "Bypass Developed for Microsoft Memory Protection, Control Flow Guard". Threatpost | The first stop for security news. Retrieved 2016-06-01.
  9. Smith, Ms. (23 September 2015). "DerbyCon: Former BlueHat prize winner will bypass Control Flow Guard in Windows 10". Network World. Archived from the original on September 27, 2015. Retrieved 2016-06-01.
  10. "Return Flow Guard". Tencent. 2 November 2016. Retrieved 2017-01-19.
  11. Tice, Caroline; Roeder, Tom; Collingbourne, Peter; Checkoway, Stephen; Erlingsson, Úlfar; Lozano, Luis; Pike, Geoff (2014-01-01). Enforcing Forward-Edge Control-Flow Integrity in GCC & LLVM. pp. 941–955. ISBN 9781931971157.
  12. Security, heise (4 May 2016). "PaX Team stellt Schutz vor Code Reuse Exploits vor". Security (in German). Retrieved 2016-06-01.
  13. "Frequently Asked Questions About RAP". Retrieved 2016-06-01.
  14. "Control Flow Integrity — Clang 17.0.0git documentation". clang.llvm.org.
  15. "ShadowCallStack — Clang 17.0.0git documentation". clang.llvm.org.
  16. "Clang LTO Patches Updated for the Linux Kernel - Phoronix".
  17. "ShadowCallStack". Android Open Source Project.
  18. "Control-flow Enforcement Technology Specification" (PDF). Intel Developer Zone. Archived from the original (PDF) on 2017-08-14. Retrieved 2021-01-05.
  19. "R.I.P ROP: CET Internals in Windows 20H1". Winsider Seminars & Solutions Inc. 5 January 2020. Retrieved 2021-01-05.
  20. "Control Flow Guard". MSDN. Retrieved 2017-01-19.
  21. "Analysis of the Shadow Brokers release and mitigation with Windows 10 virtualization-based security". Microsoft Technet. 16 June 2017. Retrieved 2017-06-20.
  22. "Universally Bypassing CFG Through Mutability Abuse" (PDF). Alex Ionescu's Blog. Retrieved 2017-07-07.
  23. Falcón, Francisco (2015-03-25). "Exploiting CVE-2015-0311, Part II: Bypassing Control Flow Guard on Windows 8.1 Update 3". Core Security. Retrieved 2017-01-19.
  24. "Control Flow Guard" (PDF). Trend Micro. Retrieved 2017-01-19.
  25. "Windows 10 Control Flow Guard Internals" (PDF). Power of Community. Retrieved 2017-01-19.
  26. "Bypass Control Flow Guard Comprehensively" (PDF). BlackHat. Retrieved 2017-01-19.
  27. "An interesting detail about Control Flow Guard". Bromium. Retrieved 2017-01-19.
  28. Thomas, Sam (18 August 2016). "Object Oriented Exploitation: New techniques in Windows mitigation bypass". Slideshare. Retrieved 2017-01-19.
  29. "Advancing Windows Security". Retrieved 2021-05-19.
  30. "EXTENDED FLOW GUARD UNDER THE MICROSCOPE". 18 May 2021. Retrieved 2021-05-19.
  31. "Exploit Development: Between a Rock and a (Xtended Flow) Guard Place: Examining XFG". 23 August 2020. Retrieved 2021-05-19.