Pacman (security vulnerability)

Pacman
Date discovered	Publicly disclosed June 10, 2022;21 months ago
Date patched	Unable to be patched
Affected hardware	Apple M1 processors
Website	pacmanattack.com

Last updated March 28, 2024

Pacman^{[lower-alpha 1]} is a side-channel vulnerability in certain ARM CPUs that was made public by Massachusetts Institute of Technology security researchers on June 10, 2021. It affects the pointer authentication (PAC) mechanism in many ARMv8.3 chips, including Apple's M1 CPU.^[1] Pacman creates an 'oracle' that lets an attacker guess a pointer's PAC signature without crashing the program if the guess is wrong. PAC signatures are typically less than 16 bits wide, so an attacker can use the oracle to guess the signature in 2¹⁶ tries or fewer. It is unfixable without hardware changes because it is caused by the inherent design of CPU caches and branch predictors.

Impact and response

Pacman alone is not an exploitable vulnerability. PAC is a 'last line of defense'^[2] that detects when software running on the CPU is being exploited by a memory corruption attack and reacts by crashing the software before the attacker completes their exploit. Apple stated that they did not believe the vulnerability posed a serious threat to users because it requires specific conditions to be exploited.^[3]

Background

Pacman is similar to Spectre, abusing two key CPU optimizations to create a PAC oracle: branch prediction and memory caching.^[4]

PAC (Pointer Authentication Codes)

PAC is a security feature in ARMv8.3-based computer processors that mitigates against return-oriented programming by adding a cryptographic signature to the upper bits of pointers. Compilers emit PAC 'sign' instructions before storing pointers to memory, and 'verify' instructions after loading pointers from memory. If an attacker tampers with the pointer, the signature becomes invalid and the program crashes when the pointer is next accessed.^[5] PAC signatures are not cryptographically secure because they need to be small enough to fit into the unused upper bits of pointers. Therefore, if an attacker can reliably test whether a guessed signature is correct without crashing the program, they can brute-force the correct signature.^[6]

Branch prediction

Modern CPUs employ branch prediction to reduce the number of pipeline stalls caused by conditional branches. Branch prediction uses heuristics to guess the direction of a conditional branch and begin executing the predicted path – while the condition is still being evaluated. Instructions executed during this period are 'speculative', and the CPU holds their results in the re-order buffer (ROB) without writing them back to memory. Once the CPU finishes evaluating the condition and determines that its initial prediction was correct, it 'retires' the instructions in the ROB by writing their changes back to memory and propagating any exceptions produced. If the speculation was incorrect, the CPU flushes the ROB and resumes execution at the correct location.^[7]

Memory caching

CPU caches accelerate memory accesses by caching frequently accessed memory on the CPU die. This lowers the cost of memory accesses from hundreds of cycles to fewer than 10, by reducing the amount of time spent communicating with the physically separate northbridge and RAM chip. When an uncached address is loaded, the CPU immediately stashes the loaded data into the cache, evicting another entry if the cache is full. These changes are not held in the ROB because the presence or absence of an address in the cache is considered 'unobservable', so stashes and evictions that occur during speculative execution persist after the ROB has been flushed, even if that path was not ultimately taken.^[8]

Mechanism

Principle

Pacman tricks the CPU into checking the validity of a guessed PAC signature within a mispredicted branch so that exceptions produced by potentially incorrect guesses are discarded during the ROB flush.^[9] If the guess was incorrect, the exception thrown during speculative execution forces the CPU to stall, preventing further instructions from being speculatively executed. A Pacman gadget is a sequence of instructions of the following form:

if(condition):ptr=verify(attacker_tamperable_pointer)load(ptr)

Sequences of this form are common and can be found in most compiled programs supporting PAC.^[10] When the CPU mispredicts the condition, it begins speculatively executing the PAC verification instruction. If the attacker's guess was correct, the verification instruction succeeds and the CPU proceeds to load the address from memory; if the guess was incorrect, the verification instruction throws an exception. This exception is held in the ROB and then discarded once the CPU finds the condition to be false. The attacker then uses a hardware side-channel to determine whether the load instruction was executed, therefore determining whether their guessed signature was correct.^[11]

Attack

Ravichandran et al. demonstrate that the cache-based Prime and Probe technique can be used to determine whether the load instruction executed.^[11] The attacker determines if the load instruction in a Pacman gadget was executed by filling the cache with data, calling the gadget, and checking the latency of accessing the previously loaded addresses. If one of the addresses takes longer than before, it was evicted by the gadget and the attacker knows that their guess was correct. The attacker may then use this forged pointer elsewhere in the program to hijack it.

1. Train

The attacker calls the Pacman gadget many times with condition = true. The branch predictor is now trained to guess that the condition is true on subsequent calls. During this period, attacker_tamperable_pointer is its original value with a valid PAC signature.

2. Prime

The attacker fills the L1 cache by loading from addresses they control. The contents of these memory locations does not matter – the attacker just needs to be able to precisely measure their access latency.^[12]

Initial L1 cache
Address	Value
Set 0
...	...
...	...
...	...
...
Set 0x1FFF
Address	Value
...	...
...	...
...	...

Primed L1 cache
Address	Value
Set 0
0xab30000	AAAAAAAAA
0x1230000	AAAAAAAAA
0x1920000	AAAAAAAAA
...
Set 0x1FFF
Address	Value
0x2001FFF	AAAAAAAAA
0x6a21FFF	AAAAAAAAA
0x09b1FFF	AAAAAAAAA

3. Evict

The attacker overwrites attacker_tamperable_pointer with their target pointer and guess for the target pointer's PAC signature. They then call the Pacman gadget with condition = false, causing the branch to be mispredicted. The branch predictor will speculatively execute the contents of the if statement, before eventually flushing the pipeline and rolling back.^[13]

During this speculative execution, two things can occur:

The speculative execution proceeds to the load() instruction. This means that the verify() instruction did not fault, implying the guessed signature was correct. The load() instruction will then load the target pointer into cache, evicting an address in the attacker's eviction set.
Speculative execution faults on the verify instruction, preventing execution of the load(). This implies the guessed signature was wrong. Since this was speculatively executed within a mispredicted branch, the fault is not propagated to the program.

L1 Cache after running gadget if guess was correct
Set 0
Address	Value
0xab30000	AAAAAAAAA
0x1230000	AAAAAAAAA
0x1920000	AAAAAAAAA
...
Set 0x1FFF
Address	Value
pacman address	my data
0x6a21FFF	AAAAAAAAA
0x09b1FFF	AAAAAAAAA

L1 Cache after running gadget if guess was wrong
Set 0
Address	Value
0xab30000	AAAAAAAAA
0x1230000	AAAAAAAAA
0x1920000	AAAAAAAAA
...
Set 0x1FFF
Address	Value
0x2001FFF	AAAAAAAAA
0x6a21FFF	AAAAAAAAA
0x09b1FFF	AAAAAAAAA

4. Probe

The attacker measures the access time for each element in their eviction set. If one of the elements was evicted (i.e., the access is slow) then the guess was correct. If none of the elements were evicted (i.e., all accesses are fast) then the guess was wrong. This process can be repeated with different guesses until the correct signature is found.^[11]

Notes

↑ Stylized PACMAN.

Related Research Articles

ARM is a family of RISC instruction set architectures (ISAs) for computer processors. Arm Ltd. develops the ISAs and licenses them to other companies, who build the physical devices that use the instruction set. It also designs and licenses cores that implement these ISAs.

Speculative execution is an optimization technique where a computer system performs some task that may not be needed. Work is done before it is known whether it is actually needed, so as to prevent a delay that would have to be incurred by doing the work after it is known that it is needed. If it turns out the work was not needed after all, most changes made by the work are reverted and the results are ignored.

In the history of computer hardware, some early reduced instruction set computer central processing units used a very similar architectural solution, now called a classic RISC pipeline. Those CPUs were: MIPS, SPARC, Motorola 88000, and later the notional CPU DLX invented for education.

In computer architecture, a branch predictor is a digital circuit that tries to guess which way a branch will go before this is known definitively. The purpose of the branch predictor is to improve the flow in the instruction pipeline. Branch predictors play a critical role in achieving high performance in many modern pipelined microprocessor architectures.

In cryptography, a timing attack is a side-channel attack in which the attacker attempts to compromise a cryptosystem by analyzing the time taken to execute cryptographic algorithms. Every logical operation in a computer takes time to execute, and the time can differ based on the input; with precise measurements of the time for each operation, an attacker can work backwards to the input. Finding secrets through timing information may be significantly easier than using cryptanalysis of known plaintext, ciphertext pairs. Sometimes timing information is combined with cryptanalysis to increase the rate of information leakage.

Explicitly parallel instruction computing (EPIC) is a term coined in 1997 by the HP–Intel alliance to describe a computing paradigm that researchers had been investigating since the early 1980s. This paradigm is also called Independence architectures. It was the basis for Intel and HP development of the Intel Itanium architecture, and HP later asserted that "EPIC" was merely an old term for the Itanium architecture. EPIC permits microprocessors to execute software instructions in parallel by using the compiler, rather than complex on-die circuitry, to control parallel instruction execution. This was intended to allow simple performance scaling without resorting to higher clock frequencies.

Addressing modes are an aspect of the instruction set architecture in most central processing unit (CPU) designs. The various addressing modes that are defined in a given instruction set architecture define how the machine language instructions in that architecture identify the operand(s) of each instruction. An addressing mode specifies how to calculate the effective memory address of an operand by using information held in registers and/or constants contained within a machine instruction or elsewhere.

A branch is an instruction in a computer program that can cause a computer to begin executing a different instruction sequence and thus deviate from its default behavior of executing instructions in order. Branch may also refer to the act of switching execution to a different instruction sequence as a result of executing a branch instruction. Branch instructions are used to implement control flow in program loops and conditionals.

In electronics, computer science and computer engineering, microarchitecture, also called computer organization and sometimes abbreviated as µarch or uarch, is the way a given instruction set architecture (ISA) is implemented in a particular processor. A given ISA may be implemented with different microarchitectures; implementations may vary due to different goals of a given design or due to shifts in technology.

Memory disambiguation is a set of techniques employed by high-performance out-of-order execution microprocessors that execute memory access instructions out of program order. The mechanisms for performing memory disambiguation, implemented using digital logic inside the microprocessor core, detect true dependencies between memory operations at execution time and allow the processor to recover when a dependence has been violated. They also eliminate spurious memory dependencies and allow for greater instruction-level parallelism by allowing safe out-of-order execution of loads and stores.

Memory ordering describes the order of accesses to computer memory by a CPU. The term can refer either to the memory ordering generated by the compiler during compile time, or to the memory ordering generated by a CPU during runtime.

Runahead is a technique that allows a computer processor to speculatively pre-process instructions during cache miss cycles. The pre-processed instructions are used to generate instruction and data stream prefetches by executing instructions leading to cache misses before they would normally occur, effectively hiding memory latency. In runahead, the processor uses the idle execution resources to calculate instruction and data stream addresses using the available information that is independent of a cache miss. Once the processor has resolved the initial cache miss, all runahead results are discarded, and the processor resumes execution as normal. The primary use case of the technique is to mitigate the effects of the memory wall. The technique may also be used for other purposes, such as pre-computing branch outcomes to achieve highly accurate branch prediction.

Return-oriented programming (ROP) is a computer security exploit technique that allows an attacker to execute code in the presence of security defenses such as executable space protection and code signing.

<span class="mw-page-title-main">AArch64</span> 64-bit extension of the ARM architecture

AArch64 or ARM64 is the 64-bit extension of the ARM architecture family. It was first introduced with the Armv8-A architecture, and had many extension updates.

Latency oriented processor architecture is the microarchitecture of a microprocessor designed to serve a serial computing thread with a low latency. This is typical of most central processing units (CPU) being developed since the 1970s. These architectures, in general, aim to execute as many instructions as possible belonging to a single serial thread, in a given window of time; however, the time to execute a single instruction completely from fetch to retire stages may vary from a few cycles to even a few hundred cycles in some cases. Latency oriented processor architectures are the opposite of throughput-oriented processors which concern themselves more with the total throughput of the system, rather than the service latencies for all individual threads that they work on.

Meltdown is one of the two original transient execution CPU vulnerabilities. Meltdown affects Intel x86 microprocessors, IBM POWER processors, and some ARM-based microprocessors. It allows a rogue process to read all memory, even when it is not authorized to do so.

Lazy FPU state leak, also referred to as Lazy FP State Restore or LazyFP, is a security vulnerability affecting Intel Core CPUs. The vulnerability is caused by a combination of flaws in the speculative execution technology present within the affected CPUs and how certain operating systems handle context switching on the floating point unit (FPU). By exploiting this vulnerability, a local process can leak the content of the FPU registers that belong to another process. This vulnerability is related to the Spectre and Meltdown vulnerabilities that were publicly disclosed in January 2018.

Transient execution CPU vulnerabilities are vulnerabilities in a computer system in which a speculative execution optimization implemented in a microprocessor is exploited to leak secret data to an unauthorized party. The archetype is Spectre, and transient execution attacks like Spectre belong to the cache-attack category, one of several categories of side-channel attacks. Since January 2018 many different cache-attack vulnerabilities have been identified.

The ARM Cortex-A77 is a central processing unit implementing the ARMv8.2-A 64-bit instruction set designed by ARM Holdings' Austin design centre. ARM announced an increase of 23% and 35% in integer and floating point performance, respectively. Memory bandwidth increased 15% relative to the A76.

SWAPGS, also known as Spectre variant 1, is a computer security vulnerability that utilizes the branch prediction used in modern microprocessors. Most processors use a form of speculative execution, this feature allows the processors to make educated guesses about the instructions that will most likely need to be executed in the near future. This speculation can leave traces in the cache, which attackers use to extract data using a timing attack, similar to side-channel exploitation of Spectre.

References

"PACMAN security vulnerability". developer.arm.com. Archived from the original on 2023-11-16.
"Pointer Authentication on ARMv8.3" (PDF). qualcomm.com.
Heinrich, Joe (1996). MIPS R10000 Microprocessor User's Manual (2.0 ed.). MIPS Technologies.
Page, Carly (2022-06-10). "MIT researchers uncover 'unpatchable' flaw in Apple M1 chips". TechCrunch.
Tomasulo, R. M. (January 1967). "An Efficient Algorithm for Exploiting Multiple Arithmetic Units". IBM Journal of Research and Development. 11 (1): 25–33. doi:10.1147/rd.111.0025. ISSN 0018-8646.
Ravichandran, Joseph; Na, Weon Taek; Lang, Jay; Yan, Mengjia (July 2023). "PACMAN: Attacking ARM Pointer Authentication With Speculative Execution". IEEE Micro. 43 (4): 11–18. doi:10.1109/MM.2023.3273189. ISSN 0272-1732.
Ravichandran, Joseph; Na, Weon Taek; Lang, Jay; Yan, Mengjia (18 June 2022). PACMAN: attacking ARM pointer authentication with speculative execution. Annual International Symposium on Computer Architecture. ACM. pp. 685–698. doi: 10.1145/3470496.3527429 . hdl: 1721.1/146470 . ISBN 978-1-4503-8610-4.
Liu, Fangfei; Yarom, Yuval; Ge, Qian; Heiser, Gernot; Lee, Ruby B. (May 2015). Last-Level Cache Side-Channel Attacks are Practical. IEEE Symposium on Security and Privacy. IEEE. pp. 605–622. doi: 10.1109/SP.2015.43 . ISBN 978-1-4673-6949-7.
Lee, Smith (January 1984). "Branch Prediction Strategies and Branch Target Buffer Design". Computer. 17 (1): 6–22. doi:10.1109/MC.1984.1658927. ISSN 0018-9162.

External links

Official website

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] Stylized PACMAN.

[FOOTNOTEPacman_advisory2023-2] Pacman advisory 2023.

[FOOTNOTERavichandranNaLangYan202220–21-3] Ravichandran et al. 2022, pp. 20–21.

[FOOTNOTEPage2022-4] Page 2022.

[FOOTNOTERavichandranNaLangYan202312-5] Ravichandran et al. 2023, p. 12.

[FOOTNOTEPAC_ISA_Reference2023-6] PAC ISA Reference 2023.

[FOOTNOTERavichandranNaLangYan2022-7] Ravichandran et al. 2022.

[FOOTNOTETomasulo1967-8] Tomasulo 1967.

[FOOTNOTEHeinrich199613–14,_21–22-9] Heinrich 1996, pp. 13–14, 21–22.

[FOOTNOTELiuYaromGeHeiser2015-10] Liu et al. 2015.

[FOOTNOTERavichandranNaLangYan2022689-11] Ravichandran et al. 2022, p. 689.

[FOOTNOTERavichandranNaLangYan2022695-12] 1 2 3 Ravichandran et al. 2022, p. 695.

[FOOTNOTELiuYaromGeHeiser2015607-13] Liu et al. 2015, p. 607.

[FOOTNOTELee198416-14] Lee 1984, p. 16.

[lower-alpha 1]

[1]

[2]

[3]

[4]

[5]

[6]

[7]

[8]

[9]

[10]

[11]

[12]

[13]