GoFetch

Last updated
GoFetch
Gofetch.svg
The GoFetch project logo
Date discovered20 March 2024;11 days ago (2024-03-20) [1]
DiscovererBoru Chen, Yingchen Wang, Pradyumna Shome, Christopher W. Fletcher, David Kohlbrenner, Riccardo Paccagnella, Daniel Genkin
Affected hardware Apple silicon CPUs (M1, M2, M3 and A14)
Website https://gofetch.fail/

GoFetch is a family of cryptographic attacks on recent Apple silicon CPUs that exploits the CPU's on-chip data memory-dependent prefetcher (DMP) to investigate the contents of memory. [2] [1] CPUs affected include the M1, M2, M3 and A14 series system-on-a-chip processors. [1]

The DMP looks at cache memory content for possible pointer values, and prefetches the data at those locations into cache if it sees memory access patterns that suggest following those pointers would be useful. [3] [4] The GoFetch attacks use those speculative cache fetches to undermine a number of different cryptographic algorithms by using memory access timings to exfiltrate data from those algorithms using timing attacks.

The authors of GoFetch state that they were unable to make their exploit work on the Intel Raptor Lake processor they tested due to its more limited DMP functionality. [1]

Related Research Articles

<span class="mw-page-title-main">Central processing unit</span> Central computer component which executes instructions

A central processing unit (CPU) is the most important processor in a given computer. Its electronic circuitry executes instructions of a computer program, such as arithmetic, logic, controlling, and input/output (I/O) operations. This role contrasts with that of external components, such as main memory and I/O circuitry, and specialized coprocessors such as graphics processing units (GPUs).

<span class="mw-page-title-main">Cache (computing)</span> Additional storage that enables faster access to main storage

In computing, a cache is a hardware or software component that stores data so that future requests for that data can be served faster; the data stored in a cache might be the result of an earlier computation or a copy of data stored elsewhere. A cache hit occurs when the requested data can be found in a cache, while a cache miss occurs when it cannot. Cache hits are served by reading data from the cache, which is faster than recomputing a result or reading from a slower data store; thus, the more requests that can be served from the cache, the faster the system performs.

<span class="mw-page-title-main">Non-uniform memory access</span> Computer memory design used in multiprocessing

Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can access its own local memory faster than non-local memory. The benefits of NUMA are limited to particular workloads, notably on servers where the data is often associated strongly with certain tasks or users.

<span class="mw-page-title-main">Harvard architecture</span> Computer architecture where code and data each have a separate bus

The Harvard architecture is a computer architecture with separate storage and signal pathways for instructions and data. It is often contrasted with the von Neumann architecture, where program instructions and data share the same memory and pathways.

In cryptography, a timing attack is a side-channel attack in which the attacker attempts to compromise a cryptosystem by analyzing the time taken to execute cryptographic algorithms. Every logical operation in a computer takes time to execute, and the time can differ based on the input; with precise measurements of the time for each operation, an attacker can work backwards to the input. Finding secrets through timing information may be significantly easier than using cryptanalysis of known plaintext, ciphertext pairs. Sometimes timing information is combined with cryptanalysis to increase the rate of information leakage.

In computer security, a side-channel attack is any attack based on extra information that can be gathered because of the fundamental way a computer protocol or algorithm is implemented, rather than flaws in the design of the protocol or algorithm itself or minor, but potentially devastating, mistakes or oversights in the implementation. Timing information, power consumption, electromagnetic leaks, and sound are examples of extra information which could be exploited to facilitate side-channel attacks.

A CPU cache is a hardware cache used by the central processing unit (CPU) of a computer to reduce the average cost to access data from the main memory. A cache is a smaller, faster memory, located closer to a processor core, which stores copies of the data from frequently used main memory locations. Most CPUs have a hierarchy of multiple cache levels, with different instruction-specific and data-specific caches at level 1. The cache memory is typically implemented with static random-access memory (SRAM), in modern CPUs by far the largest part of them by chip area, but SRAM is not always used for all levels, or even any level, sometimes some latter or all levels are implemented with eDRAM.

<span class="mw-page-title-main">Microarchitecture</span> Component of computer engineering

In electronics, computer science and computer engineering, microarchitecture, also called computer organization and sometimes abbreviated as µarch or uarch, is the way a given instruction set architecture (ISA) is implemented in a particular processor. A given ISA may be implemented with different microarchitectures; implementations may vary due to different goals of a given design or due to shifts in technology.

<span class="mw-page-title-main">Hardware acceleration</span> Specialized computer hardware

Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently when compared to software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both.

Computer security compromised by hardware failure is a branch of computer security applied to hardware. The objective of computer security includes protection of information and property from theft, corruption, or natural disaster, while allowing the information and property to remain accessible and productive to its intended users. Such secret information could be retrieved by different ways. This article focus on the retrieval of data thanks to misused hardware or hardware failure. Hardware could be misused or exploited to get secret data. This article collects main types of attack that can lead to data theft.

Heterogeneous computing refers to systems that use more than one kind of processor or core. These systems gain performance or energy efficiency not just by adding the same type of processors, but by adding dissimilar coprocessors, usually incorporating specialized processing capabilities to handle particular tasks.

Intel Software Guard Extensions (SGX) is a set of instruction codes implementing trusted execution environment that are built into some Intel central processing units (CPUs). They allow user-level and operating system code to define protected private regions of memory, called enclaves. SGX is designed to be useful for implementing secure remote computation, secure web browsing, and digital rights management (DRM). Other applications include concealment of proprietary algorithms and of encryption keys.

Cache prefetching is a technique used by computer processors to boost execution performance by fetching instructions or data from their original storage in slower memory to a faster local memory before it is actually needed. Most modern computer processors have fast and local cache memory in which prefetched data is held until it is required. The source for the prefetch operation is usually main memory. Because of their design, accessing cache memories is typically much faster than accessing main memory, so prefetching data and then accessing it from caches is usually many orders of magnitude faster than accessing it directly from main memory. Prefetching can be done with non-blocking cache control instructions.

<span class="mw-page-title-main">Meltdown (security vulnerability)</span> Microprocessor security vulnerability

Meltdown is one of the two original transient execution CPU vulnerabilities. Meltdown affects Intel x86 microprocessors, IBM POWER processors, and some ARM-based microprocessors. It allows a rogue process to read all memory, even when it is not authorized to do so.

<span class="mw-page-title-main">Spectre (security vulnerability)</span> Processor security vulnerability

Spectre is one of the two original transient execution CPU vulnerabilities, which involve microarchitectural timing side-channel attacks. These affect modern microprocessors that perform branch prediction and other forms of speculation. On most processors, the speculative execution resulting from a branch misprediction may leave observable side effects that may reveal private data to attackers. For example, if the pattern of memory accesses performed by such speculative execution depends on private data, the resulting state of the data cache constitutes a side channel through which an attacker may be able to extract information about the private data using a timing attack.

The ARM Cortex-A77 is a central processing unit implementing the ARMv8.2-A 64-bit instruction set designed by ARM Holdings' Austin design centre. ARM announced an increase of 23% and 35% in integer and floating point performance, respectively. Memory bandwidth increased 15% relative to the A76.

<span class="mw-page-title-main">Apple M1</span> Series of systems-on-a-chip designed by Apple Inc.

Apple M1 is a series of ARM-based system-on-a-chip (SoC) designed by Apple Inc. as a central processing unit (CPU) and graphics processing unit (GPU) for its Mac desktops and notebooks, and the iPad Pro and iPad Air tablets. The M1 chip initiated Apple's third change to the instruction set architecture used by Macintosh computers, switching from Intel to Apple silicon 14 years after they were switched from PowerPC to Intel, and 26 years after the transition from the original Motorola 68000 series to PowerPC. At the time of introduction in 2020, Apple said that the M1 had the world's fastest CPU core "in low power silicon" and the world's best CPU performance per watt. Its successor, Apple M2, was announced on June 6, 2022, at Worldwide Developers Conference (WWDC).

<span class="mw-page-title-main">Apple M2</span> System on a Chip (SoC) designed by Apple Inc.

Apple M2 is a series of ARM-based system on a chip (SoC) designed by Apple Inc. as a central processing unit (CPU) and graphics processing unit (GPU) for its Mac desktops and notebooks, the iPad Pro tablet, and the Vision Pro mixed reality headset. It is the second generation of ARM architecture intended for Apple's Mac computers after switching from Intel Core to Apple silicon, succeeding the M1. Apple announced the M2 on June 6, 2022, at Worldwide Developers Conference (WWDC), along with models of the MacBook Air and the 13-inch MacBook Pro using the M2. The M2 is made with TSMC's "Enhanced 5-nanometer technology" N5P process and contains 20 billion transistors, a 25% increase from the M1. Apple claims CPU improvements up to 18% and GPU improvements up to 35% compared to the M1.

<span class="mw-page-title-main">Pacman (security vulnerability)</span> Processor security vulnerability

Pacman is a side-channel vulnerability in certain ARM CPUs that was made public by Massachusetts Institute of Technology security researchers on June 10, 2021. It affects the pointer authentication (PAC) mechanism in many ARMv8.3 chips, including Apple's M1 CPU. Pacman creates an 'oracle' that lets an attacker guess a pointer's PAC signature without crashing the program if the guess is wrong. PAC signatures are typically less than 16 bits wide, so an attacker can use the oracle to guess the signature in 216 tries or fewer. It is unfixable without hardware changes because it is caused by the inherent design of CPU caches and branch predictors.

A data memory-dependent prefetcher (DMP) is a cache prefetcher that looks at cache memory content for possible pointer values, and prefetches the data at those locations into cache if it sees memory access patterns that suggest following those pointers would be useful.

References

  1. 1 2 3 4 "GoFetch: Breaking Constant-Time Cryptographic Implementations Using Data Memory-Dependent Prefetchers". gofetch.fail. Retrieved 2024-03-22.
  2. "Apple Silicon chip flaw can leak encryption keys, say researchers". AppleInsider. 2024-03-21. Retrieved 2024-03-22.
  3. "Augury: Using Data Memory-Dependent Prefetchers to Leak Data at Rest". www.prefetchers.info. 2022-05-02. Retrieved 2024-03-30.
  4. Vicarte, Jose Rodrigo Sanchez; Flanders, Michael; Paccagnella, Riccardo; Garrett-Grossman, Grant; Morrison, Adam; Fletcher, Christopher W.; Kohlbrenner, David (May 2022). Augury: Using Data Memory-Dependent Prefetchers to Leak Data at Rest. 2022 IEEE Symposium on Security and Privacy (SP). San Francisco, CA, USA: IEEE. pp. 1491–1505. doi:10.1109/SP46214.2022.9833570. ISBN   978-1-6654-1316-9.