Babak Falsafi


Babak Falsafi is a Swiss computer scientist specializing in computer architecture and digital platform design. He is the founding director of EcoCloud at EPFL, an industrial/academic consortium investigating efficient and intelligent data-centric technologies, and a professor in the School of Computer and Communication Sciences at EPFL. Previously he was a professor of electrical and computer engineering at Carnegie Mellon University and an assistant professor of electrical and computer engineering at Purdue University. He holds bachelor's degrees in computer science and in electrical and computer engineering (both summa cum laude, with distinction) from SUNY Buffalo, and a master's degree and a PhD in computer science from the University of Wisconsin–Madison.

He has made numerous contributions to computer system design and evaluation. These include a server architecture [1] that laid the foundation for Sun Microsystems' NUMA machines; [2] technologies to minimize power in the memory system, both leakage power in the absence of activity (supply gating) [3] and power in shared memory (snoop filtering), [4] which are prevalent in modern CPUs and multi-socket servers; and memory system accelerators [5] found in modern ARM CPUs for mobile platforms. He has shown that hardware memory consistency models are neither necessary (in the 1990s) [6] nor sufficient (a decade later) [7] to achieve high performance in multiprocessor systems. These results eventually led to fence speculation in modern x86 CPUs. He argued and demonstrated that the slowdown in silicon efficiency scaling (Dennard scaling) and density scaling (Moore's Law) would lead to dark silicon and specialization in servers. [8] These results led to a follow-on study that carefully characterized scale-out workloads on server platforms [9] and laid the foundation for ThunderX, the first generation of Cavium ARM server CPUs.

He is a recipient of an Alfred P. Sloan Research Fellowship. He was named a Fellow of the Institute of Electrical and Electronics Engineers (IEEE) in 2012 [11] for contributions to multiprocessor architecture and memory systems, and an ACM Fellow in 2015 [10] for contributions to multiprocessor and memory architecture design and evaluation.

Related Research Articles

<span class="mw-page-title-main">Non-uniform memory access</span> Computer memory design used in multiprocessing

Non-uniform memory access (NUMA) is a computer memory design used in multiprocessing, where the memory access time depends on the memory location relative to the processor. Under NUMA, a processor can access its own local memory faster than non-local memory. NUMA is beneficial for workloads with high memory locality of reference and low lock contention, because a processor may operate on a subset of memory mostly or entirely within its own cache node, reducing traffic on the memory bus.

<span class="mw-page-title-main">John L. Hennessy</span> American computer scientist

John Leroy Hennessy is an American computer scientist who is chairperson of Alphabet Inc. (Google). Hennessy is one of the founders of MIPS Technologies and Atheros, and also the tenth President of Stanford University. Hennessy announced that he would step down in the summer of 2016. He was succeeded as president by Marc Tessier-Lavigne. Marc Andreessen called him "the godfather of Silicon Valley."

<span class="mw-page-title-main">David Patterson (computer scientist)</span> American computer pioneer and academic (born 1947)

David Andrew Patterson is an American computer pioneer and academic who has held the position of professor of computer science at the University of California, Berkeley since 1976. He announced retirement in 2016 after serving nearly forty years, becoming a distinguished software engineer at Google. He currently is vice chair of the board of directors of the RISC-V Foundation, and the Pardee Professor of Computer Science, Emeritus at UC Berkeley.

Cache only memory architecture (COMA) is a computer memory organization for use in multiprocessors in which the local memories at each node are used as cache. This is in contrast to using the local memories as actual main memory, as in NUMA organizations.

Thread-level speculation (TLS), also known as speculative multithreading or speculative parallelization, is a technique to speculatively execute a section of computer code that is anticipated to be executed later, in parallel with the normal execution, on a separate independent thread. Such a speculative thread may need to make assumptions about the values of input variables. If these prove to be invalid, then the portions of the speculative thread that rely on these input variables must be squashed and discarded. If the assumptions are correct, the program can complete in a shorter time, provided the thread could be scheduled efficiently.

Maurice Peter Herlihy is an American computer scientist active in the field of multiprocessor synchronization. Herlihy has contributed to areas including theoretical foundations of wait-free synchronization, linearizable data structures, applications of combinatorial topology to distributed computing, as well as hardware and software transactional memory. He is the An Wang Professor of Computer Science at Brown University, where he has been a member of the faculty since 1994.

<span class="mw-page-title-main">International Symposium on Computer Architecture</span>

The International Symposium on Computer Architecture (ISCA) is an annual academic conference on computer architecture, generally viewed as the top-tier venue in the field. The Association for Computing Machinery's Special Interest Group on Computer Architecture and the Institute of Electrical and Electronics Engineers Computer Society are its technical sponsors.

In cache coherency protocol literature, Write-Once was the first MESI protocol defined. Its optimization is to execute a write-through on the first write to a block and a write-back on all subsequent writes, reducing the overall bus traffic caused by consecutive writes to the computer memory. It was first described by James R. Goodman in 1983. Cache coherence protocols are an important issue in symmetric multiprocessing systems, where each CPU maintains a cache of the memory.
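The write-through-then-write-back behavior described above can be illustrated with a minimal sketch. It models a single cache line as seen from one processor only; the class and attribute names (`CacheLine`, `bus_writes`) are hypothetical, the four state names follow the usual presentation of the protocol, and write misses and snooping by other caches are deliberately left out.

```python
from enum import Enum


class State(Enum):
    INVALID = "invalid"
    VALID = "valid"        # clean copy, possibly shared with other caches
    RESERVED = "reserved"  # written exactly once; memory is still up to date
    DIRTY = "dirty"        # written more than once; memory is stale


class CacheLine:
    """One cache line seen from a single processor (hypothetical model)."""

    def __init__(self):
        self.state = State.INVALID
        self.bus_writes = 0  # writes propagated to memory over the bus

    def read(self):
        # A read miss fetches the block; a read hit leaves the state alone.
        if self.state is State.INVALID:
            self.state = State.VALID

    def write(self):
        if self.state is State.INVALID:
            # Write misses are outside the scope of this sketch.
            raise ValueError("write miss not modeled; call read() first")
        if self.state is State.VALID:
            # First write: write through, so memory sees the new value.
            self.bus_writes += 1
            self.state = State.RESERVED
        else:
            # Later writes (RESERVED or DIRTY): update locally, no bus write.
            self.state = State.DIRTY


line = CacheLine()
line.read()
for _ in range(3):
    line.write()
print(line.state.name, line.bus_writes)  # prints: DIRTY 1
```

Three consecutive writes generate a single bus write in this model, which is the traffic reduction the paragraph refers to.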

In multiprocessor computer systems, software lockout is the issue of performance degradation due to the idle wait times spent by the CPUs in kernel-level critical sections. Software lockout is the major cause of scalability degradation in a multiprocessor system, posing a limit on the maximum useful number of processors. To mitigate the phenomenon, the kernel must be designed to keep its critical sections as short as possible, which means decomposing each data structure into smaller substructures.

<span class="mw-page-title-main">James R. Goodman</span> American computer scientist

James Richard "Jim" Goodman retired as professor of computer science at the University of Auckland in Auckland, New Zealand, and emeritus professor at the University of Wisconsin–Madison.

<span class="mw-page-title-main">SGI Origin 2000</span> Series of server computers

The SGI Origin 2000 is a family of mid-range and high-end server computers developed and manufactured by Silicon Graphics (SGI). They were introduced in 1996 to succeed the SGI Challenge and POWER Challenge. At the time of introduction, these ran the IRIX operating system, originally version 6.4 and later, 6.5. A variant of the Origin 2000 with graphics capability is known as the Onyx2. An entry-level variant based on the same architecture but with a different hardware implementation is known as the Origin 200. The Origin 2000 was succeeded by the Origin 3000 in July 2000, and was discontinued on June 30, 2002.

<span class="mw-page-title-main">Kunle Olukotun</span> British-born Nigerian computer scientist

Oyekunle Ayinde "Kunle" Olukotun is a British-born Nigerian computer scientist who is the Cadence Design Systems Professor of the Stanford School of Engineering, Professor of Electrical Engineering and Computer Science at Stanford University and the director of the Stanford Pervasive Parallelism Lab. Olukotun is known as the “father of the multi-core processor”, and the leader of the Stanford Hydra Chip Multiprocessor research project. Olukotun's achievements include designing the first general-purpose multi-core CPU, innovating single-chip multiprocessor and multi-threaded processor design, and pioneering multicore CPUs and GPUs, transactional memory technology and domain-specific languages programming models. Olukotun's research interests include computer architecture, parallel programming environments and scalable parallel systems, domain specific languages and high-level compilers.

<span class="mw-page-title-main">Tachyon (software)</span>

Tachyon is parallel/multiprocessor ray tracing software. It is a parallel ray tracing library for use on distributed memory parallel computers, shared memory computers, and clusters of workstations. Tachyon implements rendering features such as ambient occlusion lighting, depth-of-field focal blur, shadows, reflections, and others. It was originally developed for the Intel iPSC/860 by John Stone for his M.S. thesis at the University of Missouri-Rolla. Tachyon subsequently became a more functional and complete ray tracing engine, and it is now incorporated into a number of other open source software packages such as VMD and SageMath. Tachyon is released under a permissive license.

Margaret Martonosi is an American computer scientist who is currently the Hugh Trumbull Adams '35 Professor of Computer Science at Princeton University. Martonosi is noted for her research in computer architecture and mobile computing with a particular focus on power-efficiency.

In the electronics industry, dark silicon is the amount of circuitry of an integrated circuit that cannot be powered-on at the nominal operating voltage for a given thermal design power (TDP) constraint.

In computing, energy proportionality is a measure of the relationship between power consumed in a computer system, and the rate at which useful work is done. If the overall power consumption is proportional to the computer's utilization, then the machine is said to be energy proportional. Equivalently stated, for an idealized energy proportional computer, the overall energy per operation is constant for all possible workloads and operating conditions.
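The equivalence between proportional power and constant energy per operation can be checked with a small numeric sketch. The linear power model, the function names, and all wattage and throughput figures below are assumptions chosen for illustration, not measurements of any real machine.

```python
def power_watts(utilization, idle_watts, peak_watts):
    """Power draw under an assumed linear model between idle and peak."""
    return idle_watts + (peak_watts - idle_watts) * utilization


def energy_per_op(utilization, idle_watts, peak_watts, peak_ops_per_sec):
    """Joules spent per operation at a given utilization (0 < utilization <= 1)."""
    ops_per_sec = utilization * peak_ops_per_sec
    return power_watts(utilization, idle_watts, peak_watts) / ops_per_sec


# Idealized energy-proportional machine: no idle power (hypothetical numbers).
ideal_low = energy_per_op(0.1, idle_watts=0.0, peak_watts=200.0, peak_ops_per_sec=1000.0)
ideal_high = energy_per_op(1.0, idle_watts=0.0, peak_watts=200.0, peak_ops_per_sec=1000.0)

# Non-proportional machine: draws half of its peak power even when idle.
real_low = energy_per_op(0.1, idle_watts=100.0, peak_watts=200.0, peak_ops_per_sec=1000.0)
real_high = energy_per_op(1.0, idle_watts=100.0, peak_watts=200.0, peak_ops_per_sec=1000.0)

print(ideal_low, ideal_high)
print(real_low, real_high)
```

Under these assumptions the idealized machine spends the same energy per operation at 10% and 100% utilization, while the machine with nonzero idle power spends several times more per operation when lightly loaded.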

An AI accelerator, deep learning processor or neural processing unit (NPU) is a class of specialized hardware accelerator or computer system designed to accelerate artificial intelligence and machine learning applications, including artificial neural networks and computer vision. Typical applications include algorithms for robotics, Internet of Things, and other data-intensive or sensor-driven tasks. They are often manycore designs and generally focus on low-precision arithmetic, novel dataflow architectures or in-memory computing capability. As of 2024, a typical AI integrated circuit chip contains tens of billions of MOSFETs.

Mark D. Hill is a computer scientist and professor at the University of Wisconsin–Madison. He has been cited over 27,000 times.

Timothy M. Pinkston is an American computer engineer, researcher, educator and administrator whose work is focused in the area of computer architecture. He holds the George Pfleger Chair in Electrical and Computer Engineering and is a Professor of Electrical and Computer Engineering at University of Southern California (USC). He also serves in an administrative role as Vice Dean for Faculty Affairs at the USC Viterbi School of Engineering.

Trevor Mudge is a computer scientist, academic and researcher. He is the Bredt Family Chair of Computer Science and Engineering, and Professor of Electrical Engineering and Computer Science at the University of Michigan.

References

  1. Falsafi, Babak; Wood, David A. (1997). "Reactive NUMA: A design for unifying S-COMA and CC-NUMA". ACM SIGARCH Computer Architecture News. 25 (2): 229–240. doi:10.1145/384286.264205.
  2. Hagersten, Eric; Koster, Michael (1999). "WildFire: A scalable path for SMPs". Proceedings Fifth International Symposium on High-Performance Computer Architecture. HPCA '99. pp. 172–181. CiteSeerX 10.1.1.21.8882. doi:10.1109/HPCA.1999.744361. ISBN 9780769500041. S2CID 13915228.
  3. Powell, Michael; Yang, Se-Hyun; Falsafi, Babak; Roy, Kaushik; Vijaykumar, T. N. (2000). "Gated-Vdd". Proceedings of the 2000 international symposium on Low power electronics and design - ISLPED '00. pp. 90–95. doi:10.1145/344166.344526. ISBN 9781581131901. S2CID 3241652.
  4. Moshovos, Andreas; Memik, Gokhan; Falsafi, Babak; Choudhary, Alok (2001). "JETTY: Filtering snoops for reduced energy consumption in SMP servers" (PDF). Proceedings HPCA Seventh International Symposium on High-Performance Computer Architecture. pp. 85–96. doi:10.1109/HPCA.2001.903254. ISBN 0-7695-1019-1. S2CID 6811066.
  5. Somogyi, Stephen; Wenisch, Thomas F.; Ailamaki, Anastasia; Falsafi, Babak; Moshovos, Andreas (2006). "Spatial Memory Streaming". ACM SIGARCH Computer Architecture News. 34 (2): 252–263. doi:10.1145/1150019.1136508. S2CID 3552110.
  6. Gniady, Chris; Falsafi, Babak; Vijaykumar, T. N. (1999). "Is SC + ILP = RC?". ACM SIGARCH Computer Architecture News. 27 (2): 162–171. doi:10.1145/307338.300993.
  7. Wenisch, Thomas F.; Ailamaki, Anastasia; Falsafi, Babak; Moshovos, Andreas (2007). "Mechanisms for store-wait-free multiprocessors". Proceedings of the 34th annual international symposium on Computer architecture (PDF). ISCA '07. Vol. 35. pp. 266–277. doi:10.1145/1250662.1250696. ISBN 9781595937063. S2CID 1024059.
  8. Hardavellas, Nikos; Ferdman, Michael; Falsafi, Babak; Ailamaki, Anastasia (2011). "Toward Dark Silicon in Servers". IEEE Micro. 31 (4): 6–15. doi:10.1109/MM.2011.77. S2CID 2765349.
  9. Ferdman, Michael; Adileh, Almutaz; Kocberber, Onur; Volos, Stavros; Alisafaee, Mohammad; Jevdjic, Djordje; Kaynak, Cansu; Popescu, Adrian Daniel; Ailamaki, Anastasia; Falsafi, Babak (March 2012). "Clearing the clouds: A study of emerging scale-out workloads on modern hardware". ACM SIGARCH Computer Architecture News. 40 (1): 37–48. doi:10.1145/2189750.2150982.
  10. Falsafi, Babak. "ACM Fellows".
  11. "2012 elevated fellow" (PDF). IEEE Fellows Directory. Archived from the original (PDF) on February 15, 2012.