Uzi Vishkin

Uzi Vishkin
Uzi Vishkin
Born	1953; Tel Aviv, Israel
Alma mater	Hebrew University ; Technion
	Scientific career
Fields	parallel algorithms
Institutions	IBM Thomas J. Watson Research Center ; New York University ; Tel Aviv University ; University of Maryland, College Park
Doctoral advisor	Yossi Shiloach

Last updated January 01, 2025

Uzi Vishkin (born 1953) is a computer scientist at the University of Maryland, College Park, where he is Professor of Electrical and Computer Engineering at the University of Maryland Institute for Advanced Computer Studies (UMIACS). Uzi Vishkin is known for his work in the field of parallel computing. In 1996, he was inducted as a Fellow of the Association for Computing Machinery, with the following citation: "One of the pioneers of parallel algorithms research, Dr. Vishkin's seminal contributions played a leading role in forming and shaping what thinking in parallel has come to mean in the fundamental theory of Computer Science."^[1]

Biography

Uzi Vishkin was born in Tel Aviv, Israel. He completed his B.Sc. (1974) and M.Sc. in Mathematics at the Hebrew University, before earning his D.Sc. in Computer Science at the Technion (1981). He then spent a year working at the IBM Thomas J. Watson Research Center in Yorktown Heights, New York. From 1982 to 1984, he worked at the department of computer science at New York University and remained affiliated with it till 1988. From 1984 until 1997 he worked in the computer science department of Tel Aviv University, serving as its chair from 1987 to 1988. Since 1988 he is with the University of Maryland, College Park.

PRAM-on-chip

A notable rudimentary abstraction—that any single instruction available for execution in a serial program executes immediately—made serial computing simple. A consequence of this abstraction is a step-by-step (inductive) explication of the instruction available next for execution. The rudimentary parallel abstraction behind the PRAM-on-chip concept, dubbed Immediate Concurrent Execution (ICE) in Vishkin (2011), is that indefinitely many instructions available for concurrent execution execute immediately. A consequence of ICE is a step-by-step (inductive) explication (also known as lock-step) of the instructions available next for concurrent execution. Moving beyond the serial von Neumann computer (the only successful general purpose platform to date), the aspiration of the PRAM-on-chip concept is that computer science will again be able to augment mathematical induction with a simple one-line computing abstraction. A chronological overview of the evolution of the PRAM-on-chip concept and its hardware and software prototyping follow. In the 1980s and 1990s, Uzi Vishkin co-authored several articles that helped building a theory of parallel algorithms in a mathematical model called parallel random access machine (PRAM), which is a generalization for parallel computing of the standard serial computing model random-access machine (RAM). The parallel machines needed for implementing the PRAM model have not yet been built at the time, and quite a few challenged the ability to ever build such machines. Concluding in 1997^[2] that the transistor count on chip as implied by Moore's Law will allow building a powerful parallel computer on a single silicon chip within a decade, he developed a PRAM-On-Chip vision that called for building a parallel computer on a single chip that allows programmers to develop their algorithms for the PRAM model. He went on to invent the explicit multi-threaded (XMT) computer architecture that enables implementation of this PRAM theory, and led his research team to completing in January 2007 a 64-processor computer^[3] named Paraleap,^[4] that demonstrates the overall concept. The XMT concept was presented in Vishkin et al. (1998), Naishlos et al. (2003), the XMT 64-processor computer in Wen & Vishkin (2008), in Vishkin (2011) and most recently in Ghanim, Vishkin & Barua (2018), where it was shown that lock-step parallel programming (using ICE) can achieve the same performance as the fastest hand-tuned multi-threaded code on XMT systems. Such inductive lock-step approach stands in contrast to multi-threaded programming approaches of other many core systems that are known for challenging programmers. The demonstration of XMT comprised several hardware and software components, as well as teaching PRAM algorithms in order to program the XMT Paraleap, using a language called XMTC. Since making parallel programming easy is one of the biggest challenges facing computer science today, the demonstration also sought to include teaching the basics of PRAM algorithms and XMTC programming to students ranging from high-school to graduate school.

Following his XMT related inventions, Uzi Vishkin was named 2024 Fellow of the National_Academy_of_Inventors (NAI). A University of Maryland announcement ^[5] noted: "Two of Vishkin's 2005 patents integrating parallel processing accelerators into the CPU, or "brain” of the computer, led computer design into a new era. The best-known example is CPUs coupled with integrated graphics processing units, present in well over a billion devices including desktop and laptop computers built since the 2010s".

Parallel algorithms

In the field of parallel algorithms, Uzi Vishkin co-authored the paper Shiloach & Vishkin (1982b) that contributed the work-time (WT) (sometimes called work-depth) framework for conceptualizing and describing parallel algorithms. The WT framework was adopted as the basic presentation framework in the parallel algorithms books JaJa (1992) and Keller, Kessler & Traeff (2001), as well as in the class notes Vishkin (2009). In the WT framework, a parallel algorithm is first described in terms of parallel rounds. For each round, the operations to be performed are characterized, but several issues can be suppressed. For example, the number of operations at each round need not be clear, processors need not be mentioned and any information that may help with the assignment of processors to jobs need not be accounted for. Second, the suppressed information is provided. The inclusion of the suppressed information is, in fact, guided by the proof of a scheduling theorem due to Brent (1974). The WT framework is useful since while it can greatly simplify the initial description of a parallel algorithm, inserting the details suppressed by that initial description is often not very difficult. Similarly, first casting an algorithm in the WT framework can be very helpful for programming it in XMTC. Vishkin (2011) explains the simple connection between the WT framework and the more rudimentary ICE abstraction noted above.

In the field of parallel and distributed algorithms, one of the seminal papers co-authored by Uzi Vishkin is Cole & Vishkin (1986). This work introduced an efficient parallel technique for graph coloring. The Cole–Vishkin algorithm finds a vertex colouring in an n-cycle in O(log^* n) synchronous communication rounds. This algorithm is nowadays presented in many textbooks, including Introduction to Algorithms by Cormen et al.,^[6] and it forms the basis of many other distributed algorithms for graph colouring.^[7]

Other contributions by Uzi Vishkin and various co-authors include parallel algorithms for list ranking, lowest common ancestor, spanning trees, and biconnected components.

Selected publications

Shiloach, Yossi; Vishkin, Uzi (1982a), "An O(log n) parallel connectivity algorithm", Journal of Algorithms, 3: 57–67, doi:10.1016/0196-6774(82)90008-6 .
Shiloach, Yossi; Vishkin, Uzi (1982b), "An O(n² log n) parallel max-flow algorithm", Journal of Algorithms, 3 (2): 128–146, doi:10.1016/0196-6774(82)90013-X .
Mehlhorn, Kurt; Vishkin, Uzi (1984), "Randomized and deterministic simulations of PRAMs by parallel machines with restricted granularity of parallel memories", Acta Informatica, 21 (4): 339–374, doi:10.1007/BF00264615, S2CID 29789494 .
Tarjan, Robert; Vishkin, Uzi (1985), "An efficient parallel biconnectivity algorithm", SIAM Journal on Computing , 14 (4): 862–874, CiteSeerX 10.1.1.465.8898 , doi:10.1137/0214061, S2CID 7231609 .
Vishkin, Uzi (1985), "Optimal parallel pattern matching in strings", Information and Control, 67 (1–3): 91–113, doi: 10.1016/S0019-9958(85)80028-0 .
Cole, Richard; Vishkin, Uzi (1986), "Deterministic coin tossing with applications to optimal parallel list ranking", Information and Control, 70 (1): 32–53, doi: 10.1016/S0019-9958(86)80023-7 .
Vishkin, Uzi; Dascal, Shlomit; Berkovich, Efraim; Nuzman, Joseph (1998), "Explicit Multi-Threading (XMT) bridging models for instruction parallelism", Proc. 1998 ACM Symposium on Parallel Algorithms and Architectures (SPAA), pp. 140–151.
Naishlos, Dorit; Nuzman, Joseph; Tseng, Chau-Wen; Vishkin, Uzi (2003), "Towards a First Vertical Prototyping of an Extremely Fine-Grained Parallel Programming Approach" (PDF), Theory of Computing Systems, 36 (5): 551–552, doi:10.1007/s00224-003-1086-6, S2CID 1929495 .
Wen, Xingzhi; Vishkin, Uzi (2008), "FPGA-based prototype of a PRAM-on-chip processor", Proc. 2008 ACM Conference on Computing Frontiers (Ischia, Italy) (PDF), pp. 55–66, doi:10.1145/1366230.1366240, ISBN 978-1-60558-077-7, S2CID 11557669 .
Vishkin, Uzi (January 2011), "Using simple abstraction to reinvent computing for parallelism", Communications of the ACM, 54 (1): 75–85, doi:10.1145/1866739.1866757, S2CID 10279904 .
Ghanim, Fady; Vishkin, Uzi; Barua, Rajeev (February 2018), "Easy PRAM-Based High-Performance Parallel Programming with ICE", IEEE Transactions on Parallel and Distributed Systems, 29 (2): 377–390, doi: 10.1109/TPDS.2017.2754376 , hdl: 1903/18521 .

Notes

↑ ACM: Fellows Award / Uzi Vishkin.
↑ Vishkin, Uzi. Spawn-join instruction set architecture for providing explicit multithreading. U.S. Patent 6,463,527. See also Vishkin et al. (1998).
↑ University of Maryland, press release, June 26, 2007: "Maryland Professor Creates Desktop Supercomputer" Archived 2009-12-14 at the Wayback Machine .
↑ University of Maryland, A. James Clark School of Engineering, press release, November 28, 2007: "Next Big "Leap" in Computing Technology Gets a Name".
↑ Maryland Today, December 10, 2024: "2 Maryland Engineers Named to National Academy of Inventors: Tao, Vishkin Honored for ‘Creating Tangible Impacts’".
↑ 1st ed., Section 30.5.
↑ See, e.g., Goldberg, Plotkin & Shannon (1988).

Related Research Articles

Distributed computing is a field of computer science that studies distributed systems, defined as computer systems whose inter-communicating components are located on different networked computers.

Very long instruction word (VLIW) refers to instruction set architectures that are designed to exploit instruction-level parallelism (ILP). A VLIW processor allows programs to explicitly specify instructions to execute in parallel, whereas conventional central processing units (CPUs) mostly allow programs to specify instructions to execute in sequence only. VLIW is intended to allow higher performance without the complexity inherent in some other designs.

Parallel computing is a type of computation in which many calculations or processes are carried out simultaneously. Large problems can often be divided into smaller ones, which can then be solved at the same time. There are several different forms of parallel computing: bit-level, instruction-level, data, and task parallelism. Parallelism has long been employed in high-performance computing, but has gained broader interest due to the physical constraints preventing frequency scaling. As power consumption by computers has become a concern in recent years, parallel computing has become the dominant paradigm in computer architecture, mainly in the form of multi-core processors.

In computer science, a parallel algorithm, as opposed to a traditional serial algorithm, is an algorithm which can do multiple operations in a given time. It has been a tradition of computer science to describe serial algorithms in abstract machine models, often the one known as random-access machine. Similarly, many computer science researchers have used a so-called parallel random-access machine (PRAM) as a parallel abstract machine (shared-memory).

Instruction-level parallelism (ILP) is the parallel or simultaneous execution of a sequence of instructions in a computer program. More specifically, ILP refers to the average number of instructions run per step of this parallel execution.

In computer science, a parallel random-access machine is a shared-memory abstract machine. As its name indicates, the PRAM is intended as the parallel-computing analogy to the random-access machine (RAM). In the same way that the RAM is used by sequential-algorithm designers to model algorithmic performance, the PRAM is used by parallel-algorithm designers to model parallel algorithmic performance. Similar to the way in which the RAM model neglects practical issues, such as access time to cache memory versus main memory, the PRAM model neglects such issues as synchronization and communication, but provides any (problem-size-dependent) number of processors. Algorithm cost, for instance, is estimated using two parameters O(time) and O(time × processor_number).

Dataflow architecture is a dataflow-based computer architecture that directly contrasts the traditional von Neumann architecture or control flow architecture. Dataflow architectures have no program counter, in concept: the executability and execution of instructions is solely determined based on the availability of input arguments to the instructions, so that the order of instruction execution may be hard to predict.

<span class="mw-page-title-main">Hardware acceleration</span> Specialized computer hardware

Hardware acceleration is the use of computer hardware designed to perform specific functions more efficiently when compared to software running on a general-purpose central processing unit (CPU). Any transformation of data that can be calculated in software running on a generic CPU can also be calculated in custom-made hardware, or in some mix of both.

In computing, a parallel programming model is an abstraction of parallel computer architecture, with which it is convenient to express algorithms and their composition in programs. The value of a programming model can be judged on its generality: how well a range of different problems can be expressed for a variety of different architectures, and its performance: how efficiently the compiled programs can execute. The implementation of a parallel programming model can take the form of a library invoked from a programming language, as an extension to an existing languages.

In parallel algorithms, the list ranking problem involves determining the position, or rank, of each item in a linked list. That is, the first item in the list should be assigned the number 1, the second item in the list should be assigned the number 2, etc. Although it is straightforward to solve this problem efficiently on a sequential computer, by traversing the list in order, it is more complicated to solve in parallel. As Anderson & Miller (1990) wrote, the problem was viewed as important in the parallel algorithms community both for its many applications and because solving it led to many important ideas that could be applied in parallel algorithms more generally.

In computer science, the prefix sum, cumulative sum, inclusive scan, or simply scan of a sequence of numbers $x 0, x 1, x 2, ...$ is a second sequence of numbers $y 0, y 1, y 2, ...$ , the sums of prefixes of the input sequence:

Thread Level Speculation (TLS), also known as Speculative Multi-threading, or Speculative Parallelization, is a technique to speculatively execute a section of computer code that is anticipated to be executed later in parallel with the normal execution on a separate independent thread. Such a speculative thread may need to make assumptions about the values of input variables. If these prove to be invalid, then the portions of the speculative thread that rely on these input variables will need to be discarded and squashed. If the assumptions are correct the program can complete in a shorter time provided the thread was able to be scheduled efficiently.

Data parallelism is parallelization across multiple processors in parallel computing environments. It focuses on distributing the data across different nodes, which operate on the data in parallel. It can be applied on regular data structures like arrays and matrices by working on each element in parallel. It contrasts to task parallelism as another form of parallelism.

XMTC is a shared-memory parallel programming language. It is an extension of the C programming language which strives to enable easy PRAM-like programming based on the explicit multi-threading paradigm. It is developed as part of the XMT PRAM-On-Chip vision by a research team at the University of Maryland, College Park, led by Dr. Uzi Vishkin.

Kurt Mehlhorn is a German theoretical computer scientist. He has been a vice president of the Max Planck Society and is director of the Max Planck Institute for Computer Science.

Explicit Multi-Threading (XMT) is a computer science paradigm for building and programming parallel computers designed around the parallel random-access machine (PRAM) parallel computational model. A more direct explanation of XMT starts with the rudimentary abstraction that made serial computing simple: that any single instruction available for execution in a serial program executes immediately. A consequence of this abstraction is a step-by-step (inductive) explication of the instruction available next for execution. The rudimentary parallel abstraction behind XMT, dubbed Immediate Concurrent Execution (ICE) in Vishkin (2011), is that indefinitely many instructions available for concurrent execution execute immediately. A consequence of ICE is a step-by-step (inductive) explication of the instructions available next for concurrent execution. Moving beyond the serial von Neumann computer, the aspiration of XMT is that computer science will again be able to augment mathematical induction with a simple one-line computing abstraction.

In computing, algorithmic skeletons, or parallelism patterns, are a high-level parallel programming model for parallel and distributed computing.

In parallel computing, work stealing is a scheduling strategy for multithreaded computer programs. It solves the problem of executing a dynamically multithreaded computation, one that can "spawn" new threads of execution, on a statically multithreaded computer, with a fixed number of processors. It does so efficiently in terms of execution time, memory usage, and inter-processor communication.

Pointer jumping or path doubling is a design technique for parallel algorithms that operate on pointer structures, such as linked lists and directed graphs. Pointer jumping allows an algorithm to follow paths with a time complexity that is logarithmic with respect to the length of the longest path. It does this by "jumping" to the end of the path computed by neighbors.

In computer science, the analysis of parallel algorithms is the process of finding the computational complexity of algorithms executed in parallel – the amount of time, storage, or other resources needed to execute them. In many respects, analysis of parallel algorithms is similar to the analysis of sequential algorithms, but is generally more involved because one must reason about the behavior of multiple cooperating threads of execution. One of the primary goals of parallel analysis is to understand how a parallel algorithm's use of resources changes as the number of processors is changed.

References

Baase, Sara; Van Gelder, Allen (2000), Computer Algorithms Introduction to Design and Analysis (Third ed.), Addison-Wesley, ISBN 978-0-201-61244-8
Brent, Richard P. (1974), "The parallel evaluation of general arithmetic expressions", Journal of the ACM, 21 (2): 201–208, doi: 10.1145/321812.321815 , S2CID 16416106 .
Cormen, Thomas H.; Leiserson, Charles E.; Rivest, Ronald L. (1990), Introduction to Algorithms (First ed.), MIT Press and McGraw-Hill, ISBN 978-0-262-03141-7
Eppstein, David; Galil, Zvi (1988), "Parallel algorithmic techniques for combinatorial computation", Annu. Rev. Comput. Sci., 3: 233–283, doi:10.1146/annurev.cs.03.060188.001313

This survey paper cites 16 papers co-authored by Vishkin

Goldberg, Andrew V.; Plotkin, Serge A.; Shannon, Gregory E. (1988), "Parallel symmetry-breaking in sparse graphs", SIAM Journal on Discrete Mathematics , 1 (4): 434–446, CiteSeerX 10.1.1.39.269 , doi:10.1137/0401044
JaJa, Joseph (1992), An Introduction to Parallel Algorithms, Addison-Wesley, ISBN 978-0-201-54856-3

Cites 36 papers co-authored by Vishkin

Karp, Richard M.; Ramachandran, Vijaya (1988), "A Survey of Parallel Algorithms for Shared-Memory Machines", University of California, Berkeley, Department of EECS, Tech. Rep. UCB/CSD-88-408

This survey paper cites 20 papers co-authored by Vishkin

Keller, Jorg; Kessler, Cristoph W.; Traeff, Jesper L. (2001), Practical PRAM Programming, Wiley-Interscience, ISBN 978-0-471-35351-5

Cites 19 papers co-authored by Vishkin

Manber, Udi (1989), Introduction to Algorithms A Creative Approach, Addison-Wesley, ISBN 978-0-201-12037-0
Vishkin, Uzi (2009), Thinking in Parallel: Some Basic Data-Parallel Algorithms and Techniques, 104 pages (PDF), Class notes of courses on parallel algorithms taught since 1992 at the University of Maryland, College Park, Tel Aviv University and the Technion
Mathematics Genealogy Project: Uzi Vishkin.
ISI Web of Knowledge, highly cited researchers: Uzi Vishkin.

External links

This page is based on this Wikipedia article
Text is available under the CC BY-SA 4.0 license; additional terms may apply.
Images, videos and audio are available under their respective licenses.

[1] ACM: Fellows Award / Uzi Vishkin.

[2] Vishkin, Uzi. Spawn-join instruction set architecture for providing explicit multithreading. U.S. Patent 6,463,527. See also Vishkin et al. (1998).

[3] University of Maryland, press release, June 26, 2007: "Maryland Professor Creates Desktop Supercomputer" Archived 2009-12-14 at the Wayback Machine .

[4] University of Maryland, A. James Clark School of Engineering, press release, November 28, 2007: "Next Big "Leap" in Computing Technology Gets a Name".

[5] Maryland Today, December 10, 2024: "2 Maryland Engineers Named to National Academy of Inventors: Tao, Vishkin Honored for ‘Creating Tangible Impacts’".

[6] 1st ed., Section 30.5.

[7] See, e.g., Goldberg, Plotkin & Shannon (1988).

[1]

[2]

[3]

[4]

[5]

[6]

[7]

Authority control databases
International	VIAF WorldCat
National	United States Israel
Academics	Mathematics Genealogy Project Association for Computing Machinery DBLP MathSciNet

Uzi Vishkin
Born	1953 Tel Aviv, Israel
Alma mater	Hebrew University Technion
Scientific career
Fields	parallel algorithms
Institutions	IBM Thomas J. Watson Research Center New York University Tel Aviv University University of Maryland, College Park
Doctoral advisor	Yossi Shiloach