List of important publications in concurrent, parallel, and distributed computing

Last updated

This is a list of important publications in concurrent, parallel, and distributed computing, organized by field.

Contents

Some reasons why a particular publication might be regarded as important:

Consensus, synchronization, and mutual exclusion

Synchronizing concurrent processes. Achieving consensus in a distributed system in the presence of faulty nodes, or in a wait-free manner. Mutual exclusion in concurrent systems.

Dijkstra: “Solution of a problem in concurrent programming control”

Dijkstra, E. W. (1965). "Solution of a problem in concurrent programming control". Communications of the ACM . 8 (9): 569. doi:10.1145/365559.365617.
This paper presented the first solution to the mutual exclusion problem. Leslie Lamport writes that this work “started the field of concurrent and distributed algorithms”. [1]

Pease, Shostak, Lamport: “Reaching agreement in the presence of faults”
Lamport, Shostak, Pease: “The Byzantine generals problem”

Pease, Marshall; Shostak, Robert; Lamport, Leslie (1980), "Reaching agreement in the presence of faults", Journal of the ACM , 27 (1): 228–234, CiteSeerX   10.1.1.68.4044 , doi:10.1145/322186.322188, S2CID   6429068 .
Lamport, Leslie; Shostak, Robert; Pease, Marshall (1982), "The Byzantine generals problem", ACM Transactions on Programming Languages and Systems , 4 (3): 382–401, CiteSeerX   10.1.1.64.2312 , doi:10.1145/357172.357176, S2CID   55899582 .
These two papers introduced and studied the problem that is nowadays known as Byzantine fault tolerance. The 1980 paper presented the classical lower bound that agreement is impossible if at least 1/3 of the nodes are faulty; it received the Edsger W. Dijkstra Prize in Distributed Computing in 2005. [2] The highly cited 1982 paper gave the problem its present name, and also presented algorithms for solving the problem. [3]

Herlihy, Shavit: “The topological structure of asynchronous computation”
Saks, Zaharoglou: “Wait-free k-set agreement is impossible …”

Herlihy, Maurice; Shavit, Nir (1999), "The topological structure of asynchronous computation" (PDF), Journal of the ACM , 46 (6): 858–923, CiteSeerX   10.1.1.78.1455 , doi:10.1145/331524.331529, S2CID   5797174 . Gödel prize lecture.
Saks, Michael; Zaharoglou, Fotios (2000), "Wait-free k-set agreement is impossible: The topology of public knowledge", SIAM Journal on Computing , 29 (5): 1449–1483, doi:10.1137/S0097539796307698 .
These two papers study wait-free algorithms for generalizations of the consensus problem, and showed that these problems can be analyzed by using topological properties and arguments. Both papers received the Gödel Prize in 2004. [4]

Foundations of distributed systems

Fundamental concepts such as time and knowledge in distributed systems.

Halpern, Moses: “Knowledge and common knowledge in a distributed environment”

Halpern, Joseph; Moses, Yoram (1990), "Knowledge and common knowledge in a distributed environment", Journal of the ACM , 37 (3): 549–587, arXiv: cs/0006009 , doi:10.1145/79147.79161, S2CID   52151232 .
This paper formalized the notion of “knowledge” in distributed systems, demonstrated the importance of the concept of “common knowledge” in distributed systems, and also proved that common knowledge cannot be achieved if communication is not guaranteed. The paper received the Gödel Prize in 1997 and the Edsger W. Dijkstra Prize in Distributed Computing in 2009. [5] [6]

Notes

  1. "PODC Influential Paper Award: 2002", ACM Symposium on Principles of Distributed Computing, retrieved 2009-08-24 Dijkstra (1965) did not receive the PODC Award or Dijkstra Prize but was nevertheless mentioned twice in the descriptions of the winning papers, in 2002 and in 2006.
  2. "Edsger W. Dijkstra Prize in Distributed Computing: 2005", ACM Symposium on Principles of Distributed Computing, retrieved 2009-08-24
  3. "Lamport: The Byzantine generals problem – 5295 citations", Google Scholar, retrieved 2018-10-14
  4. "2004 Gödel Prize", ACM SIGACT, retrieved 2009-08-29
  5. "1997 Gödel Prize", ACM SIGACT, retrieved 2009-08-24
  6. "Edsger W. Dijkstra Prize in Distributed Computing: 2009", ACM Symposium on Principles of Distributed Computing, retrieved 2009-08-24

Related Research Articles

<span class="mw-page-title-main">Edsger W. Dijkstra</span> Dutch computer scientist (1930–2002)

Edsger Wybe Dijkstra was a Dutch computer scientist, programmer, software engineer, systems scientist, and science essayist. He received the 1972 Turing Award for fundamental contributions to developing structured programming languages, and was the Schlumberger Centennial Chair of Computer Sciences at The University of Texas at Austin from 1984 until 2000.

<span class="mw-page-title-main">Leslie Lamport</span> American computer scientist and mathematician

Leslie B. Lamport is an American computer scientist and mathematician. Lamport is best known for his seminal work in distributed systems, and as the initial developer of the document preparation system LaTeX and the author of its first manual.

The Gödel Prize is an annual prize for outstanding papers in the area of theoretical computer science, given jointly by the European Association for Theoretical Computer Science (EATCS) and the Association for Computing Machinery Special Interest Group on Algorithms and Computational Theory. The award is named in honor of Kurt Gödel. Gödel's connection to theoretical computer science is that he was the first to mention the "P versus NP" question, in a 1956 letter to John von Neumann in which Gödel asked whether a certain NP-complete problem could be solved in quadratic or linear time.

Self-stabilization is a concept of fault-tolerance in distributed systems. Given any initial state, a self-stabilizing distributed system will end up in a correct state in a finite number of execution steps.

<span class="mw-page-title-main">Concurrency (computer science)</span> Ability to execute a task in a non-serial manner

In computer science, concurrency is the ability of different parts or units of a program, algorithm, or problem to be executed out-of-order or in partial order, without affecting the outcome. This allows for parallel execution of the concurrent units, which can significantly improve overall speed of the execution in multi-processor and multi-core systems. In more technical terms, concurrency refers to the decomposability of a program, algorithm, or problem into order-independent or partially-ordered components or units of computation.

A Byzantine fault is a condition of a computer system, particularly distributed computing systems, where components may fail and there is imperfect information on whether a component has failed. The term takes its name from an allegory, the "Byzantine generals problem", developed to describe a situation in which, in order to avoid catastrophic failure of the system, the system's actors must agree on a concerted strategy, but some of these actors are unreliable.

ACM SIGACT or SIGACT is the Association for Computing Machinery Special Interest Group on Algorithms and Computation Theory, whose purpose is support of research in theoretical computer science. It was founded in 1968 by Patrick C. Fischer.

A fundamental problem in distributed computing and multi-agent systems is to achieve overall system reliability in the presence of a number of faulty processes. This often requires coordinating processes to reach consensus, or agree on some data value that is needed during computation. Example applications of consensus include agreeing on what transactions to commit to a database in which order, state machine replication, and atomic broadcasts. Real-world applications often requiring consensus include cloud computing, clock synchronization, PageRank, opinion formation, smart power grids, state estimation, control of UAVs, load balancing, blockchain, and others.

The Edsger W. Dijkstra Paper Prize in Distributed Computing is given for outstanding papers on the principles of distributed computing, whose significance and impact on the theory and/or practice of distributed computing has been evident for at least a decade. The paper prize has been presented annually since 2000.

Maurice Peter Herlihy is a computer scientist active in the field of multiprocessor synchronization. Herlihy has contributed to areas including theoretical foundations of wait-free synchronization, linearizable data structures, applications of combinatorial topology to distributed computing, as well as hardware and software transactional memory. He is the An Wang Professor of Computer Science at Brown University, where he has been a member of the faculty since 1994.

The Symposium on Principles of Distributed Computing (PODC) is an academic conference in the field of distributed computing organised annually by the Association for Computing Machinery.

Larry Joseph Stockmeyer was an American computer scientist. He was one of the pioneers in the field of computational complexity theory, and he also worked in the field of distributed computing. He died of pancreatic cancer.

<span class="mw-page-title-main">Cynthia Dwork</span> American computer scientist

Cynthia Dwork is an American computer scientist at Harvard University, where she is Gordon McKay Professor of Computer Science, Radcliffe Alumnae Professor at the Radcliffe Institute for Advanced Study, and Affiliated Professor, Harvard Law School and Harvard's Department of Statistics.

<span class="mw-page-title-main">Nir Shavit</span> Israeli computer scientist

Nir Shavit is an Israeli computer scientist. He is a professor in the Computer Science Department at Tel Aviv University and a professor of electrical engineering and computer science at the Massachusetts Institute of Technology.

The Brooks–Iyengar algorithm or Brooks–Iyengar hybrid algorithm is a distributed algorithm that improves both the precision and accuracy of the interval measurements taken by a distributed sensor network, even in the presence of faulty sensors. The sensor network does this by exchanging the measured value and accuracy value at every node with every other node, and computes the accuracy range and a measured value for the whole network from all of the values collected. Even if some of the data from some of the sensors is faulty, the sensor network will not malfunction. The algorithm is fault-tolerant and distributed. It could also be used as a sensor fusion method. The precision and accuracy bound of this algorithm have been proved in 2016.

Daniel (Danny) Dolev is an Israeli computer scientist known for his research in cryptography and distributed computing. He holds the Berthold Badler Chair in Computer Science at the Hebrew University of Jerusalem and is a member of the scientific council of the European Research Council.

<span class="mw-page-title-main">Hagit Attiya</span> Israeli computer scientist

Hagit Attiya is an Israeli computer scientist who holds the Harry W. Labov and Charlotte Ullman Labov Academic Chair of Computer Science at the Technion – Israel Institute of Technology in Haifa, Israel. Her research is in the area of distributed computing.

Michel Raynal, is a French informatics scientist, professor at IRISA, University of Rennes, France. He is known for his contributions in the fields of algorithms, computability, and fault-tolerance in the context of concurrent and distributed systems. Michel Raynal is also Distinguished Chair professor at the Hong Kong Polytechnic University and editor of the “Synthesis Lectures on Distributed Computing Theory” published by Morgan & Claypool. He is a senior member of Institut Universitaire de France and a member of Academia Europaea.

<span class="mw-page-title-main">Robert Shostak</span> American computer scientist

Robert Eliot Shostak is an American computer scientist and Silicon Valley entrepreneur. He is most noted academically for his seminal work in the branch of distributed computing known as Byzantine Fault Tolerance. He is also known for co-authoring the Paradox Database, and most recently, the founding of Vocera Communications, a company that makes wearable, Star Trek-like communication badges.